Fei-Fei Li says language models are extremely limited. This @GoogleDeepMind paper makes almost the same point, just in the world of video. The models are just very advanced pattern matchers: they can recreate what looks like reality because they’ve seen so much data, but they don’t know why the world works the way it does. The models can generate clips that look stunningly real, yet when you test whether they actually follow basic physics, they fall apart. The Physics-IQ benchmark shows that visual polish and true understanding are two completely different things. Here, the authors build Physics-IQ, a real-video benchmark spanning solid mechanics, fluids, optics, thermodynamics, and magnetism. Each test shows the start of an event, then asks a model to continue it for the next few seconds. They compare the prediction to the real future using motion checks for where, when, and how much things move. Scores then roll into a single Physics-IQ number, capped at the level of agreement between two real takes of the same scene. Across popular models, even the strongest sits far below that cap, while multi-frame versions usually beat image-to-video versions. Sora is the hardest to tell apart from real video, yet its physics score stays low, showing that realism and physical understanding are uncorrelated. Some cases work, like paint smearing or liquid pouring, but contact and cutting interactions often fail. arxiv.org/abs/2501.09038
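To make the evaluation idea concrete, here is a minimal sketch of that kind of motion-based scoring, normalized against the two-real-takes ceiling. The thresholds, the specific metrics, and how they are combined are illustrative assumptions, not the paper's published formulas.

```python
# Minimal sketch of motion-based scoring against a real-vs-real ceiling.
# Thresholds and the metric mix are illustrative, not the paper's exact metrics.
import numpy as np

def motion_mask(frames, thresh=0.05):
    """Binary 'where things move' mask per frame, from frame differences."""
    diffs = np.abs(np.diff(frames.astype(np.float32), axis=0))  # (T-1, H, W, C)
    return diffs.mean(axis=-1) > thresh                         # (T-1, H, W)

def iou(a, b, eps=1e-8):
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / (union + eps)

def physics_iq_like_score(pred, real_a, real_b):
    """Compare a predicted continuation against one real take (real_a), then
    normalize by how well a second real take (real_b) matches it, so the
    ceiling is whatever two real recordings of the same scene agree on."""
    m_pred, m_a, m_b = motion_mask(pred), motion_mask(real_a), motion_mask(real_b)

    # "where": spatial overlap of motion, pooled over time
    where = iou(m_pred.any(axis=0), m_a.any(axis=0))
    # "when / how much": per-frame overlap of motion, averaged over time
    when = np.mean([iou(p, r) for p, r in zip(m_pred, m_a)])

    model_score = 0.5 * (where + when)
    ceiling = 0.5 * (iou(m_b.any(axis=0), m_a.any(axis=0))
                     + np.mean([iou(p, r) for p, r in zip(m_b, m_a)]))
    return model_score / max(ceiling, 1e-8)
```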
How People Are Using AI For Developing Games
Builders are vibe-coding full prototypes—multiplayer loops, chatty NPCs with memory, shaders—by prompting, letting AI write servers, generate art, and live-debug so solo devs (and kids) ship in hours, not weeks. In 2025, “world models” can spin text/images into interactive 3D spaces, hinting at a game-engine-as-model future. Users gush about speed and creativity; experts celebrate the trajectory but flag physics gaps, bias, and maintainability—so AI is the turbo for prototyping while human design/testing harden production.
🤖 AI summary based on 28 tweets
Insights from builders, researchers, investors, and domain experts. Opinions are the authors' own.
Interesting benchmark — having a variety of models play Werewolf together. Requires reasoning through the psychology of other players, including how they’ll reason through your psychology, recursively. Wonder how much fun a mixed human/AI game would be! https://t.co/0sgD5kzz1p
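For flavor, here is a rough sketch of what one round of such a mixed-model Werewolf loop could look like. `chat()` is a placeholder for any LLM backend, and the prompts and roles are illustrative, not the benchmark's actual harness.

```python
# A rough sketch of one round of a mixed-model Werewolf loop; `chat()` is a
# placeholder for any LLM backend, and the prompts/roles are illustrative only.
def chat(model: str, prompt: str) -> str:
    """Placeholder: call whatever LLM API `model` refers to and return its reply."""
    raise NotImplementedError

def play_round(players: dict) -> tuple[list, str]:
    """players maps name -> {'model': str, 'role': 'werewolf' or 'villager'}."""
    transcript = []

    # Day phase: each player speaks publicly, reasoning about the others.
    for name, p in players.items():
        discussion = "\n".join(transcript)
        statement = chat(p["model"],
            f"You are {name}, secretly a {p['role']}, playing Werewolf.\n"
            f"Public discussion so far:\n{discussion}\n"
            "Make one statement. Consider how the others will model your reasoning.")
        transcript.append(f"{name}: {statement}")

    # Vote phase: each player names someone to eliminate; most votes is out.
    votes = {}
    discussion = "\n".join(transcript)
    for name, p in players.items():
        target = chat(p["model"],
            f"Discussion:\n{discussion}\nAs {name}, vote to eliminate one player. "
            "Reply with a name only.").strip()
        votes[target] = votes.get(target, 0) + 1
    eliminated = max(votes, key=votes.get)
    return transcript, eliminated
```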
Vibe Minecraft: a multi-player, self-consistent, real-time world model that allows building anything and conjuring any objects. The function of tools and even the game mechanics themselves can be programmed in natural language, such as "chrono-pickaxe: revert any block to a previous state in time" and "waterfalls turn into rainbow bridges when unicorns pass by". Players collectively define and manipulate a shared world. The neural sim takes as input a *multimodal* system prompt: game rules, asset pngs, a global map, and easter eggs. It periodically saves game states as a sequence of latent vectors that can be loaded back into context, optionally with interleaved "guidance texts" to allow easy editing. Each gamer has their own explicit stat json (health, inventory, 3D coordinates) as well as implicit "player vectors" that capture higher-order interaction history. Game admins can create a Minecraft multiverse because latents from different servers are compatible. Each world can seamlessly cross with another to spawn new worlds in seconds. People can mix & match with their friends' or their own past states. "Rare vectors" can emerge as some players inevitably wander into the bizarre, uncharted latent space of the world model. Those float matrices can be traded as NFTs. The wilder the things you try, the more likely you are to mine rare vectors. Whoever ships Vibe Minecraft first will go down in history as altering the course of gaming forever.
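Purely speculative, but the state layout the tweet imagines could look something like this. Every name, shape, and the merge rule below are invented for illustration, since no such system exists yet.

```python
# A speculative sketch of the state layout described in the tweet; all names,
# shapes, and the merge rule are illustrative, not a real system's format.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class PlayerState:
    stats: dict                # explicit stat json: health, inventory, xyz coords
    player_vector: np.ndarray  # implicit embedding of higher-order interaction history

@dataclass
class WorldSnapshot:
    latents: np.ndarray                              # saved game state as latent vectors (T, D)
    guidance: list = field(default_factory=list)     # interleaved "guidance texts" for editing
    players: dict = field(default_factory=dict)      # name -> PlayerState

def merge_worlds(a: WorldSnapshot, b: WorldSnapshot) -> WorldSnapshot:
    """Cross two worlds into a new one, assuming their latents share one latent
    space (the tweet's 'multiverse' idea): interleave latents, union players."""
    n = min(len(a.latents), len(b.latents))
    mixed = np.stack([a.latents[:n], b.latents[:n]], axis=1)
    mixed = mixed.reshape(-1, a.latents.shape[-1])   # alternate frames from each world
    return WorldSnapshot(latents=mixed,
                         guidance=a.guidance + b.guidance,
                         players={**a.players, **b.players})
```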
This is game engine 2.0. Some day, all the complexity of UE5 will be absorbed by a data-driven blob of attention weights. Those weights take as input game controller commands and directly animate a spacetime chunk of pixels. Agrim and I were close friends and coauthors back at Stanford Vision Lab. So great to see him at the frontier of such cool research! Congrats!
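In toy form, that "controller in, pixels out" loop is just an autoregressive frame predictor. The `world_model` interface below is an assumption made for illustration, not any released system's API.

```python
# A toy sketch of the game-engine-as-model loop: condition on recent frames plus
# the latest controller command, generate the next frame, repeat. The
# `world_model.predict` interface is assumed, not a real library call.
import numpy as np

def run_neural_engine(world_model, get_controller_input, render, steps=600):
    """Autoregressive play loop for an action-conditioned video model."""
    context = [np.zeros((256, 256, 3), dtype=np.uint8)]   # seed frame
    for _ in range(steps):
        action = get_controller_input()                    # e.g. buttons / stick state
        next_frame = world_model.predict(frames=context[-16:], action=action)
        context.append(next_frame)
        render(next_frame)
```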