Forget LLMs. World Models Are AI’s Next Leap

Why tech’s biggest names are betting billions on “world models” over LLMs.

Every AI product you have used this year (ChatGPT, Claude, Gemini) runs on the same basic trick. It reads a mountain of text and learns to guess the next word. That guessing game turned out to be shockingly powerful. It writes code, drafts emails, argues philosophy. But ask it to predict what happens when you knock a glass off a table, and it is just pattern matching against text it once read about gravity. It has never seen a glass fall. It has never seen anything.

That gap is why some of the most respected names in AI have quietly stopped chasing bigger language models and started building something else entirely: world models. And the amount of money now behind this bet should tell you it is not a fringe idea anymore.

The problem nobody wanted to say out loud

Large language models are, at their core, next-token predictors. They are extraordinary at manipulating language, but language is a description of the world, not the world itself. A model trained only on text can tell you that ice melts in heat because it has read that sentence a million times. It has no internal sense of temperature, mass, or motion. It cannot simulate a scenario it has never read a description of, because it never built a model of reality to simulate with.

This is the argument Yann LeCun, the Turing Award winner and Meta’s former chief AI scientist, has been making for years, often to the irritation of his own employer. LeCun argues that large language models, which predict the next word in a sequence, are architecturally insufficient for real-world intelligence because they learn no model of how physical events cause one another. He left Meta in November 2025 after roughly twelve years running its research lab, reportedly over disagreements about which architecture the company’s future should be built on.

He did not leave quietly. He started a new company to prove his point.

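The "guess the next word" trick described above can be shown at toy scale. The sketch below is a word-level bigram counter, which is an assumption chosen for brevity and not how ChatGPT, Claude, or Gemini are built; the function names `train_bigram` and `predict_next` and the tiny corpus are hypothetical. It illustrates the limitation the section describes: the predictor only knows which words followed which, so it has nothing to say about a word it never read.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for every word, which words followed it in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical toy corpus: the "mountain of text", shrunk to three sentences.
corpus = "ice melts in heat . ice melts in sunlight . ice melts in heat ."
model = train_bigram(corpus)

print(predict_next(model, "melts"))  # in
print(predict_next(model, "in"))     # heat
print(predict_next(model, "glass"))  # None: no text about glasses, so no answer
```

Real language models replace the counting table with a neural network over long contexts, but the training signal is the same kind of thing: what came next in the text.
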
[Figure: LLMs versus world models, side by side]

Neither approach replaces the other yet. Many researchers expect the two to eventually work together: language for communication, world models for grounding that language in physical reality.

The billion-dollar bet on a different kind of AI

In March 2026, LeCun’s new startup, AMI Labs, announced it had raised just over a billion dollars in seed funding, at a valuation of three and a half billion dollars. That is not just a large check for a company that did not exist a year earlier. It is Europe’s largest-ever seed funding round, for a startup built specifically around the idea that world models are the real path forward, not the large language models funded by Silicon Valley’s biggest names.

[Figure: how that round compares to the rest of the field racing after the same idea]

The investor list is the part that should make you pay attention. Backers include Nvidia, Samsung, Jeff Bezos, Mark Cuban, former Google CEO Eric Schmidt, and web inventor Tim Berners-Lee. These are not people who write checks on vibes. When Nvidia, whose entire business depends on correctly predicting where AI compute demand is headed, backs a company betting against the dominant LLM approach, that is a signal worth noticing.

So what is AMI Labs actually building? Its approach centers on the Joint Embedding Predictive Architecture, or JEPA, which trains AI systems to predict abstract representations of future states from observations, aiming for causal reasoning and reliable planning instead of next-word guessing. In plain terms, instead of learning from sentences about the world, the system learns from raw observation, watching things happen and building an internal sense of cause and effect, the same way a toddler learns that a dropped toy falls before anyone teaches them the word gravity.

The company is not promising fast results either.
Its CEO has been candid that turning this research into deployable products will likely take about a year, with early applications aimed at robotics, healthcare, and industrial automation rather than another chatbot.

LeCun is not the only one chasing this

World models were a niche research interest until very recently. Now they are one of the most competitive corners of AI.

Fei-Fei Li, the Stanford researcher often credited with kickstarting the deep learning boom through ImageNet, founded World Labs specifically to build these systems. Her company’s first product, Marble, generates physically coherent 3D worlds rather than flat images, and World Labs was reportedly in talks for fresh funding at a five-billion-dollar valuation shortly after AMI Labs made headlines.

Google DeepMind has been building in the same direction with Genie, a system designed to generate interactive environments that a model can act inside, effectively giving AI something closer to a training ground than a library.

Smaller players like SpAItial and Decart are also racing to stake a claim in the space, and video generation companies are increasingly describing their newest models in world-model terms rather than pure content-generation terms.

The common thread across all of them is a rejection of the idea that reading enough text can substitute for experiencing how things move, break, and interact.

What this looks like outside a research paper

The theory is easier to grasp with real situations attached to it.

A warehouse robot picks the wrong box. Today’s robots, trained mostly on narrow, repetitive demonstrations, fail the moment a box is rotated slightly or the lighting changes, because they memorized examples instead of learning how objects actually behave in space.
A robot running on a world model would instead reason about the box the way you do, predicting that it can still be gripped even from an unfamiliar angle, because it has an internal sense of shape and weight rather than a lookup table of past examples.

A self-driving car sees a plastic bag blow across the road. A system without a real model of physics has to guess from pixel patterns whether that is a hazard. A world model can simulate forward: a light, wind-blown object with no real mass is safe to drive through, while a similarly shaped but heavier object is worth braking for. That kind of split-second physical reasoning is exactly what LeCun has pointed to as the gap autoregressive systems cannot close.

A doctor uses an AI scribe that also reasons about physiology. AMI Labs has close ties to Nabla, a medical AI company whose CEO now runs AMI. The bet there is straightforward: language models are prone to confidently inventing facts, which is a serious problem in a clinical setting. A system with an actual grounded model of how the human body behaves under different conditions presents a different risk profile from a model that is, structurally, just predicting plausible-sounding words.

A game or training simulation generates itself as you move through it. DeepMind’s Genie work points toward AI that can generate an entire interactive 3D environment on the fly, reacting to your actions the way a real environment would, rather than replaying pre-built assets. That has obvious uses in gaming, but the same underlying capability is what lets robots practice in a realistic simulated world before ever touching a real object.

Why this actually matters, not just for researchers

If you build products, write about AI, or just try to make sense of where this industry is heading, world models matter for a few concrete reasons beyond the examples above.

Robotics has been stuck for years on a hard problem.
A robot arm can be trained to pick up a specific object in a specific lab, but it falls apart the moment something in its environment changes slightly. That fragility comes directly from not having a working model of physical reality. A system that actually understands how objects behave, rather than one that has memorized a narrow set of examples, is what would let robots generalize the way humans do, without performance collapsing every time the lighting changes or the object shifts an inch.

There is also a quieter economic argument here. The AI industry has spent the last few years scaling language models by throwing more data and compute at the same basic architecture. That approach is running into diminishing returns and enormous energy costs. If a different architecture can achieve better real-world reasoning with a fraction of the resources, it changes the entire cost equation for the industry, not just the capability ceiling.

It is not proven yet, and that is worth saying clearly

None of this means large language models are obsolete or that world models are guaranteed to work at scale. This is genuinely early-stage research, not a shipped product replacing your chatbot next quarter.

A formal proof published in May 2026 showed that LeCun’s JEPA architecture can, under specific mathematical conditions, recover the true structure behind raw observations. But a companion benchmark released the same week found that current world models are still brittle, breaking down under minor visual changes in simulated environments. In other words, the theory is getting sharper, but the actual systems are still fragile in practice. This is a research bet with a real chance of paying off, not a settled fact.

That combination, serious theoretical progress alongside serious practical limitations, is exactly why this is worth understanding now rather than after it either fails quietly or becomes the next thing everyone suddenly claims they saw coming.

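The contrast drawn in this section, between memorizing a narrow set of examples and carrying a working model of physics, can be made concrete with a toy. The sketch below is only an illustration under stated assumptions: the "world model" here is a hand-written free-fall simulator, not a learned one, and it has nothing to do with how JEPA or any system from AMI Labs, World Labs, or DeepMind works. The names `MEMORIZED_FALL_TIMES`, `lookup_fall_time`, and `simulate_fall_time` are hypothetical.

```python
GRAVITY = 9.81  # metres per second squared

# "Memorized demonstrations": fall times recorded for a few drop heights (m -> s).
MEMORIZED_FALL_TIMES = {0.5: 0.32, 1.0: 0.45, 2.0: 0.64}

def lookup_fall_time(height_m):
    """Return a memorized answer, or None when this exact case was never seen."""
    return MEMORIZED_FALL_TIMES.get(height_m)

def simulate_fall_time(height_m, dt=0.001):
    """Step a simple physical state forward in time until the object hits the floor."""
    position, velocity, elapsed = height_m, 0.0, 0.0
    while position > 0.0:
        velocity += GRAVITY * dt   # gravity speeds the object up
        position -= velocity * dt  # the object moves down by that speed
        elapsed += dt
    return elapsed

# A height that is in the table: both approaches have an answer.
print(lookup_fall_time(1.0), round(simulate_fall_time(1.0), 2))

# A height that was never demonstrated: only the simulator can answer.
print(lookup_fall_time(1.3), round(simulate_fall_time(1.3), 2))
```

The lookup table is the robot that fails when the box is rotated; the simulator is the idealized version of what world-model research is trying to learn from observation instead of having it written by hand. The brittleness found in current benchmarks is a reminder that learned models are still far from this idealized case.
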
The takeaway

Language models taught AI to talk. World models are trying to teach it to understand what it is talking about. Whether JEPA specifically wins out, or DeepMind’s approach, or something from World Labs, the direction of travel is now backed by more research credibility and more capital than it has ever had.

If you have spent the last two years thinking about AI purely in terms of prompts and chatbots, this is the shift that is quietly happening underneath that entire conversation. The next time someone tells you AI has plateaued because chatbots feel roughly the same as they did last year, point them here. The frontier moved. It just moved somewhere most people have not been looking.

Forget LLMs. World Models Are AI’s Next Leap was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Source

https://pub.towardsai.net/forget-llms-world-models-are-ais-next-leap-fd3980cd7166?source=rss----98111c9905da---4