World Models vs. Language Models

Why scaling fluency is not the same as building intelligence

Jan 27, 2026

The past two years of AI strategy have been dominated by a single instinct: scale the language model and let intelligence emerge. Bigger models, longer context windows, more parameters, more agents. It’s been a recurring theme in many of my recent articles.

All cynicism aside, that instinct has produced real value. It has also produced dangerous illusions.

What we are seeing now is not a failure of AI, but a fork in philosophy about what intelligence actually is and how it should be built. On one side sit OpenAI and Anthropic, betting that reasoning, planning, and autonomy emerge through scale, tooling, and orchestration around LLMs. On the other sits Yann LeCun, arguing that no amount of language will substitute for systems that understand how the world behaves.

This divide matters in product development because it maps directly onto the Productivity J-Curve. One path front-loads visible gains and back-loads fragility. The other looks slower at first but compounds into durable advantage.

Design by Warren Smith using ChatGPT

Two Competing Visions of AI Progress

The OpenAI and Anthropic Roadmap: Intelligence Through Language Orchestration

OpenAI and Anthropic share a broadly similar strategy.

Language models are treated as the cognitive core. Reasoning improves through scale. Reliability improves through guardrails. Capability expands through tools, retrieval, memory, and agents. The system grows outward from language.

In this worldview:

LLMs become the planner
Tools become extensions of thought
Agents become the execution layer
Safety emerges from alignment techniques and oversight

This approach is pragmatic and commercially effective. It excels at tasks involving text, code, synthesis, and interaction. It is also why these systems feel magical in demos and immediately useful in knowledge work.

But this roadmap assumes something critical: that intelligence can be approximated through better prediction of language and better orchestration of downstream tools.

That assumption is exactly what LeCun appears to reject.

Yann LeCun’s View: Intelligence Requires World Models

Yann LeCun is not a peripheral critic. He is one of the architects of modern AI. His work on convolutional neural networks made vision viable, and as the founder of Meta’s FAIR lab and a long-time academic researcher, he has watched multiple paradigms crest and flatten.

LeCun has also been a frequent presence at MIT over the past few years, speaking regularly about the limits of prevailing AI approaches. I have attended several of those talks. What stands out in hindsight is not that his position has shifted, but that it has remained remarkably consistent while the industry around him chased each new breakthrough.

His argument is not that LLMs are useless. He is explicit that they are extraordinarily valuable. His claim is narrower and more structural: language models cannot become intelligent systems by scaling alone.

Language models operate in a symbolic space. They learn correlations between tokens. They do not experience physics, causality, time, or consequence. I like his ongoing joke that LLMs don’t truly understand gravity. No matter how large they become, they are predicting what comes next in text, not what happens next in reality.

This is why LeCun pushes world models.

World models learn abstract representations of how the world behaves from observation, not description. They predict outcomes in latent space. They ignore noise and focus on what is structurally predictable. This is how humans learn common sense. It is how animals navigate environments. And it is what current LLM-centric systems fundamentally lack.

In LeCun’s framing, LLMs may orchestrate systems, but they cannot be the system of understanding.

Where Scaling Language Stalls

The industry is now encountering the limits LeCun warned about.

LLMs are fluent but brittle. They explain confidently even when wrong. They struggle with causality, forecasting, and counterfactual reasoning. They cannot reliably anticipate the downstream effects of actions in dynamic environments.

This is not a temporary shortcoming. It is architectural.

Scaling improves pattern completion. It does not create grounding. Moravec’s paradox still applies: what humans find intuitive about the physical and operational world is precisely what machines struggle with most.

This is why autonomous agents fail in the wild. This is why hallucinations persist despite better alignment. This is why governance costs explode as systems scale.

Language-first systems climb the front side of the J-Curve quickly. Then the curve bends downward.

World Models Already Exist Quietly in Industry

The irony is that world models are not speculative. They already exist, just not under that name.

They show up in many of the systems I have built recently:

Forecasting systems modeling demand, risk, or failure rates
Control systems managing industrial processes
Digital twins simulating factories, supply chains, or networks
Rules engines encoding regulatory and operational constraints
Classical ML models estimating probability and confidence

These systems do not speak. They predict.

In mature organizations, these models sit beneath dashboards, workflows, and alerts. They form the substrate of decision-making. What has been missing is a usable interface for humans.

IMO, this is where LLMs belong: as the interface to these models.

The Productive Architecture: Models First, Language Second

The most effective systems I have worked on in the past year followed a consistent pattern:

World modeling comes first: Deterministic rules, classical ML, or forecasting models establish what is likely to happen, how confident we are, and how wrong we could be.
Language sits on top: LLMs interpret outputs, explain implications, synthesize context, and provide conversational access to complex systems.
Retrieval grounds explanation: When decisions matter, LLMs cite policies, data, and history rather than inventing rationale.
Decision rights are explicit: The system knows when it informs, when it recommends, and when it acts.

This hybrid approach sacrifices some early flash for long-term trust. It flattens the J-Curve dip.
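To make the layering concrete, here is a minimal sketch of the pattern in Python. It is an illustration under stated assumptions, not code from any system I have shipped: the names (forecast_next, decision_right, build_llm_prompt), the mean-plus-spread forecast, and the 10% and 50% thresholds are all hypothetical stand-ins. The point is the ordering: the numbers are produced first, the decision right is derived from how uncertain they are, and the language layer only receives what it is allowed to explain.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import mean, stdev


class DecisionRight(Enum):
    INFORM = "inform"        # show the forecast only
    RECOMMEND = "recommend"  # propose an action for a human to approve
    ACT = "act"              # execute without human review


@dataclass(frozen=True)
class Forecast:
    expected: float  # point estimate for the next period
    low: float       # lower bound of a rough interval
    high: float      # upper bound of a rough interval

    @property
    def relative_width(self) -> float:
        """Interval width as a fraction of the estimate (a crude uncertainty score)."""
        if self.expected == 0:
            return float("inf")
        return (self.high - self.low) / abs(self.expected)


def forecast_next(history: list[float], z: float = 1.96) -> Forecast:
    """Model layer (toy stand-in): mean of past values plus or minus z standard deviations."""
    center = mean(history)
    spread = z * stdev(history)
    return Forecast(expected=center, low=center - spread, high=center + spread)


def decision_right(
    f: Forecast, act_below: float = 0.10, recommend_below: float = 0.50
) -> DecisionRight:
    """Explicit decision rights: tighter forecasts earn more autonomy (thresholds are assumed)."""
    if f.relative_width < act_below:
        return DecisionRight.ACT
    if f.relative_width < recommend_below:
        return DecisionRight.RECOMMEND
    return DecisionRight.INFORM


def build_llm_prompt(f: Forecast, right: DecisionRight, policy_excerpt: str) -> str:
    """Language layer input: the LLM sees fixed numbers and retrieved text it must cite."""
    return (
        f"Forecast: {f.expected:.1f} (range {f.low:.1f} to {f.high:.1f}).\n"
        f"Permitted mode: {right.value}.\n"
        f"Policy to cite: {policy_excerpt}\n"
        "Explain this forecast to an operator. Do not change the numbers."
    )
```

With these assumed thresholds, a stable history such as [100, 101, 99, 100, 100] earns ACT, a moderately noisy one such as [100, 110, 90, 105, 95] earns RECOMMEND, and a volatile one such as [100, 140, 60, 120, 80] falls back to INFORM. The prompt string would then be passed to whatever LLM you use; that call is deliberately left out because it depends on your provider.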

The Strategic Mistake to Avoid

The most expensive error organizations are making right now is treating LLMs as substitutes for world models rather than interfaces to them.

OpenAI and Anthropic are pushing the frontier of what language systems can do, and that work matters. But language alone cannot ground intelligence. Without models of reality, autonomy becomes theater.

LeCun’s warning is not anti-LLM. It is pro-structure.

The future belongs to systems that understand before they speak.

TLDR

OpenAI and Anthropic are scaling language systems toward reasoning through orchestration and agents. Yann LeCun argues intelligence requires world models that predict reality, not text. The strongest architectures pair deterministic or classical models for signal generation with LLMs for interpretation. This pairing reduces the Productivity J-Curve dip and produces durable trust. Language should sit on top of truth, not replace it.

Attribution and Inspiration

Yann LeCun’s new venture is a contrarian bet against large language models, by Caiwei Chen, MIT Technology Review, January 22, 2026
