“Solving Erdős Problems Doesn’t Make It AGI”

In January 2026, Demis Hassabis of DeepMind and Dario Amodei of Anthropic sat side by side at the World Economic Forum in Davos. The atmosphere was electric. This was an era in which AI systems were solving mathematical olympiad problems, beating the best human players at Go, and predicting protein structures. Voices from across the industry were declaring AGI just around the corner.

Hassabis said otherwise.

“Today’s systems are nowhere near [AGI]. It doesn’t matter how many Erdős problems they solve.” (As quoted by Fortune, January 23, 2026: “Today’s systems are nowhere near [AGI].”)

Erdős problems are open questions posed by the mathematician Paul Erdős, and many of them have resisted proof for decades. If AI can solve them, that is certainly impressive. But Hassabis was categorical: that is not the benchmark for AGI. Why?

This question is the starting point of this post. It isn’t only a conversation for AI researchers. It concerns every organization that adopts AI, builds AI teams, or ships AI products.


The Line Hassabis Drew: “Understanding” vs. “Pattern Matching”

The benchmark Hassabis proposed for AGI is not “solve Erdős problems.” His standard is higher. Multiple outlets reported the so-called “Einstein Test”: the ability to independently propose a genuinely novel physical theory, which would demonstrate real understanding rather than statistical pattern reproduction.

So where do today’s LLMs stand? However vast the training data and however sophisticated the simulated reasoning, whether the result constitutes “understanding” or merely “statistical pattern reproduction” remains an open question. On the same Davos panel, Hassabis himself said “one or two more breakthroughs are still needed.” His timeline: “5 to 10 years.”

The story did not end there.


A Timeline Shift in Four Months: The Intentional Use of Provocative Language

About four months after the Davos remarks — on May 26, 2026, in an interview with Axios shortly after Google I/O — Hassabis struck a different tone. He suggested “AGI will probably arrive within the next five years.” Some outlets summarized this as “AGI in 3 to 4 years,” and the statement spread quickly.

Yet in the same interview, Hassabis added: “some of the terms I used were a little bit provocative.” (Axios, May 26, 2026)

The careful January statement and the upgraded May timeline. The same person, in the same year, speaking in different contexts. Which reflects the “real” Hassabis?

Probably both. That very tension illustrates exactly why expectation management must become a cultural practice.


The Scars a Hype Cycle Leaves on Organizations

AI is an industry unusually prone to rapid hype cycles. You don’t need to consult Gartner’s Hype Cycle to recall how many times “this time is different” has been proclaimed over the past decade: the deep learning revolution, the chatbot era, GPT-3, ChatGPT, agentic AI. With every cycle, organizations invest in excitement, collide with reality, feel disappointed, and wait for the next wave.

This pattern leaves three kinds of scars on organizations.

First, the erosion of trust capital. If you announce “AI will solve this problem in six months” and then fail, the next time you bring a genuinely meaningful AI proposal, people will stop listening. Once leadership credibility is chipped away, it is hard to restore.

Second, the distortion of how failure is defined. When you start from exaggerated expectations, even genuinely substantial progress gets stamped as “failure.” Teams carry a sense of defeat into the next project rather than legitimate pride in achievement. Burnout and attrition follow.

Third, loss of direction. Projects launched by riding hype are hard to steer. “Build AI that’s like AGI” tells your team nothing. What should it build? How should success be measured? When is “enough” enough? Without answers to these questions, teams drift.


What Hassabis Practices: A Philosophy of Roadmaps

DeepMind’s trajectory reveals how expectation management is actually practiced.

AlphaGo (2016) targeted a clearly defined problem: Go. AlphaFold (2020) solved a verifiable scientific problem: protein structure prediction. AGI? For Hassabis, it remains a future challenge that still requires “one or two more breakthroughs.”

The pattern in this roadmap is consistent. Each stage first defines a goal achievable with current capabilities, actually achieves it, then prepares for the next. Rather than the slogan “AGI in five years,” the question is: “What will we prove at this stage?”

This should not be read as a conservative posture. Hassabis’s admission in May 2026 that “some of the terms I used were a little bit provocative” also suggests that he understands the value of ambitious language in raising expectations. The key word is “intentional.” He appears to know where the certainty ends and the deliberate provocation begins.


What Expectation Management Culture Actually Means

In AI organizations, “expectation management” often sounds negative — misread as a defensive posture of “lower your expectations” or “preempt disappointment.” Hassabis’s approach is nothing of the sort.

Honest expectation management comes down to three things.

1. Use language that distinguishes current capability from future possibility. “This model can do X” and “If we go this direction, X will become possible” are different statements. Clearly separating these two for teammates, boards, and customers is the foundation of trust — just as Hassabis distinguished “today’s systems” from “systems after one or two more breakthroughs.”

2. Define success criteria measurably. “Improve operational efficiency through AI” cannot tell you whether you succeeded. “Reduce processing time for task X by 30% six months after AI adoption” has a clear answer. Just as AlphaFold defined protein structure prediction accuracy numerically against the previous state of the art, AI projects must do the same.

3. Build a culture where people can say they do not know. That Hassabis could admit “some of the terms I used were a little bit provocative” suggests that such admissions raise rather than lower trust. If no one inside an organization can say “we don’t know yet,” “this is harder than expected,” or “we need to adjust the timeline,” people will only pass good news upward and hide the bad. Those hidden problems will eventually detonate all at once.
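
The second practice above, a measurable success criterion, can be written down in a few lines of Python. This is only a minimal sketch: the class name, the field names, and the numbers (a 40-minute baseline for “task X”) are hypothetical and chosen for illustration, not taken from any real project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuccessCriterion:
    """A target of the form 'reduce <metric> by <fraction> of its baseline'."""
    metric: str
    baseline: float          # value measured before AI adoption
    target_reduction: float  # 0.30 means "reduce by 30%"

    def threshold(self) -> float:
        # Largest measured value that still counts as success.
        return self.baseline * (1.0 - self.target_reduction)

    def is_met(self, measured: float) -> bool:
        # Lower is better for this kind of metric (e.g. processing time).
        return measured <= self.threshold()


# Hypothetical example: task X takes 40 minutes today, target is a 30% cut.
criterion = SuccessCriterion(
    metric="minutes per task X",
    baseline=40.0,
    target_reduction=0.30,
)

print(criterion.is_met(27.0))  # True: 27 minutes is below the 28-minute threshold
print(criterion.is_met(31.0))  # False: 31 minutes misses the target
```

The point of the sketch is that the criterion returns a yes or no answer at the six-month mark, which “improve operational efficiency” never can.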


Breaking the Disillusionment Cycle

The most common failure pattern in AI adoption is simple: start with excitement → over-promise → hit the wall of reality → feel disappointed → “AI turns out to be nothing special” → suspend investment until the next hype wave.

Breaking this cycle requires an intermediate step. Start with excitement, but define an achievable first stage; actually achieve that stage; make the results visible; then move to the next.

This is not the same as “think small.” DeepMind started with AlphaGo, built AlphaFold, and is now moving toward AGI. The destination is large, but each stage is designed realistically.

Organizations that make this work share a common habit. When they begin a new AI initiative, they first ask: “What does success look like at this stage?” At six months, they check whether that definition of success was met. When things fall short, they record why reality diverged from expectation. That record makes the next project’s expectations slightly more grounded.
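
One way to make that record usable is to keep it as data rather than as a memory. The sketch below is an assumption about how such a log might look, not a description of any particular organization’s process: the `StageReview` fields, the helper names, and the sample figures are all hypothetical. It assumes a metric where higher is better and where the expected value is greater than zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageReview:
    stage: str
    expected: float  # improvement promised at kickoff (must be > 0)
    actual: float    # improvement measured at the six-month review
    reason: str      # why reality diverged from expectation


def mean_delivery_ratio(reviews: list[StageReview]) -> float:
    """Average of actual / expected; below 1.0 means past plans over-promised."""
    if not reviews:
        raise ValueError("at least one past review is required")
    return sum(r.actual / r.expected for r in reviews) / len(reviews)


def calibrate(raw_expectation: float, reviews: list[StageReview]) -> float:
    """Scale a new estimate by how well past estimates held up."""
    return raw_expectation * mean_delivery_ratio(reviews)


# Hypothetical history, for illustration only.
reviews = [
    StageReview("pilot", expected=0.30, actual=0.15,
                reason="input data was messier than assumed"),
    StageReview("rollout", expected=0.20, actual=0.20,
                reason="scope was narrowed before kickoff"),
]

print(mean_delivery_ratio(reviews))  # average share of the promise that was delivered
print(calibrate(0.40, reviews))      # a more grounded version of a new 40% promise
```

A plain average is a deliberately crude choice here. The value of the exercise comes less from the arithmetic than from the `reason` field, which forces each shortfall to be explained in writing.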


Implications for Organizations: Turn Realism into a System

If Hassabis’s realism remains a personal virtue, it disappears when that person leaves. Turning expectation management into organizational culture requires systems.

At ThakiCloud, we encounter this challenge frequently. As a company building an AI platform, we feel the tension every day between explaining AI’s possibilities to customers and honestly describing its limits. Tilt too far one way and you over-promise; tilt too far the other and you miss the opportunity.

One practice we have been trying: at the start of a project, we write down not only what we intend to do but also an explicit list of “what we will not achieve in this stage.” Experience has taught us that the “will not” list is more effective at calibrating expectations to reality than the “will do” list.

That Hassabis can say “today’s systems are nowhere near AGI” is not because he is pessimistic about the future of AI. He is the person who reshaped the landscape of biology with AlphaFold. Stating clearly the limits of current systems is a way of setting direction toward the next breakthrough.

Realism is not abandoning the dream. It is measuring the distance to that dream precisely.


Conclusion: Honesty Is the Strongest Cultural Asset

In the AI field, the easiest thing to do is excite people. “AGI is coming in three years,” “this model changes everything,” “if you don’t get on board now, you’re too late.” Words like these attract investment, increase recruiting inquiries, and generate media attention.

The hardest thing is to say, inside that excitement: “But this is not yet possible right now.”

Hassabis said that hard thing at Davos in January 2026. Solving Erdős problems is not AGI. One or two more breakthroughs are still needed. And in May, he freely acknowledged that some of the terms he had used were a little provocative.

When these two postures work together, organizational trust accumulates. Stating limits clearly. Admitting when you have overstated. The slow accumulation of honesty is a far stronger cultural asset than the short-term attention won by a single piece of hype.

For those leading AI organizations, Hassabis’s approach offers a useful benchmark. What can current AI do? What can it not yet do? What is the next stage, and what breakthrough does it require? A culture that answers these three questions honestly is the foundation of a sustainable AI organization — whether or not AGI ever arrives.


Sources

  • Fortune (2026-01-23): “DeepMind’s Demis Hassabis, Anthropic’s Dario Amodei, Yann LeCun at AI Davos” — https://fortune.com/2026/01/23/deepmind-demis-hassabis-anthropic-dario-amodei-yann-lecun-ai-davos/
  • WEF Radio Davos Podcast (2026-01): “AI & AGI — Dario Amodei, Demis Hassabis” — https://www.weforum.org/podcasts/radio-davos/episodes/ai-agi-dario-amodei-demis-hassabis/
  • Axios (2026-05-26): “DeepMind CEO Demis Hassabis on AGI timeline” — https://www.axios.com/2026/05/26/deepmind-ceo-demis-hassabis
  • Sherwood News: “Google DeepMind’s Hassabis: AGI is 3 to 4 years away” — https://sherwood.news/tech/google-deepminds-hassabis-agi-is-3-to-4-years-away/
  • CryptoBriefing: “DeepMind Hassabis AGI Einstein Test” — https://cryptobriefing.com/deepmind-hassabis-agi-einstein-test/