Leopold Aschenbrenner — 2027 AGI, China/US super-intelligence race, & the return of history

Dwarkesh Podcast 4h32 6 min #68

Summary

  • Leopold Aschenbrenner, a former OpenAI superalignment team member and Columbia valedictorian, argues that AI progress is on a trajectory toward superintelligence by 2027–2028, driven by exponential growth in compute clusters, and that this will trigger a geopolitical crisis comparable to the Manhattan Project era, with the US, China, and authoritarian states racing to build trillion-dollar data centers, steal AI secrets, and automate military and economic power.

The trillion-dollar cluster and unhobbling

  • AI development is fundamentally an industrial process: each new model generation requires building massive new compute clusters, power plants, and eventually semiconductor fabs — not just writing better code.
    • Training compute for the largest AI systems has grown by roughly half an order of magnitude (0.5 OOMs) per year for nearly a decade.
    • GPT-4 (2022) used 25,000 A100s (a ~$500M cluster drawing ~10 MW). By 2024, clusters reach ~100 MW with 100,000 H100 equivalents and cost billions. By 2026, the projection is a gigawatt-scale cluster (comparable to the Hoover Dam’s output) with a million H100 equivalents, costing tens of billions. By 2028: 10 GW (more power than most US states use), 10 million H100 equivalents, and hundreds of billions of dollars. By 2030: a trillion-dollar cluster at 100 GW, over 20% of US electricity production.
    • Companies like OpenAI and Microsoft are already planning $100B+ clusters; AMD forecasts a $400B AI accelerator market by 2027.
  • The economic justification: if AI can automate cognitive labor at scale, $100B+ annual revenue is plausible. For example, selling a $100/month AI add-on to a third of Microsoft’s 300M Office subscribers yields over $100B per year (100M subscribers × $100 × 12 months ≈ $120B).
  • Unhobbling: Current models like GPT-4 are “hobbled”: they’re smart but limited to chatbot-style interactions. The key unlock is making them into agents that can do long-horizon tasks, use computers, and work autonomously like remote employees.
    • By 2025–2026, models will surpass most college graduates in capability. By 2027–2028, they’ll match the smartest experts and function as “drop-in remote workers” — attending Zoom calls, using Slack, writing and iterating on code, running tests, and completing projects with minimal human oversight.
    • Intermediate models could be integrated into businesses, but doing so would require painful workflow changes (“schlep”). Far more capable agentic models make adoption trivial: you just don’t need the human worker anymore.
  • Test-time compute overhang: GPT-4 can “think” for a few hundred tokens (equivalent to ~3 minutes of human thought). If models could think coherently for millions of tokens (months of working time), they’d gain enormous problem-solving ability — roughly equivalent to a model 3.5 OOMs larger.
    • Unlocking this requires learning “System 2” thinking tokens: error correction (“I made a mistake, let me reconsider”), planning (“here’s my plan of attack”), self-critique (“let me review my draft”).
    • This is the “unhobbling path” to agents — as opposed to the “scaling path” that just improves reliability (more “nines”).
  • Pre-training vs. self-play: Pre-training on internet text gives models rich world representations, but it’s sample-inefficient (like passively listening to a lecture). Real learning requires active engagement — trying problems, failing, discussing, and distilling insights.
    • Models are entering a regime where they can learn from self-play, synthetic data, and RL — like a student who has learned enough basics to start teaching themselves.
    • This is analogous to the difference between GPT-2 (preschooler) and GPT-4 (smart high schooler); scaling alone will produce another such jump by 2027–2028.
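
The back-of-envelope figures in this section can be checked with a few lines of arithmetic. The Python sketch below is illustrative only. It assumes cluster power keeps growing at the quoted 0.5 OOMs per year from the ~10 MW GPT-4 cluster in 2022, and it treats "a few hundred tokens" as 300 and "millions of tokens" as 1,000,000. Those two token counts and all function names are assumptions made for this example, not figures from the episode.

```python
import math

# Figures quoted in the summary above.
BASE_YEAR = 2022
BASE_POWER_MW = 10.0   # GPT-4-era cluster, ~10 MW
OOMS_PER_YEAR = 0.5    # half an order of magnitude per year


def projected_power_mw(year: int) -> float:
    """Cluster power in MW if growth holds at OOMS_PER_YEAR from the base year."""
    return BASE_POWER_MW * 10 ** (OOMS_PER_YEAR * (year - BASE_YEAR))


def annual_addon_revenue(subscribers: float, adoption: float, price_per_month: float) -> float:
    """Yearly revenue from a monthly add-on bought by a fraction of subscribers."""
    return subscribers * adoption * price_per_month * 12


def ooms_between(small: float, large: float) -> float:
    """Orders of magnitude separating two positive quantities."""
    return math.log10(large / small)


if __name__ == "__main__":
    for year in range(2022, 2031, 2):
        print(f"{year}: {projected_power_mw(year) / 1000:g} GW")

    revenue = annual_addon_revenue(300e6, 1 / 3, 100)
    print(f"Add-on revenue: ${revenue / 1e9:.0f}B per year")

    # Assumed token counts: 300 ("a few hundred") vs 1,000,000 ("millions").
    print(f"Test-time compute gap: {ooms_between(300, 1_000_000):.1f} OOMs")
```

Under these assumptions the projection reproduces the 10 MW, 100 MW, 1 GW, 10 GW, and 100 GW milestones for 2022 through 2030, the add-on revenue comes to about $120B per year (the same order as the $100B quoted), and the token ratio works out to about 3.5 OOMs.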

AGI 2028: The return of history

  • The intelligence explosion: Once AI can automate AI research itself, progress could accelerate dramatically — 100 million automated AI researchers running on inference clusters could compress a decade of ML progress into a year.
    • This then cascades into robotics, biology, materials science, and other fields — potentially compressing a century of technological progress into less than a decade.
    • A lead of even a few years could be as decisive as the US technological advantage in the first Gulf War (100:1 kill ratio from 20–30 years of lead in sensors, GPS, stealth, precision weapons).
    • Superintelligence applied to military technology could undermine nuclear deterrence entirely — e.g., millions of mosquito-sized drones finding and destroying nuclear submarines and mobile launchers.
  • Historical parallels: The post–Cold War period of peace and liberal democratic dominance is historically abnormal. The norm is intense great-power competition: during World War II, 50% of US GDP went to war production and borrowing exceeded 60% of GDP. The Seven Years’ War killed 20–30% of Prussia’s population; the Thirty Years’ War killed up to 50% of the population in parts of Germany.
    • People in the US have grown complacent; most don’t yet “feel” the trajectory. But exponential trends will become undeniable — like COVID in February 2020, when most of the world didn’t yet grasp what was coming.
  • Authoritarian implications of superintelligence: A regime with superintelligence could achieve perfect surveillance, perfect lie detection, and perfectly loyal security forces — eliminating dissent, coups, and reformers like Gorbachev. Truth could be permanently locked in by the party, with no pluralistic evolution of ideas.

Espionage and American AI superiority

  • Why clusters must be in the US (or allied democracies): Building AGI clusters in authoritarian states like the UAE creates irreversible security risks.
    • They could steal the model weights (a literal copy of the AGI, like stealing the atomic bomb design) or seize the compute outright.
    • Even a 25% compute share gives authoritarian states enormous leverage: against 100 million automated researchers on the US side, that share corresponds to about 33 million superintelligent AI researchers, who could design novel WMDs. A 3:1 compute ratio is dangerously close.
    • Middle Eastern states have financial capital but lack leading AI labs, talent, and hardware (export controls restrict their access to Nvidia chips). Their “seat at the AGI table” is being bought with money, not earned with capability.
  • Two paths to powering US clusters:
    • Natural gas: The US has ample natural gas; production has nearly doubled in a decade. A 10 GW cluster is a few percent of US natural gas output. 100 GW is doable with continued expansion. This conflicts with climate commitments made by Microsoft, Amazon, etc.
    • Green energy megaprojects: Solar, batteries, small modular reactors (SMRs), geothermal — but requires massive deregulation: FERC reform, NEPA exemptions, streamlined permitting, rights-of-way for transmission lines. Currently, hooking a solar installation to the grid can take years due to state-level regulations.
    • Ideally both paths are pursued; at least one is necessary.
  • The “they’ll go to China anyway” argument: Some claim that if the US doesn’t work with the UAE, the UAE will partner with China instead. Leopold is skeptical:
    • The UAE can’t translate money into AI progress on its own — it lacks labs, talent, and hardware.
    • There are reports that OpenAI leadership once planned to fund AGI by starting a bidding war between the US, China, and Russia — effectively selling AGI to authoritarian governments.
    • Benefit-sharing (offering last-gen models for civilian use) is reasonable, but giving authoritarian states a seat at the AGI development table is a profound strategic error.
  • Secrecy and algorithmic lead: The US has a significant algorithmic lead, and algorithmic progress runs at roughly 0.5 OOMs per year. If secrets are protected, this compounds: a few years of lead could mean a 10–100x effective compute advantage.
    • Weights theft: Stealing model weights is trivially easy — an employee at a US lab reportedly copied critical AI code to Apple Notes and exported it as a PDF, bypassing monitoring. Google has the best security (enterprise-grade); other labs have startup-level security.
    • Algorithmic secrets: The fundamental approaches (next-token pre-training, scaling laws, transformers, MoE) were all published openly until recently, which is why China can build decent models from Llama and other open-source work. If the next paradigm (e.g., self-play RL to get past the data wall) is kept secret, China could be stuck — like Nazi Germany going down the heavy water path instead of graphite for nuclear reactors.
    • Tacit knowledge: Large-scale engineering for training runs involves hard-won tacit knowledge, but China can likely figure this out. The critical thing is protecting the ideas — the next paradigm — not just the engineering details.
    • Why a 1–2 year lead matters enormously: Three years ago, models couldn’t solve competition-level math problems; at the current pace of progress, they now can. With a billion superintelligent researchers accelerating R&D, a year of lead could mean the difference between human-level and vastly superhuman AI, and decades of technological advantage.
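
The lead and compute-share figures in this section follow from simple arithmetic, sketched below in Python. This is a rough illustration under stated assumptions: algorithmic progress is taken to hold steady at the quoted 0.5 OOMs per year, the number of automated researchers is assumed to scale linearly with compute share, and the 100 million figure for the leading side is borrowed from the intelligence-explosion discussion earlier in the summary. The function names are hypothetical.

```python
ALGO_OOMS_PER_YEAR = 0.5  # algorithmic progress rate quoted in the summary


def effective_compute_multiplier(lead_years: float,
                                 ooms_per_year: float = ALGO_OOMS_PER_YEAR) -> float:
    """Effective compute advantage from an algorithmic lead of `lead_years`."""
    return 10 ** (ooms_per_year * lead_years)


def rival_researchers(leader_researchers: float, rival_share: float) -> float:
    """Researchers the rival can run, given its share of total compute.

    Assumes (for illustration) that researcher count scales linearly
    with compute and that only two parties split the total.
    """
    leader_share = 1.0 - rival_share
    return leader_researchers * rival_share / leader_share


if __name__ == "__main__":
    for years in (2, 4):
        print(f"{years}-year lead: {effective_compute_multiplier(years):g}x effective compute")

    rivals = rival_researchers(100e6, 0.25)
    print(f"25% rival share: about {rivals / 1e6:.0f} million researchers")
```

With these inputs, a two-year lead corresponds to a 10x advantage and a four-year lead to 100x, which brackets the 10–100x range above. A 25% share against 100 million researchers gives roughly 33 million, the 3:1 ratio described earlier in this section.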

Geopolitical implications

  • The stakes: What’s at stake is not just cool products but whether liberal democracy survives, whether the CCP survives, and what the world order looks like for the next century.
    • The CCP will eventually recognize superintelligence as decisive for national power and mount an all-out espionage and industrial effort — billions of dollars, thousands of people, full Ministry of State Security involvement.
    • China has enormous latent industrial capacity (it added as much power in the last decade as the entire US grid) and is already producing 7-nanometer chips despite export controls.
  • The danger of a tight race: If the US and China are neck-and-neck (e.g., a 3-month lead), the situation is incredibly dangerous: both sides rush, throw caution to the wind, and destabilizing new WMDs emerge every few weeks, making deterrence volatile.
    • A comfortable lead (6+ months to 2 years) gives the US “wiggle room” to dedicate compute to alignment, slow down if needed, and avoid catastrophic mistakes.
  • Why most people aren’t talking about this: Being “in the trenches” of AI development gives a myopic view — researchers focus on the next model, the next benchmark, the next data problem. Zooming out just a few years reveals the exponential trajectory. Most people outside SF don’t yet “feel” it, just as most of the world didn’t grasp COVID until March 2020.
    • Once AGI becomes undeniable, societal reaction will be radical and fast — like Congress spending over 10% of GDP on COVID within weeks.