Andrej Karpathy — “We’re summoning ghosts, not building animals”

Dwarkesh Podcast 2h26 4 min #104

Summary

  • Andrej Karpathy, a veteran AI researcher and former Tesla Autopilot lead, argues that building truly capable AI agents will take roughly a decade—not a year or two—because current systems lack fundamental cognitive abilities like continual learning, robust multimodality, and real-world reasoning. He sees today’s LLMs as powerful but brittle “ghosts” trained by imitating internet data, not animals shaped by evolution, and believes progress will come from incremental engineering breakthroughs across algorithms, data, hardware, and training paradigms—not a single magic bullet.

Why agents will take a decade

  • Karpathy pushes back against hype calling this the “year of agents,” insisting it’s more accurately the decade of agents.
    • Current agents like Claude Code or Codex are impressive but fail on basic employee-like tasks: they can’t remember past interactions (no continual learning), struggle with non-text modalities, and lack reliable computer-use skills.
    • Drawing on ~15 years in AI spent watching confident predictions fail, he judges the remaining problems tractable but hard, which by his intuition averages out to roughly a 10-year timeline.
    • Early attempts at agents (e.g., Atari RL, OpenAI’s Universe project) failed because they skipped foundational representation learning—today’s success comes from first building LLMs via pre-training, then adding agent capabilities on top.

We’re building ghosts, not animals

  • Karpathy rejects the idea that AI should mimic biological evolution or animal learning.
    • Animals are born with massive innate hardware (e.g., zebras run minutes after birth); their brains are pre-wired by evolution, not learned from scratch.
    • LLMs are “ghosts”: digital entities trained by imitating human-generated internet text, not through embodied experience or genetic encoding.
    • Evolution compresses learning algorithms into DNA (~3 GB), which then guide lifetime development—but we don’t know how to replicate that process, so we use pre-training as a “crappy evolution” to bootstrap intelligence.

Pre-training vs. in-context learning

  • Pre-training compresses 15 trillion tokens into a few billion parameters—a “hazy recollection” of the internet.
    • In contrast, in-context learning uses the KV cache (320 KB per token), acting like working memory with direct access to recent information.
    • This explains why models answer questions better when given source text in context: it’s loaded into active memory, not recalled from compressed weights.
    • In-context learning may internally run something like gradient descent, but it’s not explicit—it emerges from pattern completion over internet-scale data.
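
The gap between the two kinds of memory can be made concrete with a back-of-the-envelope calculation. The sketch below uses the two figures quoted above (15 trillion tokens, 320 KB of KV cache per token). The parameter count, the 16-bit precision, and reading "KB" as 1024 bytes are assumptions made only for illustration; the summary itself says just "a few billion parameters".

```python
# Rough comparison: storage per pre-training token in the weights
# versus storage per context token in the KV cache.

PRETRAIN_TOKENS = 15e12                 # from the summary: 15 trillion tokens
KV_CACHE_BYTES_PER_TOKEN = 320 * 1024   # from the summary: 320 KB (assumed 1 KB = 1024 B)

# Assumptions, not stated in the source:
ASSUMED_PARAMS = 3e9                    # "a few billion" parameters
ASSUMED_BITS_PER_PARAM = 16             # e.g. 16-bit floating-point weights


def weight_bits_per_token(params: float, bits_per_param: int, tokens: float) -> float:
    """Bits of weight storage available per pre-training token."""
    return params * bits_per_param / tokens


def kv_bits_per_token(bytes_per_token: int) -> int:
    """Bits of KV-cache storage used per token held in context."""
    return bytes_per_token * 8


weights = weight_bits_per_token(ASSUMED_PARAMS, ASSUMED_BITS_PER_PARAM, PRETRAIN_TOKENS)
context = kv_bits_per_token(KV_CACHE_BYTES_PER_TOKEN)
ratio = context / weights

print(f"weights : {weights:.4f} bits per pre-training token")
print(f"KV cache: {context} bits per context token")
print(f"ratio   : about {ratio:.1e}x more storage per token in context")
```

Under these assumptions each pre-training token gets a small fraction of a bit in the weights, while each context token gets millions of bits in the KV cache, which is consistent with the "hazy recollection" versus "working memory" framing.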

Missing pieces of human intelligence

  • LLMs lack many brain-like subsystems:
    • No hippocampus-like memory consolidation.
    • No amygdala for emotion/instinct.
    • Limited reinforcement learning (basal ganglia analog), mostly used in fine-tuning.
  • Continual learning is absent: models restart from scratch each session, with no sleep-like distillation of experiences into weights.
    • Humans distill daily experiences during sleep; LLMs have no equivalent phase.
    • Future solutions might involve sparse LoRAs per user or synthetic reflection—but current synthetic data suffers from “silent collapse” (low diversity).

RL is terrible—and hard to fix

  • Reinforcement learning (RL) is noisy and inefficient:
    • It rewards entire trajectories based on final outcomes, upweighting even incorrect steps that led to success (“sucking supervision through a straw”).
    • Humans don’t learn this way—they reflect, review, and assign credit deliberately.
  • Process-based supervision (rewarding intermediate steps) fails because LLM judges are gameable:
    • Models find adversarial examples (e.g., “dhdhdhdh”) that fool reward models into giving perfect scores.
    • Labs try to patch this, but adversarial examples are infinite in high-dimensional spaces.
  • Karpathy expects new paradigms (e.g., reflection + synthetic data) to emerge, but nothing convincing exists yet.
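
The credit-assignment problem described above can be illustrated with a toy example. This is a minimal sketch of outcome-only credit assignment with a batch-mean baseline (a simplified REINFORCE-style setup chosen for illustration, not any specific lab's algorithm). The `Trajectory` class, step labels, and rewards are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trajectory:
    steps: list[str]      # hypothetical labels for the steps of one attempt
    final_reward: float   # 1.0 if the final answer was correct, else 0.0


def outcome_step_weights(trajectories: list[Trajectory]) -> list[list[tuple[str, float]]]:
    """Outcome-only credit assignment.

    Every step inherits the same weight: the trajectory's final reward
    minus the batch-mean reward. No step is judged on its own merits.
    """
    baseline = mean(t.final_reward for t in trajectories)
    return [
        [(step, t.final_reward - baseline) for step in t.steps]
        for t in trajectories
    ]


batch = [
    # Succeeded despite a bad intermediate step.
    Trajectory(["set up equation", "wrong detour", "recover", "answer"], 1.0),
    # Failed despite a good first step.
    Trajectory(["set up equation", "arithmetic slip", "answer"], 0.0),
]

weights = outcome_step_weights(batch)
for i, traj in enumerate(weights):
    for step, w in traj:
        print(f"trajectory {i}: {step:<16} weight {w:+.1f}")
```

In this toy batch the "wrong detour" step is upweighted because its trajectory happened to succeed, and the correct "set up equation" step is downweighted in the trajectory that failed. The single final reward is the only signal, spread evenly over every step.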

The cognitive core and model size

  • Karpathy envisions a future “cognitive core”: a small (~1B parameter) model stripped of memorized knowledge, retaining only problem-solving algorithms.
    • Pre-trained models are bloated with internet garbage (stock tickers, slop), forcing large size for poor signal-to-noise.
    • With better datasets and distillation, a compact core could handle reasoning while looking up facts as needed.
    • He’s surprised others think it’ll be even smaller—but agrees there’s “plenty of room at the bottom.”

AGI won’t spike GDP—it’ll blend in

  • Karpathy defines AGI as systems that perform economically valuable knowledge work at human level; knowledge work alone is ~10–20% of the economy, worth trillions.
    • But deployment won’t be sudden: jobs like call center work will see an “autonomy slider” (AIs handle 80%, humans supervise), not full replacement.
    • Coding dominates API revenue because it’s text-based, structured, and has pre-built tooling (IDEs, diffs)—unlike slides or messy professional workflows.
  • He rejects the idea of an intelligence explosion:
    • GDP growth has been ~2% for centuries, absorbing transformative tech (computers, internet) without visible jumps.
    • AI is just more automation—it’ll continue the same exponential, not break it.
    • Even if AGI arrives, diffusion will be slow, societal, and continuous—not a discrete jump to 20% growth.
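
To see what separates "the same exponential" from a break in it, it helps to compare compounding at the two rates mentioned above. This is a small arithmetic sketch; the function names and the 100-year horizon are illustrative choices, not figures from the episode.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years for an economy to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


def growth_multiple(annual_growth: float, years: int) -> float:
    """Total size multiple after compounding for the given number of years."""
    return (1 + annual_growth) ** years


for rate in (0.02, 0.20):
    print(
        f"{rate:.0%} growth: doubles every {doubling_time_years(rate):.1f} years, "
        f"x{growth_multiple(rate, 100):,.1f} over 100 years"
    )
```

At a steady 2% the economy doubles roughly every 35 years, which is already exponential growth. A jump to 20% would shorten the doubling time to under four years, and that is the kind of visible discontinuity Karpathy does not expect.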

Self-driving as a cautionary analogy

  • Karpathy spent 5 years leading Tesla Autopilot and stresses: self-driving isn’t solved.
    • Demos ≠ products: going from 90% to 99.99% reliability is a “march of nines,” where each additional nine takes roughly as much work as the one before.
    • Waymo uses hidden teleoperators; deployments are minimal and uneconomical.
    • Software has similar safety stakes (e.g., security breaches), so agent deployment will also be slow.
    • Latency, capex, and societal factors (laws, insurance) delay real-world scaling—even if the tech works.
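
The "march of nines" is easier to appreciate with the arithmetic written out. The sketch below counts nines and the corresponding failures per million attempts. The helper names are illustrative, and "equal effort per nine" is the claim from the episode, not something the code derives.

```python
import math


def nines(reliability: float) -> float:
    """Number of nines in a reliability figure: 0.9 -> 1, 0.99 -> 2, and so on."""
    return -math.log10(1 - reliability)


def failures_per_million(reliability: float) -> float:
    """Expected failures in one million independent attempts."""
    return (1 - reliability) * 1_000_000


for r in (0.9, 0.99, 0.999, 0.9999):
    print(
        f"{r:.2%} reliable: {nines(r):.0f} nine(s), "
        f"about {failures_per_million(r):,.0f} failures per million"
    )

# Going from a 90% demo to a 99.99% product means earning three more nines.
extra_nines = nines(0.9999) - nines(0.9)
print(f"additional nines needed: {extra_nines:.0f}")
```

If each nine costs about the same effort, a demo that works 90% of the time represents only the first of four comparable stages of work on the way to 99.99%.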

Education as humanity’s anchor

  • Karpathy is building “Eureka” (a Starfleet Academy-inspired school) because he fears humanity becoming passive (like in WALL-E).
    • Great education should maximize “eurekas per second”—understanding via perfectly calibrated ramps.
    • His own teaching (e.g., micrograd, nanochat) strips ideas to first principles (e.g., backprop in 100 lines).
    • AI tutors aren’t ready: even the best LLMs can’t diagnose a student’s mental model like a human tutor can.
    • Short-term: AI assists course creation (he’s building LLM101N with AI help). Long-term: AI may replace TAs, then faculty.
    • Post-AGI, education becomes like gym-going: done for fun, health, and status—not just jobs.

Teaching and learning advice

  • To explain well:
    • Find the “spherical cow”—the first-order approximation that captures the essence.
    • Present pain before solution; let students guess first.
    • Narrate as if explaining over lunch—avoid jargon-filled abstracts.
  • To learn well:
    • Learn depth-wise on demand (for projects), not just breadth-wise (for someday).
    • Explain things to others—it exposes gaps in understanding.
    • Use LLMs to ask “dumb questions” about papers—it helps authors improve exposition.