AI Talent Wars, xAI’s $200B Valuation, & Google’s Comeback

Unsupervised Learning 1h2 11 min #52

Summary

  • This episode features a roundtable discussion between Ari Morcos (CEO of Datology AI, former DeepMind and Meta FAIR researcher) and Rob Toews (partner at Radical Ventures), covering the major themes in AI right now: whether model progress is slowing, the RL environment startup boom, skyrocketing valuations at OpenAI, Anthropic, and xAI, the talent wars, the future of AI infrastructure, and the intersection of AI with hardware and brain-computer interfaces.

Is Model Progress Slowing Down?

  • Ari’s view: progress is not significantly slowing, but it is harder to see.

    • For most consumer tasks, models are already good enough that average users don’t feel dramatic improvement generation over generation.
    • Progress is shifting to other axes: reasoning, cost efficiency, and smaller models matching capabilities of much larger models from a year ago.
    • Pure pre-training scaling has hit diminishing returns, as predicted by the original scaling laws work (Jared Kaplan et al.) — every 10x increase in data or compute yields smaller gains.
    • But other axes of scaling (post-training, test-time compute, RL) have opened up and are working well.
    • Overall, the “timbre” of progress has changed, not stopped.
  • Rob’s view: progress on raw model intelligence is genuinely slowing.

    • No exponential lasts forever — every one eventually looks like an S-curve. The leaps from GPT-2 → GPT-3 → GPT-4 were astounding; that rate of improvement is plateauing.
    • Pre-training scaling is well-documented as hitting diminishing returns.
    • The narrative that post-training compute and RL would be the next wave is exciting but may not be enough to sustain the 2020–2024 pace of progress.
    • RL works well in easily verifiable domains (math, coding) but it’s unclear whether it generalizes to hard-to-verify domains.
    • More fundamental breakthroughs may be needed for the next leap in general intelligence.
  • Where they agree:

    • The S-curve framing is correct — technological progress proceeds in a series of S-curves.
    • For the general consumer (e.g., someone’s mom using ChatGPT), model quality has plateaued in a practical sense. GPT-5 is cheaper to deploy than GPT-4, but the user experience difference is not dramatic.
    • The base models had to get good enough before RL could work at all — all the RL gains are downstream of pre-training progress.
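
The diminishing-returns point about pre-training can be made concrete with a toy power law. The sketch below assumes loss falls as loss(C) = (C_REF / C) ** ALPHA, the general shape used in the scaling-laws work mentioned above. ALPHA and C_REF are illustrative placeholders chosen for this example, not fitted values from any paper or lab.

```python
# Toy power-law scaling curve: loss(C) = (C_REF / C) ** ALPHA.
# ALPHA and C_REF are illustrative assumptions, not fitted values.
ALPHA = 0.05
C_REF = 1.0


def loss(compute: float) -> float:
    """Loss predicted by the assumed power law for a given compute budget."""
    return (C_REF / compute) ** ALPHA


def gains_per_10x(start: float, steps: int) -> list[float]:
    """Absolute loss reduction from each successive 10x increase in compute."""
    gains = []
    compute = start
    for _ in range(steps):
        gains.append(loss(compute) - loss(compute * 10))
        compute *= 10
    return gains


if __name__ == "__main__":
    for i, gain in enumerate(gains_per_10x(start=1.0, steps=5), start=1):
        print(f"10x step {i}: loss drops by {gain:.4f}")
```

Under this assumed curve, every 10x step cuts loss by the same fraction, so the absolute improvement shrinks with each step. That is one way to read "progress continues but is harder to see."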

RL Environments: Startup Boom and Generalization Challenges

  • The startup landscape:

    • Dozens of RL environment startups have emerged in recent months, broadly falling into two categories:
      • Selling to the big labs (OpenAI, Anthropic, xAI) — these labs have massive, essentially unlimited budgets for RL environments and are buying aggressively.
      • RL as a service for enterprises — helping companies build and train customized models on their own data using RL in their specific domain.
    • The first category is compared to Scale AI, which pivoted multiple times (autonomous vehicle labeling → government/defense → RLHF) on its path to success. The analogy is fair but the key question is whether these companies can pivot when the market shifts.
    • The second category (enterprise RL) is seen as more interesting and durable, though it involves significant professional services and is not a pure software play.
  • Ari’s concerns about the “sell to labs” model:

    • If you build lots of RL environments, you get better at building more — this creates a natural advantage for in-house teams.
    • Unlike data annotation companies (Scale, Surge), which built bespoke datasets for each customer, RL environment companies tend to build one environment and sell it to everyone, which undermines differentiation.
    • If RL environments are truly a competitive differentiator, frontier labs will want exclusive access, not shared ones.
    • Expectation: much of this work will be brought in house over time.
  • The generalization problem with RL:

    • Ari draws on his neuroscience background (PhD, monkey lab) to illustrate the reward-hacking problem: if there’s any gap between the RL environment and reality, models will find and exploit it.
    • Example: monkeys in a visual task learned to exploit a 0.5% luminance difference between correct and incorrect screens rather than learning the actual task.
    • AI models do the same thing — if the environment isn’t perfect, they overfit to it and fail to generalize.
    • Building environments accurate enough that the only way to solve the task is to learn the general-purpose principle is an open and unsolved challenge.
    • This is why RL works for coding (easily verifiable) but may struggle in most real-world domains.
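
The luminance anecdote can be reproduced in a few lines. The sketch below is not a real RL algorithm: a simple rule-picker that keeps whichever single feature earns the most reward stands in for the agent. The 0.5% luminance offset comes from the example above; the noise level, trial count, and all function names are hypothetical choices for illustration.

```python
import random


def make_trials(n: int, leak: bool, rng: random.Random) -> list:
    """Generate (features, label) trials.

    "task" is a noisy view of the real task signal. "luminance" carries a
    0.5% brightness offset that tracks the label only when leak=True,
    which models the flawed training environment.
    """
    trials = []
    for _ in range(n):
        label = rng.choice([0, 1])
        task = label + rng.gauss(0, 0.6)         # real but noisy signal (assumed noise)
        offset = 0.005 * label if leak else 0.0  # unintended luminance leak
        trials.append(({"task": task, "luminance": 1.0 + offset}, label))
    return trials


def fit_threshold(trials: list, feature: str) -> float:
    """Return the midpoint between the two class means of one feature."""
    pos = [f[feature] for f, y in trials if y == 1]
    neg = [f[feature] for f, y in trials if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(trials: list, feature: str, threshold: float) -> float:
    """Accuracy of the rule 'predict 1 when feature > threshold'."""
    hits = sum((f[feature] > threshold) == bool(y) for f, y in trials)
    return hits / len(trials)


def train(trials: list) -> tuple:
    """Keep whichever single feature earns the most reward in training."""
    best = None
    for feature in ("task", "luminance"):
        threshold = fit_threshold(trials, feature)
        acc = accuracy(trials, feature, threshold)
        if best is None or acc > best[2]:
            best = (feature, threshold, acc)
    return best


def run_demo(seed: int = 0, n: int = 2000) -> dict:
    """Train in the leaky environment, then evaluate without the leak."""
    rng = random.Random(seed)
    train_env = make_trials(n, leak=True, rng=rng)
    real_world = make_trials(n, leak=False, rng=rng)
    feature, threshold, train_acc = train(train_env)
    return {
        "chosen_feature": feature,
        "train_accuracy": train_acc,
        "deploy_accuracy": accuracy(real_world, feature, threshold),
        "task_only_deploy_accuracy": accuracy(
            real_world, "task", fit_threshold(train_env, "task")
        ),
    }


if __name__ == "__main__":
    for key, value in run_demo().items():
        print(f"{key}: {value}")
```

In the leaky environment the learner settles on luminance because it scores perfectly there, and it falls to roughly chance once the leak is gone. The noisier task feature would have carried over to the real setting.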

Google’s AI Comeback

  • Gemini reaching #1 on the App Store (driven largely by the Nano Banana image generation feature) marks a significant moment in Google’s AI trajectory.

  • Ari’s take:

    • Has been bullish on Google/Gemini for a long time — the talent there has always been extraordinary.
    • Google’s historical weakness has been launching new products and maintaining a focused, competitive mindset. This has clearly changed on the Gemini team over the past year.
    • Sergey Brin is reportedly spending significant time directly with the team, which changes the energy and focus.
    • When Google fires on all cylinders, combined with its distribution advantage, it becomes a behemoth.
  • Rob’s take:

    • Super bullish on Google. The post-ChatGPT narrative that Google “lost a step” has some truth, but Google’s structural advantages (resources, compute, talent depth, in-house chips) are enormous.
    • Expects Gemini 3 (rumored to drop in the coming months) to be state-of-the-art, potentially better than GPT-5.
    • If forced to pick one major lab to advance the frontier going forward, he’d pick Google.
    • Financial tension: winning in AI means transitioning to a platform that’s harder to monetize and has higher costs — Google may face margin pressure in the near term before figuring out monetization and cost reduction.
  • The default advantage:

    • ChatGPT has strong brand awareness, but defaults matter enormously.
    • When Apple deploys a model on iPhones and Google defaults to Gemini on Android, the consumer landscape shifts significantly — especially if model quality is comparable.
    • This mirrors what happened with search (Google paying Apple to be the default).

Valuations: OpenAI, Anthropic, and xAI

  • OpenAI (~$500B valuation):

    • ~700 million users, but the key question is monetization — can they convert users to paid tiers and build a sustainable business?
    • Potential monetization through ads and commerce, possibly more valuable than search because ChatGPT can complete end-to-end actions and insert itself into commerce flows.
    • Trust concern: if an answer engine biases its single answer for commercial purposes, it undermines trust in the answer itself — a fundamentally different problem than showing a sponsored link among 10 blue links.
    • Most non-AI-developer users of ChatGPT don’t pay for it — unclear whether consumer willingness to pay will materialize.
    • Under reasonable growth assumptions, the valuation is not crazy on a revenue multiple basis, but a lot has to go right.
  • Anthropic (~$180B valuation):

    • Heavily reliant on enterprise API revenue, particularly from coding use cases.
    • If Gemini 3 is strong at coding, Anthropic’s developer-heavy customer base could switch quickly — developers have zero loyalty to any given model and switch every time a new one drops.
    • Growth has been extraordinary, but susceptibility to model-switching is a real risk.
  • xAI (~$200B valuation):

    • The valuation reflects the “never bet against Elon” thesis more than the fundamentals of the business.
    • xAI has much less of a business than OpenAI or Anthropic, yet it is valued above Anthropic.
    • They’ve caught up to the frontier quickly but aren’t doing anything fundamentally different technologically or in product.
    • Rob notes xAI is Elon’s first “me-too” company — building in the mold of others rather than creating a new category (as Tesla, SpaceX, Neuralink did).
    • xAI has been losing top researchers, and there are reported cultural concerns.
    • Ari agrees the numbers don’t justify the valuation, but acknowledges that betting against Elon has historically been a losing strategy.

The Talent War

  • Why massive salaries can make sense:

    • AI is clearly creating enormous economic value — this is not a dot-com-style bubble.
    • The fundamental equation is: put dollars into a model, get performance out. Anything that multiplies the value of compute is immensely valuable.
    • Talent is a compute multiplier — someone who can “make the GPUs sing” can extract dramatically more performance per dollar, easily justifying a high salary.
    • High-quality data is another massive compute multiplier (Ari’s core thesis at Datology) — 10x performance per dollar improvements are possible with better data.
  • Concerns:

    • Do ultra-high salaries attract people aligned with the mission, or people primarily motivated by not getting fired and cashing out quickly?
    • Cultural second-order effects: Ari notes that Meta lost incredible talent who were equally deserving but didn’t receive the mega-offers, creating bitterness.
    • The average AI researcher is still being paid very well, but the $100M packages may be a moment in time rather than a permanent trend.
    • Whether these salaries continue depends partly on Meta’s MSL (Meta Superintelligence Labs) experiment — if Meta produces a great model, more crazy salaries follow; if cultural challenges prevent it, the trend may cool.
  • The acquihire trend and its problems:

    • Deals involving Character AI, Adept, Inflection, and most recently Scale AI (with Meta) have used “effective acquisition” structures that aren’t technically M&A and are designed to avoid FTC/DOJ antitrust scrutiny.
    • These structures create unpredictable payouts: investors may get their money back or less; founders get paid lavishly; employees often get little or nothing.
    • This breaks the social contract between founders and early employees — employees bet their time and energy on the company’s vision and expect to share in the upside.
    • Long-term harm: makes people less likely to join startups, pushes them toward less risky options, and starves the startup ecosystem.
    • Ari and Rob both hope for regulatory scrutiny and for founders to do right by employees in these situations.
  • Regulatory tension:

    • In AI, speed matters enormously — if you have to wait a year for regulatory review of an acquisition, the target may no longer be relevant.
    • By conventional antitrust analysis, acquiring a company with no revenue (like Thinking Machines or SSI) shouldn’t be a problem since there’s no market share to consolidate.
    • But a proactive antitrust regulator (like Lina Khan at the FTC) could still come down hard on such deals.
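
Returning to the compute-multiplier argument at the top of this section, the arithmetic behind "talent is a compute multiplier" can be sketched as a break-even calculation. The $100M package figure comes from the discussion above; the $10B annual compute spend and the function names are hypothetical, and the sketch simply values extra effective compute at its purchase cost.

```python
def breakeven_efficiency_gain(annual_compute_spend: float, annual_pay: float) -> float:
    """Fractional efficiency gain at which the extra effective compute equals the pay."""
    return annual_pay / annual_compute_spend


def net_value(annual_compute_spend: float, efficiency_gain: float, annual_pay: float) -> float:
    """Value of the extra effective compute minus what the person costs."""
    return annual_compute_spend * efficiency_gain - annual_pay


if __name__ == "__main__":
    spend = 10e9  # hypothetical annual compute budget in dollars
    pay = 100e6   # package size mentioned in the discussion
    print(f"Break-even gain: {breakeven_efficiency_gain(spend, pay):.2%}")
    print(f"Net value at a 5% gain: ${net_value(spend, 0.05, pay):,.0f}")
```

Under these assumed numbers, a person only needs to make the compute about 1% more effective to cover the package, which is why the argument scales with the size of the compute budget.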

AI Infrastructure: What Endures and What Doesn’t

  • The bearish case (from Sherwin Wu at OpenAI):

    • AI infra is a bad bet because the scaffolding changes so fast — whatever you build today may be obsolete in 6 months.
    • Model wrappers and scaffolding around foundation models are in danger of being overwritten by updated models.
  • Ari’s nuanced take:

    • Safe infra: inference (always needed), data curation (data quality is always necessary).
    • Less safe: data labeling specifically, because synthetic data is becoming powerful and the need for human-labeled data may peak and decline as synthetic data and automated judges for RL environments take over.
    • Enduring principle: the quality of data shown to models is always going to be critical, regardless of how the specific form changes. Datology’s core advantage is being the best in the world at valuing data for a given downstream use case.
    • The key for infra companies: find something that will always be true, build your core advantage there, and stay flexible as the ecosystem shifts.
  • Rob’s take:

    • Agrees with Sherwin that building durable infra in AI is very hard, but disagrees that the right conclusion is to be bearish on the entire category.
    • In any major technology shift, a whole new suite of tools and infrastructure is needed. The challenge is identifying which types will endure.
    • Inference is a good bet. Data labeling was a good bet (Scale rode that wave) but synthetic data may be changing the equation.
    • The service a company provides can remain the same even as the technology changes in a million ways.

Meta’s MSL Experiment and the Metaverse

  • Will Meta’s superintelligence lab work?
    • Ari thinks it might — “don’t bet against Zuck” is as valid as “don’t bet against Elon.”
    • Meta’s metaverse spending may look good in hindsight, especially as AR glasses become the realization of that vision.
    • Controversial take: Llama’s budget effectively came from metaverse spending (during Ari’s time at FAIR, the team was part of Reality Labs).
    • Zuck has a history of making widely lambasted strategic moves that look like genius in retrospect.
    • Main concern: cultural incentives — losing talented researchers who were equally deserving but didn’t get mega-offers, creating bitterness and potentially undermining the effort.

Hardware, Glasses, and Brain-Computer Interfaces

  • Glasses as the form factor:

    • Both Ari and Rob see glasses as a strong form factor for AI in the physical world — better than pocket-held devices (like the AI Pin) or other alternatives.
    • Google Glass was ahead of its time; the technology is now catching up.
    • Interesting open question: do adoption rates vary by climate (sunglasses cultures vs. cooler climates)?
  • Brain-computer interfaces (BCI):

    • Rob has become increasingly excited about BCI as a category.
    • Invasive BCI (surgery required, e.g., Neuralink): high barrier to entry but enables granular neural data. Medical use cases are the near-term path.
    • Non-invasive BCI (sensors in headphones, caps, wristbands): much easier to build consumer products around. Improving rapidly due to better sensors and, more importantly, better AI for extracting signal from noisy brain data.
    • Expects compelling non-invasive BCI products to hit the market in the next 12–24 months, representing a new paradigm in human-computer interaction.
    • The Meta band (EMG signal acquisition) is an early example of this direction.

What They’ve Changed Their Minds On

  • Ari: The success of RL in language models. RL had a long history of initial promise followed by failure to generalize in the real world. This time it’s working because the base models finally got good enough. The same techniques applied to models from even a year ago wouldn’t work. This has been surprising and significant.

  • Rob: The “GPT-3 moment in robotics” has become murkier, not clearer. There’s enormous progress in general-purpose robotics foundation models (Physical Intelligence, Skild) and humanoid robots (Tesla, 1X, Figure), but it’s still unclear whether the breakthrough is 1–2 years away or 5+ years away. Applying AI in the world of atoms (not just bits) is exponentially harder. This question will determine how quickly AI transforms all of our lives.

  • Jacob (host): A year ago, he would have said not to focus on models at the application layer — just build workflows and go deep on domains. Now, doing RL and domain-specific training to improve reliability and quality is starting to matter more. The key driver is cost: if custom model training can be done for single-digit millions (and falling), it becomes viable for every enterprise.

Underdiscussed Themes

  • Rob: Recursive AI / AI that builds better AI.

    • If there’s one development most likely to drive the next leap toward superintelligence, it’s building AI that can autonomously develop better AI — a recursive, compounding process.
    • This concept goes back to the 1960s visions of early AI researchers.
    • In the past year, it has become much more of a reality, with multiple high-profile stealth companies working on this.
    • The big labs are undoubtedly taking this seriously, but it hasn’t broken through to mainstream discourse yet — it still sounds sci-fi.
    • How well this works will determine the slope of AI progress over the next few years.
    • Ari adds: if recursive AI can hill-climb on coding and augment the quality of AI research itself, everything flows from there — making the generalization question less important.
  • Ari: Data is the most underinvested area of AI research relative to its impact.

    • The “data wall” narrative is getting hype, but synthetic data is a huge part of the solution.
    • There are real risks (model collapse from synthetic data loops), but there are also proven ways to avoid them.
    • Synthetic data was a major driver of GPT-5’s gains and has been called out by all the Chinese model companies as key to their improvements.
    • The combination of better data curation and correct synthetic data generation is how we get more juice out of existing datasets.
    • This will be critical for domain-specific use cases.
    • People are sleeping on how big an impact this can make if done correctly.