AI Will Create the Next 1 Billion Software Creators — Unsupervised Learning

Amjad Masad, co-founder and CEO of Replit, joins the Unsupervised Learning podcast to discuss how AI is reshaping software development, the future of coding, and why Replit built its own AI model. Replit, valued at over $1.2B after raising nearly $100M, powers over 20 million developers and is embedding AI deeply into its platform to lower barriers to building software. The conversation spans advice for new coders, the bifurcation of software engineering roles, Replit’s AI strategy, the state of open source models, agentic workflows, and the competitive landscape dominated by Microsoft.

Learning to code in the age of AI

The best way to learn coding has always been by making things, not through abstract academic study—this aligns with how intelligence works: goal-directed learning.
LLMs take “learning by doing” to its extreme: users can get a working prototype in minutes by prompting or forking templates, avoiding setup drudgery.
New coders should start with a project idea, then learn incrementally by iterating, debugging, and using AI as a co-pilot across tools (Replit, GitHub, Stack Overflow, multiple LLMs).

The future of software engineering roles

Software creation is bifurcating into two paths:
- Product engineers/creators: focused on building user-facing apps, iterating via prompts, and shipping fast—may not care deeply about traditional coding.
- Traditional software engineers: working on infrastructure, backend systems, cloud pipelines—this path remains more stable and still benefits from CS degrees.
AI will accelerate the rise of the “product creator,” especially entrepreneurs who want to build and ship without deep coding expertise.

How Replit uses AI

Replit renamed its AI feature from “Ghostwriter” to Replit AI, embedding it natively into the product rather than treating it as an add-on.
Their philosophy: AI should be foundational, not bolted on—every interaction should be AI-aided from the first keystroke.
Key AI features include:
- Code suggestions while typing (like GitHub Copilot’s ghost text).
- File generation via right-click + prompt.
- AI debugging: one-click error explanation with context-aware chat.
- AI workflows sprinkled throughout the product (e.g., fixing errors, generating tests).
AI is included in the free tier to ensure broad access and align with Replit’s mission.

AI’s current coding capabilities

LLMs are fundamentally data compression and interpolation machines—their power comes from the quality, diversity, and freshness of training data.
Coding ability improves not just with code data, but also with adjacent reasoning data (e.g., scientific papers, legal text).
Open source coding tokens are becoming scarce; companies like OpenAI use proprietary human-annotated data at scale to train models such as GPT-4.
Replit benefits from proprietary user-generated code, giving them unique training data.
Expect 2–3 more years of coding capability gains through better data and scale.

What makes the best training data for code models

Size, freshness, diversity, and quality matter most.
High-quality data comes from top programmers, not just popular repositories (which skew toward infrastructure, not applications).
Replit’s user base generates high-quality application code, which is rarer and more valuable than library code.
Small models (e.g., Replit’s 3B parameter model) can be trained effectively on high-quality data with multiple epochs.

Who benefits most from AI coding tools?

Beginners see the highest ROI: Replit users have gone from zero to launching startups in months.
Studies (e.g., a study of BCG consultants) show AI helps lower-skilled users more, but this may change once users are trained in prompting techniques (e.g., chain-of-thought, model switching).
Advanced developers who combine coding skill with sophisticated AI usage will likely pull ahead over time.

User education and adoption

Younger users adapt faster—they build mental models of AI capabilities intuitively.
Teachers initially resisted AI in classrooms (“you destroyed my classroom”), but many soon saw students learning better and faster.
Replit’s stance: AI is like the calculator—resistance is futile; better to learn how to use it early.

Structuring AI teams

Replit uses a horizontal AI team (7–8 people) that serves the whole product, not siloed verticals.
Belief: AI will touch every part of software, so teams should be built around AI-native workflows from day one.
AI adoption has been fast in coding (e.g., Copilot’s corporate uptake) but slower in consumer applications (e.g., Siri still can’t handle complex tasks).

Why Replit built its own model

Commercial models (e.g., GPT-4) couldn’t meet Replit’s latency and cost requirements, especially for free-tier users.
Open source models (e.g., Salesforce’s 3B model) proved small models could be capable.
Training their own 3B model cost only $100K—a small investment for strategic control.
Now, Replit uses a hybrid approach: in-house models for latency-sensitive tasks (e.g., autocomplete), commercial models for others.
Strategic reason: if you’re an AI company, you need internal talent and control over your stack.
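
A minimal sketch of the hybrid routing idea in this section, assuming a per-task latency budget decides which model serves a request. The task names, backend names, and latency figures are hypothetical placeholders for illustration, not Replit's actual configuration.

```python
from dataclasses import dataclass

# Hypothetical latency budgets per task, in milliseconds (illustrative only).
LATENCY_BUDGET_MS = {
    "autocomplete": 300,     # must feel instant while typing
    "explain_error": 3000,   # a chat-style wait is acceptable
    "generate_file": 3000,
}
DEFAULT_BUDGET_MS = 3000


@dataclass(frozen=True)
class Backend:
    name: str
    typical_latency_ms: int


# Assumed backends: a small in-house model and a larger commercial one.
IN_HOUSE = Backend("in-house-small-model", typical_latency_ms=200)
COMMERCIAL = Backend("commercial-large-model", typical_latency_ms=2500)


def pick_backend(task: str) -> Backend:
    """Use the commercial model unless it would exceed the task's latency budget."""
    budget = LATENCY_BUDGET_MS.get(task, DEFAULT_BUDGET_MS)
    if COMMERCIAL.typical_latency_ms <= budget:
        return COMMERCIAL
    return IN_HOUSE
```

Under these assumed numbers, `pick_backend("autocomplete")` returns the in-house backend, while `pick_backend("explain_error")` returns the commercial one.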

The myth of “open source” models

Current “open source” models (e.g., LLaMA) aren’t truly open—you can’t reproduce them without the original training data and infrastructure.
Analogy: it’s like getting Linux binaries without source code or a compiler.
Companies depending on open source models are at the mercy of Meta’s goodwill (or Zuckerberg’s mood).
Long-term, companies should treat open source models like commercial ones: useful for prototyping, but not for core dependencies.
True open source requires reproducible training, community contributions, and a flywheel of improvement.

The world is getting weirder

AI-generated media (e.g., fake Kim Kardashian/Taylor Swift calculus video) is blurring reality.
We’re moving toward hyperreality (Baudrillard’s concept), where distinguishing real from synthetic becomes nearly impossible.
Lack of tools to detect AI media is a market failure—possibly because there’s no profit in it.

Pricing and cost challenges

Inference cost is a major constraint, especially for agentic workflows (recursive model calls, background tasks).
GPT-4 is powerful but expensive for agents; most users won’t pay for high-failure-rate autonomous workflows.
Replit uses usage-based pricing with bundles and overages to align cost with value.
Trend: usage-based pricing will grow as AI usage varies widely across users.
Recommendation: price based on value delivered, not cost-plus—project forward falling inference costs.
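
To make "bundles and overages" concrete, here is a rough sketch of how such a bill could be computed. The prices, unit counts, and function name are invented for illustration; the source does not give Replit's actual rates.

```python
from decimal import Decimal


def monthly_bill(
    units_used: int,
    bundle_price: Decimal,
    bundle_units: int,
    overage_rate: Decimal,
) -> Decimal:
    """Flat bundle price plus a per-unit charge for usage beyond the bundle."""
    if units_used < 0 or bundle_units < 0:
        raise ValueError("unit counts must be non-negative")
    overage_units = max(0, units_used - bundle_units)
    return bundle_price + overage_units * overage_rate


# Hypothetical plan: 20 per month including 1000 AI requests, 0.03 per extra request.
light_user = monthly_bill(100, Decimal("20"), 1000, Decimal("0.03"))
heavy_user = monthly_bill(1200, Decimal("20"), 1000, Decimal("0.03"))
```

With this hypothetical plan the light user pays only the bundle price (20), while the heavy user pays 26.00 because of 200 overage units, which is how the bill tracks widely varying usage.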

The future of agents

Agents (AI that acts autonomously on your behalf) are the next big leap, more transformative than multimodal.
Current agentic capabilities in LLMs are accidental—they weren’t designed for action, but they can chain thoughts and call functions.
True agents need reliable function calling, not just 90% success—failures in financial or legal contexts are catastrophic.
Milestone to watch: agents that can follow a bulleted list of actions without going off the rails.
Startups should build now with available tools (e.g., GPT-4), even if expensive, to learn and iterate.
Replit is researching how to train more effective agentic models.
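
A back-of-the-envelope sketch of why 90% per-call reliability is not enough for multi-step agents. It assumes each step succeeds or fails independently, which is a simplification made here and not a claim from the conversation.

```python
def plan_success_probability(per_step_success: float, steps: int) -> float:
    """Probability that every step in a plan succeeds, assuming independent steps."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


# A 10-step bulleted plan at 90% vs. 99% per-step reliability.
at_90 = round(plan_success_probability(0.90, 10), 3)  # 0.349
at_99 = round(plan_success_probability(0.99, 10), 3)  # 0.904
```

Under that independence assumption, a 10-step plan finishes cleanly only about a third of the time at 90% per step, which is consistent with the point that users won't pay for high-failure-rate autonomous workflows.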

Will Microsoft win the AI coding market?

Default assumption: Microsoft wins due to enterprise reach, sales force, GitHub integration, and Copilot.
But opportunities exist for specialized startups:
- Best-in-class test generation (e.g., Codium).
- Holistic dev environments with full context (Replit’s approach: AI over entire stack, repos, git history).
Pure code generation startups (e.g., Poolside) face a leapfrog risk: by the time they match GPT-4, GPT-5 arrives.
Code Llama’s release is promising: open source may catch up, but its “vibes” (user experience) still lag behind GPT-4.

Overhyped and underhyped in AI

Overhyped: chatbots—many things shouldn’t be chatbots.
Underhyped: using LLMs as components in backend systems, not just user-facing interfaces.

Key lessons from building AI at Replit

Latency matters enormously: 300ms vs. 2–3 seconds changes the entire UX—flow state depends on speed.
Failed experiment: inline contextual actions (like Cursor) initially flopped, but adoption grew after better UI prompting.
Biggest surprise: how much latency shapes product design.

Amjad’s hot takes

OpenAI is impressive due to Sam Altman’s ambition and ability to spin many plates (education, robotics, partnerships).
Perplexity is technically excellent—engineered its way ahead of competitors.
In 10 years, companies will shrink dramatically—10x fewer engineers needed due to AI.
The number of “software creators” will grow, but they won’t be called engineers—just as TikTok stars are called “creators,” not “movie stars.”

Final thoughts

Replit’s blog (blog.replit.com) and Amjad’s personal site (amjad.me) share technical insights.
Try Replit at replit.com to experience AI-native development firsthand.
The future belongs to those who treat AI as foundational—and run through walls to build it.
