OpenAI: How AI is reshaping the craft of building software - The Pragmatic Summit — The Pragmatic Engineer

OpenAI’s internal engineering culture has shifted dramatically in the last 6 months, with AI coding tools evolving from autocomplete-like helpers into autonomous “teammates” that engineers run in parallel, sometimes for hours or overnight, while they attend meetings or even close their laptops. Tibo Sottiaux (Head of Engineering, Codex) and Vijaye Raji (CTO of Applications) describe how this is playing out inside OpenAI and what it signals for the broader industry.

Engineers on the Codex team routinely consume hundreds of billions of tokens per week, running multiple agents in parallel.
Codex Box (recently released internally) lets engineers reserve server-side dev boxes, fire off prompts, and let agents work autonomously — engineers can close their laptops, go to meetings, and return to completed work.
The team has moved through a rapid evolution: Codex as tool → extension → agent → teammate, and engineers are expected to start naming their agents and treating them as colleagues.
This is not uniform across OpenAI — the Codex team itself is the most advanced, but product engineering teams across the company are also heavily adopting AI tools.

The team reinvents its workflow almost weekly, constantly identifying and removing the next bottleneck (code generation → code review → understanding user needs → synthesizing feedback from Twitter/Reddit into strategy).
Engineers are starting to think in terms of a “compute envelope per employee” — a concept previously reserved for researchers training models.
Overnight self-testing runs: Codex can run autonomously for multiple hours, performing QA in a loop, flagging regressions, and even writing up findings in a PDF report for researchers to act on.
Meetings about Codex now include live Codex threads: during weekly analytics reviews or incident post-mortems, the team fires off Codex threads in the background to diagnose issues or answer data questions, then discusses the results by the end of the meeting.

Engineers now explore multiple implementations in parallel rather than debating trade-offs in a design doc and picking one.
Designers are shipping more code than engineers were 6 months ago, because model-generated code is good enough to merge as-is.
Command-line tools (e.g., ffmpeg) are being used through Codex as an interface — engineers describe what they want and Codex constructs and executes the command.
The bottleneck is expected to keep shifting: once coding is solved, code reviews become the bottleneck, then CI/CD and deployment — each requiring new tooling and practices.
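To make the ffmpeg point above concrete, here is a minimal sketch of the kind of translation an agent performs: a structured request goes in and an ffmpeg argument list comes out. The `ClipRequest` type, the `build_ffmpeg_command` helper, and the file names are hypothetical illustrations, not part of Codex. Only standard ffmpeg options (`-ss`, `-i`, `-t`, `-vf scale=...`, `-c:v`, `-c:a`) are used.

```python
import shlex
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClipRequest:
    """Hypothetical structured form of a request such as
    'cut 30 seconds starting at 1:30 and scale it to 1280 wide'."""
    source: str
    target: str
    start_seconds: float = 0
    duration_seconds: Optional[float] = None
    width: Optional[int] = None


def build_ffmpeg_command(req: ClipRequest) -> List[str]:
    """Build (but do not run) an ffmpeg argument list for the request."""
    cmd = ["ffmpeg"]
    if req.start_seconds > 0:
        # -ss before -i seeks in the input before decoding.
        cmd += ["-ss", str(req.start_seconds)]
    cmd += ["-i", req.source]
    if req.duration_seconds is not None:
        cmd += ["-t", str(req.duration_seconds)]
    if req.width is not None:
        # A height of -2 keeps the aspect ratio and rounds to an even number.
        cmd += ["-vf", f"scale={req.width}:-2"]
    cmd += ["-c:v", "libx264", "-c:a", "aac", req.target]
    return cmd


request = ClipRequest("talk.mov", "clip.mp4",
                      start_seconds=90, duration_seconds=30, width=1280)
command = build_ffmpeg_command(request)
# Print the command for review instead of executing it.
print(shlex.join(command))
```

Returning an argument list rather than a shell string lets a human or a guardrail inspect the command before anything passes it to `subprocess.run`.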

OpenAI is hiring new grads aggressively and running a large internship program (~100 interns this summer).
Raji believes the next generation of engineers will be “AI native” — fluent in these tools from day one and able to leverage them immediately.
Onboarding onto the Codex team is flat and peer-driven: new hires use Codex itself to navigate the codebase and get daily reports, and the most recent onboarders are responsible for bringing the next hires up to speed.
A new grad who joined the Codex team six months ago is described as “absolutely crushing it,” with the team lead noting the new hire’s energy and speed exceed his own.
Foundations still matter: codebase architecture, code review, and guardrails remain critical. New grads absorb strong foundations quickly when the environment is well-structured. The concern that AI-native engineers will skip foundational learning is acknowledged but countered with the argument that abstraction has always increased (assembly → C++ → JavaScript → AI-assisted coding) and strong fundamentals plus product intuition remain the durable skills.

The core durable skills are strong foundations, product intuition, and the ability to move up and down the stack to solve problems.
The gratification cycle is much shorter: engineers can build, test, verify, and iterate in a single sitting (even on a plane, with the laptop half-closed to keep the agent running).
Raji draws a historical parallel to the resistance against IntelliSense — “you’re not a developer if you use IntelliSense” — and argues these objections have always faded as abstractions rise.

As long as products are built for humans, human product managers and designers remain essential — product sense and design sense have no substitute.
PMs and designers are already writing code and building prototypes to validate ideas before involving engineers, making them significantly more productive.
PMs are also using Codex for non-coding tasks like building PowerPoint slides and analyzing feature backlogs.

OpenAI runs active Slack channels, hackathons, and “show and tell” demo days to diffuse novel AI workflows quickly across the organization.
The depth of demos has been increasing steadily — from surface-level “look what’s possible” to polished, corner-case-handled usable products.
A single PM on the Codex team used Codex to run an entire bug bash: collecting feedback, filing bug reports and feature tickets into Linear, and following up with engineers, effectively becoming a “50x program manager.”

Inside OpenAI, everyone has unlimited token access — a significant advantage not available to most companies.
Raji frames the cost conversation differently: as agents become capable teammates, the question shifts from “how many tokens does this cost?” to “how much would you pay a teammate who works 24/7?” — and at that framing, the ROI becomes clear.
For cost-constrained teams:
- Don’t limit inference prematurely — the best people at a company should get large, comfortable allocations.
- Think about cost displacement: tasks like marketing research or backlog analysis that previously required 15 engineers can now be done at almost no cost with AI.
- Hold AI providers responsible for making agents capable enough to justify the teammate framing.
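The teammate framing and the cost-displacement point can be turned into a back-of-the-envelope calculation. The sketch below is illustrative only: the token volume, the price per million tokens, and the hourly rate are hypothetical numbers chosen to make the arithmetic easy to follow, not figures from the talk.

```python
def weekly_token_cost(tokens_per_week: float, usd_per_million_tokens: float) -> float:
    """Cost of one week of inference at a flat per-token price."""
    return tokens_per_week / 1_000_000 * usd_per_million_tokens


def breakeven_hours(token_cost_usd: float, loaded_hourly_rate_usd: float) -> float:
    """Hours of human work an agent must displace per week to pay for itself."""
    return token_cost_usd / loaded_hourly_rate_usd


# All three inputs are hypothetical placeholders.
tokens = 2_000_000_000   # tokens consumed by one engineer's agents per week
price = 2.0              # USD per million tokens
rate = 100.0             # fully loaded hourly cost of an engineer, in USD

cost = weekly_token_cost(tokens, price)
hours = breakeven_hours(cost, rate)
print(f"Weekly token cost: ${cost:,.0f}")                   # Weekly token cost: $4,000
print(f"Break-even: {hours:.1f} displaced hours per week")  # Break-even: 40.0 displaced hours per week
```

With these placeholder inputs, the agents pay for themselves once they displace roughly one person-week of work per week. A team can substitute its own numbers to see where its break-even point sits.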

Speed is expected to improve by another order of magnitude within 6 months, which will again change how teams work.
Large networks of collaborating multi-agent systems are expected to take on massive goals, e.g., rebuilding a browser from scratch in 24 hours, producing millions of lines of code that no single human can fully understand.
Code will become increasingly abstracted away: engineers will set guardrails and verify correctness through proofs or input/output contracts rather than reading code.
A personal AI assistant layer will emerge that represents the work of dozens or hundreds of background agents, so engineers don’t have to monitor each one individually — Raji expects to see this within the year.
As software systems become more complex and layered, debugging by symptoms (rather than reading code) will become a key skill, and tooling will evolve to support this.
Raji reflects on 25 years in the industry (dot-com bubble, Y2K, mobile revolution, social networks) and says nothing compares to the speed and scale of what’s happening now.
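The prediction above about verifying code through input/output contracts rather than by reading it can be illustrated with a small sketch. It treats a sorting routine as a black box and checks only the contract: the output must be ordered and must contain exactly the input's items. All function names are hypothetical, and the randomized check is one assumed way to exercise a contract, not a method described in the talk. It does not cover the formal-proof option.

```python
import random
from collections import Counter
from typing import Callable, List


def satisfies_sort_contract(inp: List[int], out: List[int]) -> bool:
    """Contract: output is in non-decreasing order and is a permutation of the input."""
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    same_items = Counter(inp) == Counter(out)
    return ordered and same_items


def check_contract(candidate: Callable[[List[int]], List[int]],
                   trials: int = 200, seed: int = 0) -> bool:
    """Run the candidate on random inputs and test only the contract."""
    rng = random.Random(seed)
    for _ in range(trials):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        # Pass a copy so a candidate that mutates its argument cannot hide a bug.
        if not satisfies_sort_contract(data, candidate(list(data))):
            return False
    return True


def buggy_sort(items: List[int]) -> List[int]:
    """Stand-in for generated code with a subtle defect: it drops duplicates."""
    return sorted(set(items))


print(check_contract(sorted))                      # True
print(satisfies_sort_contract([2, 1, 2], [1, 2]))  # False: a duplicate was lost
print(check_contract(buggy_sort))
```

The reviewer never reads the candidate's source. Confidence comes from how well the contract captures intent and how thoroughly the inputs exercise it.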

Summary