- Dylan Patel (SemiAnalysis) and Jon (Asianometry) break down how the semiconductor industry actually works, why it matters for AI scaling, and where the real bottlenecks and opportunities lie
- The semiconductor supply chain is extraordinarily stratified and specialized, with each layer dominated by only a handful of companies, and the knowledge required to operate at each layer is so deep that almost no single person understands the full stack
- This complexity is central to understanding AI scaling, export controls, China’s semiconductor ambitions, and where the industry is headed through 2028-2029
The semiconductor stack is almost impossibly complex
- Extreme stratification: Every layer of chip manufacturing, from chemicals to lithography tools to wafer fabrication to advanced packaging, has only 1-3 major competitors globally, with market shares typically split 70/25/5
  - This is not by accident: the capital requirements and specialization needed at each layer make it nearly impossible for one company to do everything
  - Vertically integrated companies that tried to do it all in-house fell behind those that accepted better external tools and products
- Knowledge is master-apprentice, not documented: Critical process knowledge (like etch chemistry or hydrofluoric acid recipes) is passed down through apprenticeship relationships, not published papers
  - In the US, the apprentice pipeline dried up and the knowledge migrated to Taiwan, where it is still transmitted through institutions like National Tsing Hua University
  - Most semiconductor equipment still runs on Windows XP; chip design tools run on CentOS 6; the tech stack is ancient but hyper-optimized and too risky to change
- Nobody knows the whole stack: Even top experts at one layer (say, lithography) have only surface-level understanding of adjacent layers
  - The industry coordinates through conferences, gossip, and a shared sense of direction that emerges organically, not through centralized planning
  - Gordon Moore’s original observation became a self-fulfilling prophecy: tens of millions of engineers across the world aligned around hitting the next shrink target
- The search space is almost infinite: A leading-edge chip like NVIDIA’s Blackwell has ~200 billion transistors connected across 15 metal layers, making the design search space larger than any other problem humans attempt
  - AI tools (RL-based chip layout, inverse lithography) are beginning to find 5-15% improvements in small parts of the design, but the data is siloed within companies, limiting how much AI can help in manufacturing
How chips actually get better: process nodes and recipes
- A process node is a recipe, not just a shrink: Every TSMC recipe is the culmination of years of research, involving thousands of sequential processing steps with knobs (pressure, temperature, chemical composition) that must be tuned across every tool
  - Improving yield is a multivariable optimization problem: per-step yields multiply, so even at 99% yield per step, 10,000+ steps would leave a final yield of effectively zero; each step has to run far closer to 100%
  - China’s SMIC gets bad yields at 7nm partly because they are forced to use older DUV lithography instead of EUV, and partly because the multiplicative effect of many small imperfections compounds
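
The compounding in the yield bullet above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes every step succeeds or fails independently with the same probability, which real fabs do not guarantee; the function names are illustrative and not from the conversation.

```python
def final_yield(per_step_yield: float, steps: int) -> float:
    """Fraction of wafers surviving every step, assuming independent steps."""
    return per_step_yield ** steps


def required_step_yield(target_final_yield: float, steps: int) -> float:
    """Per-step yield needed to reach a target final yield over `steps` steps."""
    return target_final_yield ** (1.0 / steps)


if __name__ == "__main__":
    steps = 10_000  # step count quoted in the notes
    print(f"99% per step    -> final yield {final_yield(0.99, steps):.2e}")
    print(f"99.99% per step -> final yield {final_yield(0.9999, steps):.3f}")
    # 80% is an arbitrary illustrative target, not a figure from the source.
    print(f"per-step yield for 80% final: {required_step_yield(0.80, steps):.6f}")
```

At 99% per step the result is on the order of 1e-44, which is why every individual step has to be tuned to many nines before a node is economically usable.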
- Moore’s Law is slowing and getting expensive: Moving from 90nm to smaller nodes used to double density; now moving from 5nm to 3nm gives only ~20% power savings per transistor
  - SRAM doesn’t scale at all anymore; logic scales but only ~30% per node
  - The real benefit for AI comes from data locality: fitting more computation on-chip means less expensive off-chip data movement
  - N2 (2nm) is economically questionable without AI demand subsidizing it; Apple alone can no longer justify the cost of leading-edge nodes for mobile
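
A rough sketch of why stalled SRAM scaling blunts the benefit of a new node. It assumes a hypothetical die that is half logic and half SRAM by area, and it reads "~30% per node" as logic area shrinking to 0.7x per node; both are assumptions made for illustration, not figures from the conversation.

```python
def scaled_die(logic_area: float, sram_area: float, nodes: int,
               logic_area_factor: float = 0.7,
               sram_area_factor: float = 1.0) -> tuple[float, float]:
    """Return (total area, SRAM share of area) after `nodes` node transitions.

    logic_area_factor=0.7 is an assumed reading of "~30% per node";
    sram_area_factor=1.0 encodes "SRAM doesn't scale at all".
    """
    logic = logic_area * logic_area_factor ** nodes
    sram = sram_area * sram_area_factor ** nodes
    total = logic + sram
    return total, sram / total


if __name__ == "__main__":
    # Hypothetical starting die: 50% logic, 50% SRAM (normalized area 1.0).
    for n in range(4):
        total, sram_share = scaled_die(0.5, 0.5, n)
        print(f"after {n} node(s): area {total:.3f}, SRAM share {sram_share:.1%}")
```

Under these assumptions one node transition shrinks the whole die by only 15% (area 0.85), and SRAM takes up a growing share of the silicon with every further node.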
- Liang Mong Song’s story illustrates how talent moves the industry: A TSMC genius with ~285 patents, he lost an internal power struggle, went to Samsung, helped them leapfrog TSMC to the leading edge and win Apple business, then was sued by TSMC and moved to SMIC, where he rapidly advanced their process
  - TSMC responded with the “Nightingale Army” (24/7 R&D shifts for 1-2 years, called “burning your liver” in Taiwan) to finish FinFET and retake the lead
  - This pattern of talent flowing from Taiwan to China has been a major accelerant for China’s semiconductor industry
China’s semiconductor position: constrained but dangerous
- Export controls are partially working but have a logical flaw: The US restricts China from buying advanced chips (NVIDIA H20, etc.) and restricts equipment sales, but the restrictions are inverted
  - China can currently build chips domestically that are better than what the US allows NVIDIA/AMD to sell them, which is backwards if the goal is to keep China behind
  - The controls have been incremental (removing one jigsaw puzzle piece at a time) rather than a full ban, allowing China to develop domestic alternatives for each restricted piece
- China’s domestic chip capacity is larger than people think: SMIC has 45-50 high-end immersion lithography tools in Shanghai, giving roughly 25,000-35,000 wafers/month of 7nm capacity
  - With 50-80 good dies per wafer (at poor yields), this works out to millions of chips (roughly 1.25-2.8 million good dies per month)
  - They are building both 7nm and 5nm capacity in Beijing while telling the US and ASML it’s “for 28nm”
  - Huawei’s Ascend 910B (~400 teraflops) has ~600,000 units produced; if centralized into one cluster, it would be larger than any single US lab’s cluster
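
The capacity figures above reduce to simple multiplication. The sketch below only restates the quoted ranges; the function names are made up for illustration, and the aggregate number is peak throughput with no allowance for networking or utilization losses.

```python
def good_dies_per_month(wafers_per_month: int, good_dies_per_wafer: int) -> int:
    """Good dies per month = wafers per month x good dies per wafer."""
    return wafers_per_month * good_dies_per_wafer


def aggregate_peak_flops(units: int, flops_per_unit: float) -> float:
    """Peak FLOP/s if every unit sat in one cluster with zero scaling loss."""
    return units * flops_per_unit


# Ranges quoted in the notes: 25,000-35,000 wafers/month, 50-80 good dies/wafer.
low = good_dies_per_month(25_000, 50)
high = good_dies_per_month(35_000, 80)

# ~600,000 Ascend 910B units at ~400 teraflops each (precision not stated).
ascend_peak = aggregate_peak_flops(600_000, 400e12)

if __name__ == "__main__":
    print(f"good dies per month: {low:,} to {high:,}")
    print(f"aggregate Ascend 910B peak: {ascend_peak:.2e} FLOP/s")
```

That gives 1,250,000 to 2,800,000 good dies per month and about 2.4e20 peak FLOP/s across the quoted Ascend 910B production.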
- China’s real advantage is power and construction speed: China adds as much power capacity as half of Europe every year; they could build a 10-gigawatt data center near the Three Gorges Dam (which already had ~10 GW of Bitcoin mining) in months
  - The US adds very little power annually; Europe’s power capacity is shrinking; China has no problem with power density
  - A gigawatt data center would be easy to hide among China’s existing industrial power consumption (aluminum mills, rare earth refining, etc.)
- If China centralizes compute, they could match US training runs by 2026-2027: They receive over a million H20 GPUs legally per year, plus domestic Ascend chips
  - No Chinese company has yet built a 100,000+ GPU cluster, but if Xi Jinping directed all chips to one site, they could have a larger single training run than any individual US company
  - The tradeoff: centralization risks killing the innovation benefits of decentralized experimentation (DeepSeek’s success came without government direction)
Huawei: the most cracked company in China
- Huawei out-competes Western firms with both hands tied behind its back: Cut off from TSMC by sanctions, banned from Western markets, and using process nodes 3-4 years behind, yet their latest phone performs within a year of Qualcomm’s best
  - They make telecom equipment, phones, modems, accelerators, networking gear, video surveillance chips, and are now entering cars
  - Their culture is described as “struggle” (a Communist Party concept) combined with Andy Grove’s “only the paranoid survive” mentality
  - Employees are told their work is existential for the country; this drives a 996 work culture (9 a.m. to 9 p.m., six days a week) and extreme motivation
- Espionage is real but not the full explanation: ASML has been hacked multiple times; Cisco code appeared in early Huawei routers; people have been sued for taking documents to China
  - But the Ascend 910B architecture looks nothing like a GPU or TPU; it is independently designed
  - China is genuinely good at engineering, not just copying; the combination of espionage, talent, and genuine capability makes them formidable
AI scaling: the numbers through 2028-2029
- Cluster sizes are growing 3-7x per year: 2025’s largest cluster is ~100,000 GPUs (xAI Memphis); 2026 will see 300,000-700,000 GPU equivalent clusters (multi-site, connected by fiber)
  - Microsoft has signed $10+ billion in fiber deals to connect five regions of data centers together
  - By 2026, a single gigawatt site will exist; total multi-site capacity will be 2-3+ gigawatts
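
The 300,000-700,000 range follows directly from applying the 3-7x growth factor to a ~100,000-GPU baseline. A minimal sketch, with a hypothetical helper name; extending it beyond one year is an extrapolation the notes do not make.

```python
def project_cluster_range(base_gpus: int, years_ahead: int,
                          growth_low: int = 3,
                          growth_high: int = 7) -> tuple[int, int]:
    """Low/high cluster size after compounding the yearly growth factors."""
    low = base_gpus * growth_low ** years_ahead
    high = base_gpus * growth_high ** years_ahead
    return low, high


if __name__ == "__main__":
    # Baseline from the notes: ~100,000 GPUs in the largest cluster.
    low, high = project_cluster_range(100_000, years_ahead=1)
    print(f"one year out: {low:,} to {high:,} GPU equivalents")
```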
- 1e30 total flops delivered to a model is possible by 2028-2029: This is ~100,000x more than GPT-4, but the counting method matters
  - Pre-training flops alone won’t be 1e30; the total includes synthetic data generation, post-training, RL, inference-time compute, and verification
  - This would require multi-hundred-billion-dollar clusters, but prior-generation clusters can be repurposed for data generation and verification
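
To make the 1e30 figure concrete, the sketch below backs out the GPT-4 baseline implied by "~100,000x" and estimates how long a fleet would take to deliver a given FLOP total. The fleet size, per-accelerator peak, and utilization in the example are placeholders chosen for illustration, not forecasts or numbers from the conversation.

```python
SECONDS_PER_DAY = 86_400


def implied_baseline_flops(target_flops: float, multiple: float) -> float:
    """Baseline compute implied by 'target is `multiple` times the baseline'."""
    return target_flops / multiple


def days_to_deliver(total_flops: float, n_accelerators: int,
                    peak_flops_per_accelerator: float,
                    utilization: float) -> float:
    """Days needed to deliver `total_flops` at a sustained fleet throughput."""
    sustained = n_accelerators * peak_flops_per_accelerator * utilization
    return total_flops / sustained / SECONDS_PER_DAY


if __name__ == "__main__":
    baseline = implied_baseline_flops(1e30, 100_000)
    print(f"implied GPT-4-scale baseline: {baseline:.1e} FLOPs")

    # Placeholder fleet: 5M accelerator equivalents, 1e16 peak FLOP/s each,
    # 40% utilization. All three values are assumptions.
    days = days_to_deliver(1e30, 5_000_000, 1e16, 0.40)
    print(f"days to deliver 1e30 FLOPs on the placeholder fleet: {days:,.0f}")
```

The implied baseline is 1e25 FLOPs. Because the total counts data generation, RL, and verification as well as pre-training, the fleet in `days_to_deliver` can span several clusters and hardware generations.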
- Chips are not the near-term bottleneck; data centers and power are: Today, ~4-5% of NVIDIA’s Hopper production goes to the largest cluster; there are ~6 million Hoppers manufactured
  - The constraint is concentration: building one site with enough power, substations, transformers, and cooling
  - Power is sub-15% of total cost of ownership for GPUs; the servers themselves are ~75-80% of cost
  - The US industrial base for transformers, substations, and power infrastructure has atrophied (flat power demand for years) but can ramp in 6-18 months when given demand signals
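
The cost shares above imply that electricity prices move total cost of ownership far less than server prices do. A small sensitivity sketch, assuming all other cost components stay fixed; the price changes in the example are hypothetical.

```python
def tco_change(cost_share: float, price_multiplier: float) -> float:
    """Fractional change in TCO when one component's price is multiplied.

    Assumes every other component of TCO is unchanged.
    """
    return cost_share * (price_multiplier - 1.0)


if __name__ == "__main__":
    # Shares quoted in the notes: power < 15% of TCO, servers ~75-80%.
    print(f"power price doubles (15% share): TCO {tco_change(0.15, 2.0):+.1%}")
    print(f"servers +10% (75% share):        TCO {tco_change(0.75, 1.1):+.1%}")
    print(f"servers +10% (80% share):        TCO {tco_change(0.80, 1.1):+.1%}")
```

Even a doubling of the power price raises TCO by at most 15% under these shares, which is consistent with the point that the power problem is about getting enough of it at one site, not about its cost.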
- TSMC’s leading-edge capacity will shift dramatically toward AI: Apple has been ~25% of TSMC’s business and >50% of the newest node; this paradigm is ending
  - By 2028, AI could be 60-80% of TSMC’s leading-edge capacity (N2, A16, A14 nodes)
  - Apple doesn’t need the capacity they’re being allocated; AI demand is what’s making new nodes economically viable
The investment picture: Pascal’s wager
- Tech CEOs are making a Pascal’s wager on AI: Satya Nadella, Sundar Pichai, Mark Zuckerberg, Sam Altman, and Dario Amodei have all said or acted on the principle that the risk of under-investing in AI is worse than the risk of over-investing
  - If AI is real and you don’t invest, you lose everything; if AI is not real and you do invest, you lose money but survive
  - This logic justifies massive capital expenditure even before revenue catches up
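
One way to formalize the wager is a minimax-regret table: pick the action whose worst-case regret is smallest. The payoff numbers below are invented, unitless placeholders that only encode the ordering described above (losing everything is worse than losing money); they are not estimates from the conversation.

```python
# Hypothetical payoffs, keyed by (action, state of the world).
PAYOFFS = {
    ("invest", "ai_real"): 10,
    ("invest", "ai_not_real"): -3,   # lose money but survive
    ("hold", "ai_real"): -10,        # lose everything
    ("hold", "ai_not_real"): 0,
}


def max_regret(action: str, payoffs: dict) -> float:
    """Worst-case gap between the best achievable payoff and this action's."""
    actions = {a for (a, _) in payoffs}
    states = {s for (_, s) in payoffs}
    worst = 0
    for state in states:
        best = max(payoffs[(a, state)] for a in actions)
        worst = max(worst, best - payoffs[(action, state)])
    return worst


def minimax_regret_choice(payoffs: dict) -> str:
    """Action with the smallest worst-case regret."""
    actions = sorted({a for (a, _) in payoffs})
    return min(actions, key=lambda a: max_regret(a, payoffs))


if __name__ == "__main__":
    for action in ("invest", "hold"):
        print(f"{action}: max regret {max_regret(action, PAYOFFS)}")
    print("minimax-regret choice:", minimax_regret_choice(PAYOFFS))
```

With these placeholder payoffs, investing has a maximum regret of 3 and holding has a maximum regret of 20, so the rule picks investing. The conclusion depends entirely on the assumed asymmetry between the two downside cases.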
- The money is flowing: Private capital investment in AI is ~$55-60 billion in 2024, far below the dot-com bubble’s ~$150 billion/year
  - OpenAI is reportedly raising $50-100 billion; xAI can raise $30+ billion; Anthropic has barely diluted
  - Microsoft is taking on enormous credit risk ($50-80B direct CapEx plus $20B+ through partners like CoreWeave and Oracle) because they believe in the OpenAI partnership
  - Sovereign wealth funds (UAE, Saudi Arabia, Canada pension fund) are also investing heavily
- Revenue lags investment by years: GPT-4 cost ~$500M to train and generated billions in revenue; the next cycle requires $10B+ training runs before revenue materializes
  - The bet is that GPT-5 (or equivalent) will be impressive enough to justify the next round of fundraising
  - If GPT-5 disappoints, the bubble could deflate; if it delivers, the capital will keep flowing
- Historical bubble comparison: Each major tech bubble (PC, semiconductor, dot-com) was larger than the last; there’s no reason this one can’t be bigger
  - The dot-com bubble laid fiber infrastructure that enabled the modern web; an AI bubble would lay compute infrastructure that enables whatever comes next
  - Unlike the 90s bubble, today’s investments are being made by the most profitable companies in human history, not debt-financed startups
How Dylan and Jon built their businesses
- Asianometry (Jon): Started in 2017 as a tourist video channel for his mom while working a full-time job in cameras and running a textile business; labored for 3 years with ~200 views per video before gaining traction
  - Now produces 2 videos per week on Asian business history, semiconductors, and geopolitics, while having only recently given up the textile business
  - Research method: Google countries and industries, compare what they export now vs. historically, find interesting stories; maintains a long list of vague ideas (e.g., “Japanese whiskey”) that develop over time
- SemiAnalysis (Dylan Patel): Started as an obsessive hobbyist who learned hardware by fixing his Xbox 360 at age 8, then spent years on forums, invested in semiconductor stocks from age 18, and built technical knowledge through engineering textbooks and conferences
  - Worked a data science job for 3 years after college while posting anonymously online; quiet-quit in 2020 and became a digital nomad
  - Started consulting from his online persona in 2020 and raised prices arbitrarily, which worked; launched a paid newsletter in late 2021 that gained 40 paid subscriptions overnight from one post about photoresist
  - Now attends ~40 technical conferences per year across every layer of the semiconductor stack; has built a 14-person company with ex-ASML, ex-Microsoft, and ex-hedge fund employees across the US, Japan, Taiwan, Singapore, and France
  - Tracks every server manufacturer, component manufacturer, cable manufacturer, and tool manufacturer, projecting where every data center is being built and at what pace; customers include virtually every hyperscaler, major semiconductor company, and major investor
Where the opportunities are
- Memory is the most underinvested critical technology: Memory scaling stopped following Moore’s Law around 2012; gains have been incremental since then
  - HBM exists because of DRAM limitations; a breakthrough in memory technology would change everything for AI
  - Challenge: memory is a commodity industry that doesn’t allow custom devices, and requires absurd manufacturing scale
- Every layer of the stack has room for innovation: The industry is so far from Pareto optimal that even single-digit percentage improvements at any layer are valuable
  - AI tools can accelerate experimentation and optimization across the 100+ abstraction layers
  - Entrepreneurship advice: find what you’re passionate about (even copper wires or B2B SaaS), work extremely hard, use AI to amplify your efficiency, and you’ll find opportunity
- Architecture offers 100x gains even without process improvements: The vast majority of power on an H100 goes to data movement (networking and memory), not compute
  - More efficient ALU designs, better data locality, improved memory technologies, and chip-to-chip networking could dramatically improve performance per watt
  - Different hardware constraints lead to different optimal model architectures: Chinese models on memory-bandwidth-limited H20s will diverge from American models on GPUs, and both will diverge from Google’s TPU-optimized models
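
The claim that data movement dominates power has an Amdahl's-law-style consequence: improving compute efficiency alone yields little, while large gains require attacking data movement too. The sketch below uses a hypothetical 80% data-movement share, since the notes say only "the vast majority"; the function name and improvement factors are also illustrative.

```python
def perf_per_watt_gain(movement_fraction: float,
                       compute_improvement: float = 1.0,
                       movement_improvement: float = 1.0) -> float:
    """Overall energy-efficiency gain given per-component improvements.

    movement_fraction: share of baseline energy spent moving data (0 to 1).
    Each improvement factor divides that component's energy.
    """
    compute_fraction = 1.0 - movement_fraction
    new_energy = (compute_fraction / compute_improvement
                  + movement_fraction / movement_improvement)
    return 1.0 / new_energy


if __name__ == "__main__":
    share = 0.8  # assumed data-movement share of power, not a sourced figure
    print(f"10x better compute only:  {perf_per_watt_gain(share, 10, 1):.2f}x")
    print(f"10x better movement only: {perf_per_watt_gain(share, 1, 10):.2f}x")
    print(f"10x better both:          {perf_per_watt_gain(share, 10, 10):.2f}x")
```

Under the assumed 80% share, a 10x improvement in compute alone gives about 1.2x overall, the same improvement in data movement alone gives about 3.6x, and only improving both delivers the full 10x.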