David Reich – Bronze Age shock, the Neanderthal puzzle, & the sudden spread of farming — Dwarkesh Podcast

David Reich, a geneticist at Harvard specializing in ancient DNA, discusses a major new preprint showing that natural selection has been far more intense over the last 10,000 years than previously believed — especially during the Bronze Age (roughly 5,000–2,000 years ago). This challenges the long-held view that human evolution slowed or stabilized after the emergence of anatomically modern humans. The study was made possible by a massive increase in ancient DNA sample sizes (around 16,000 ancient individuals from Europe and the Middle East) and a new statistical method developed by Ali Akbari in Reich’s lab.

Why this study is different from previous work on natural selection

For decades, the mainstream view in human evolution was that natural selection had been largely quiescent over the last several hundred thousand years.
- The key evidence: when comparing Europeans and East Asians (who diverged ~40,000–50,000 years ago), almost no genetic variants are fixed in one group and absent from the other (a 100% frequency difference), suggesting there was not enough time or pressure for strong directional selection to fix any mutations.
- This led to the idea that humans reached some kind of genetic optimum long ago, with only minor tweaks since.
The problem was always sample size: detecting selection requires tracking how the frequency of a specific genetic variant changes over time, and each individual carries at most two copies of any given position, so a handful of ancient genomes yields far too few observations to estimate a frequency.
Reich’s lab has spent years industrializing ancient DNA extraction and sequencing, generating data from thousands of individuals at low cost. Combined with a new statistical method, this finally provides the resolution to detect selection signals.

The methodology: how they detected selection

The core innovation is a method that asks: does assuming a constant direction of natural selection at a given DNA position predict the observed genetic data better than just knowing how related all the individuals are to each other?
- The relatedness matrix captures everything that affects the whole genome — bottlenecks, drift, migrations, admixture.
- On top of that, they test whether adding a selection coefficient at a specific position improves the prediction.
- If the same mutation is consistently pushed in the same direction across many different populations and time periods (after accounting for migration), that’s evidence of selection.
They analyzed ~10 million variable positions in the DNA across ~22,000 people (16,000 ancient, 6,000 modern).
To validate their findings, they used an independent dataset: genome-wide association studies (GWAS) from the UK Biobank, which link specific genetic variants to measurable traits (blood pressure, height, disease risk, etc.).
- They found that as their selection statistic increased, the enrichment for variants that affect real traits increased dramatically — about five-fold enrichment at the highest confidence level.
- This confirms that the signals they’re detecting are biologically real, not statistical artifacts.
- Above a selection statistic of ~5, essentially all signals are real; at lower values, they can calibrate the probability that a given signal is genuine.
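The idea of testing whether a selection coefficient improves the fit can be illustrated with a much simpler stand-in. The sketch below is not the method from Reich's lab: it ignores the relatedness matrix entirely, assumes a single population sampled at a few time points, models the allele frequency as a deterministic logistic trajectory, and compares the best-fitting selection coefficient against no selection using a likelihood ratio. The function names and the toy counts are hypothetical.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def trajectory(p0, s, generations):
    """Allele frequency after `generations`, assuming the log-odds grow linearly at rate s."""
    return expit(logit(p0) + s * generations)

def log_likelihood(p0, s, samples):
    """Binomial log-likelihood of (generation, derived_count, total_count) samples.

    The binomial coefficient is omitted because it cancels in the likelihood ratio.
    """
    ll = 0.0
    for generation, derived, total in samples:
        p = min(max(trajectory(p0, s, generation), 1e-9), 1.0 - 1e-9)
        ll += derived * math.log(p) + (total - derived) * math.log(1.0 - p)
    return ll

def best_fit(samples, s_grid, p0_grid):
    """Grid search returning (log-likelihood, p0, s) for the best-fitting pair."""
    return max((log_likelihood(p0, s, samples), p0, s)
               for p0 in p0_grid for s in s_grid)

def selection_lrt(samples):
    """Likelihood-ratio statistic for 'constant s' versus 's = 0', plus the fitted s."""
    p0_grid = [i / 100 for i in range(1, 100)]
    s_grid = [i / 1000 for i in range(-50, 51)]  # includes s = 0
    ll_alt, _, s_hat = best_fit(samples, s_grid, p0_grid)
    ll_null, _, _ = best_fit(samples, [0.0], p0_grid)
    return 2.0 * (ll_alt - ll_null), s_hat

# Hypothetical counts: generation 0 is the oldest sample, 120 the most recent.
toy_samples = [(0, 10, 100), (40, 15, 100), (80, 22, 100), (120, 30, 100)]
statistic, s_hat = selection_lrt(toy_samples)
print(f"LRT statistic = {statistic:.2f}, fitted s = {s_hat:.3f}")
```

A statistic above 3.84 (the 5% critical value of a chi-square distribution with one degree of freedom) would favor the model with selection in this toy setting. In the real analysis, drift, migration, and admixture are absorbed by the relatedness matrix before selection is tested, which this sketch does not attempt.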

Key findings: the scale of selection

They identified at least 479 positions in the genome that are independently under selection with 99% confidence, and about 7,200 positions at 50% confidence (meaning roughly 3,600 of those are real).
This is a dramatic increase from earlier studies, which found at most a couple of dozen signals.
Natural selection is not quiescent — it’s “rampant” in the genome, tugging variants in one direction or another almost everywhere, even though adaptive selection accounts for only about 2% of total frequency change (the rest being migration, drift, and population structure).
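The step from 7,200 candidate positions to roughly 3,600 real ones is an expected-value calculation. A minimal sketch, assuming the stated confidence level can be read as the average probability that a candidate is genuine (the function name is hypothetical):

```python
def expected_real_signals(n_candidates, probability_real):
    """Expected number of genuine signals if each candidate is real with the given probability."""
    if not 0.0 <= probability_real <= 1.0:
        raise ValueError("probability_real must be between 0 and 1")
    return n_candidates * probability_real

print(expected_real_signals(7200, 0.50))        # 3600.0
print(round(expected_real_signals(479, 0.99)))  # 474
```

If the confidence level is instead a floor that every candidate exceeds, these counts are lower bounds rather than point estimates.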

What traits are under selection

There is a vast enrichment (~4–5x) for immune-related traits among the selected signals, and a strong enrichment for metabolic traits (obesity, type 2 diabetes risk, fat distribution).
Almost no detectable enrichment for behavioral or psychiatric traits — but this does NOT mean behavior wasn’t under selection.
- Behavioral traits are underpinned by very large numbers of genes with tiny individual effects, making them much harder to detect than immune traits, which often involve fewer genes with larger effects.
- Reich argues there is clear evidence of selection on behavioral traits too — it’s just harder to see with current statistical power.

The Bronze Age shock: why selection intensified 5,000 years ago

The most surprising finding is that the strongest selection signals are concentrated in the Bronze Age (roughly 5,000–2,000 years ago), not in the earlier Neolithic transition to farming (~10,000–12,000 years ago).
- This is counterintuitive: the cartoon picture is that the biggest transition was the invention of farming. But the genome is reacting much more strongly to events 5,000 years ago.
The Bronze Age brought dramatically higher population densities, more intensive animal husbandry, urbanization, and new disease environments — creating intense pressure to adapt.
Specific examples of Bronze Age selection intensification:
- TYK2 variant: A major risk factor for tuberculosis that increased in frequency up to ~9–10% by 3,000 years ago, then reversed and decreased — possibly because tuberculosis became endemic and the variant’s cost then outweighed its benefit against other diseases.
- FADS1/2: A variant involved in converting plant fatty acids to long-chain fatty acids, under especially strong selection 5,000–3,000 years ago — relevant as diets shifted toward cereals.
- Lactase persistence: Selection for the ability to digest milk as an adult intensified during this period, consistent with the shift to using livestock for milk and wool, not just meat.
- Hemochromatosis: The frequency trajectory of a variant causing this pathogenic iron-buildup condition reversed direction around this period.
- Skin depigmentation: Europeans got lighter-skinned over the last 10,000 years, but the strongest period of depigmentation was 4,000–2,000 years ago.
- Cognitive performance predictor: Genetic variants that predict performance on IQ tests and years of schooling show very strong selection between 5,000 and 2,000 years ago — but essentially no selection in the last 2,000 years, despite industrialization and increasing societal complexity.

The cognitive performance signal and what it might mean

The polygenic score for cognitive performance (measured via IQ test predictors and years of schooling in modern white British people) shows about a standard deviation of change over the last 10,000 years — a huge effect.
- European hunter-gatherers score about 3 standard deviations below the modern mean on this predictor; early farmers are at the mean; steppe pastoralists are lower. These big jumps are due to migration, not selection.
- But after correcting for migration, there is a consistent directional selection signal pushing toward higher predicted cognitive performance during the Bronze Age.
This signal replicates in an independent dataset: the same genetic variants that predict years of schooling in Chinese people in China show the same trajectory of selection in Europeans over the past 10,000 years — despite these populations being essentially disconnected. This makes it extremely unlikely to be a statistical artifact.
The trait being selected is probably not “IQ” or “years of schooling” per se (neither existed in the past). It appears to be some general underlying trait (possibly related to executive function, deferral of gratification, or a propensity to invest more resources in fewer offspring) that manifests differently in different environments.
- In Iceland over the last century, there has been selection against this same genetic predictor (correlated with having children earlier), suggesting the direction of selection can reverse depending on environmental conditions.
- The trait is genetically correlated with many seemingly disparate things: age at first childbirth, obesity, walking pace, household wealth.

Why wasn’t intelligence maxed out in the ancestral environment?

Reich and Dwarkesh discuss why a trait as seemingly universally useful as intelligence wasn’t driven to a maximum long before the Bronze Age.
- One possibility: what we measure as “intelligence” (IQ test performance) may not map well onto the cognitive demands of hunter-gatherer life, which required a different kind of knowledge (food processing, shelter-building, navigation, etc.).
- Another: there may be trade-offs. In some environments, having many children with less investment per child is favored; in others, having fewer children with more investment is favored. The genetic predictor of cognitive performance may be linked to this life-history trade-off.
- The value systems of past societies (as reflected in the Bible, Homer, etc.) did not particularly prize the kind of abstract cognitive ability measured by IQ tests — they valued strength, courage, beauty, religiosity.
Reich notes that if humans had been under stronger selection for intelligence, they could have been much smarter — intelligence has not been the dominant trait under selection, and there appears to be “room at the top.”

The thrifty gene hypothesis and body fat

There has been clear selection against genetic variants that promote obesity and type 2 diabetes over the last 10,000 years — about a standard deviation of change.
- This supports the “thrifty gene hypothesis”: in hunter-gatherers, storing fat was advantageous due to boom-and-bust food availability; in agricultural societies with more stable food supplies, this became less necessary.
- Europeans are relatively better protected against type 2 diabetes than populations with shorter histories of agriculture (like African Americans and Native Americans).
- This goes against the common story that hunter-gatherers had more stable, varied diets — the genetic data suggests agricultural societies, despite famines, were more stable in food access than the feast-or-famine cycle of hunting.

Evolution is limited by time, not population size

A key question: could the Bronze Age selection signals simply reflect larger population sizes making selection more effective?
- Reich argues no. Once populations reach roughly 1,000–10,000 individuals, even very weak selection coefficients (0.1%) are effective. The selection coefficients they’re detecting (~1% or more) are strong enough to work even in small populations.
- The limiting factor is time, not population size. The Bronze Age provides ~3,000 years (about 120 generations) — enough time for compound effects of selection to become visible.
- By contrast, the Bhatia et al. study of African Americans (only ~5–10 generations since the transatlantic slave trade) found no detectable selection signals, likely because there simply hasn’t been enough time.
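The argument that time is the limiting factor comes down to compounding. A minimal sketch, assuming a simple model in which the odds of the favored allele are multiplied by (1 + s) every generation, with a hypothetical starting frequency of 10%:

```python
def frequency_after(p0, s, generations):
    """Allele frequency after compounding a per-generation odds advantage of (1 + s)."""
    odds = p0 / (1.0 - p0)
    odds *= (1.0 + s) ** generations
    return odds / (1.0 + odds)

def scaled_selection(effective_size, s):
    """Population-scaled selection Ne * s; values well above 1 mean selection outweighs drift."""
    return effective_size * s

start = 0.10  # hypothetical starting frequency
for generations in (10, 120):
    end = frequency_after(start, 0.01, generations)
    print(f"s = 1%, {generations:>3} generations: {start:.2f} -> {end:.3f}")

for ne in (1_000, 10_000):
    print(f"Ne = {ne:>6}, s = 0.1%: Ne*s = {scaled_selection(ne, 0.001):.0f}")
```

Under these assumptions a 1% advantage moves the allele by about one percentage point in 10 generations but by roughly 17 points in 120 generations, which is why a few thousand years can show clear signals while a few centuries may show none. The Ne * s threshold is a standard rule of thumb rather than a sharp cutoff.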

No fixed differences between modern humans and 50,000 years ago

The Mallick 2016 paper found no fixed genetic differences between modern humans and humans from 50,000 years ago — despite the “cognitive revolution” (art, symbolic behavior, complex tools) occurring around this time.
- This suggests the cognitive revolution may have been primarily cultural, not driven by key genetic mutations sweeping through the population.
- However, polygenic shifts (many small changes across many genes) could still have occurred without producing any single fixed difference.
Reich notes that anatomically modern humans appear in the fossil record around 300,000 years ago, and there do begin to be fixed differences at that time depth.

Why no farming before the end of the Ice Age?

Genetically, humans 50,000+ years ago had all the cognitive and behavioral toolkit needed for farming — because descendants of that common ancestral population independently developed agriculture in multiple parts of the world (the Middle East, China, the Americas, New Guinea) after ~12,000 years ago.
- Yet no farming developed anywhere before the Holocene (the current warm, stable climatic period beginning ~12,000 years ago).
- Climate data shows the Holocene is anomalously stable on a scale of millions of years — less year-to-year, decade-to-decade, and century-to-century temperature variation than at almost any other time.
- Reich finds it “unbelievable” but apparently true that we live in a climatologically unique period, and this stability may have been a necessary precondition for agriculture.
- The fact that agriculture arose independently in very different environments (maize in the Americas, cereals in the Old World) at roughly the same time makes the climate explanation more compelling, though still surprising.

The Neanderthal puzzle

Reich is deeply puzzled by the relationship between Neanderthals, Denisovans, and modern humans.
- Genetically, Neanderthals and Denisovans are “sister groups” — they share a more recent common ancestor with each other (~500,000–600,000 years ago) than either does with modern humans (~700,000–800,000 years ago).
- But Neanderthals share many features with modern humans that Denisovans don’t appear to have: Levallois (Middle Stone Age) technology, similar mitochondrial DNA, and similar Y chromosomes.
- The Neanderthal mitochondrial DNA and Y chromosome are much more closely related to modern humans (~300,000–450,000 years ago) than the rest of their genome would suggest.
There was a known interbreeding event ~200,000–300,000 years ago in which modern humans contributed about 5% of DNA to Neanderthals. Reich is interested in the possibility that this 5% event was far more impactful than it appears.
- The fact that both the mitochondrial DNA AND the Y chromosome from this event rose to 100% frequency in Neanderthals is statistically improbable if it was truly only 5% — unless there was strong selection or social/cultural processes favoring them.
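The improbability can be made concrete with a standard population-genetics result: with no selection, the chance that a lineage eventually reaches 100% equals its starting frequency. The sketch below assumes, as a simplification, that the introgressed mitochondrial DNA and Y chromosome each started at the same ~5% as the genome-wide contribution and drifted independently; sex-biased mixing would change those starting values. The simulation parameters are hypothetical and serve only to check the rule.

```python
import random

def neutral_fixation_probability(initial_frequency):
    """Under pure drift, the fixation probability equals the starting frequency."""
    return initial_frequency

def joint_fixation_probability(freq_mt, freq_y):
    """Chance that two independently inherited neutral lineages both fix."""
    return (neutral_fixation_probability(freq_mt)
            * neutral_fixation_probability(freq_y))

def simulate_fixation_fraction(pop_size, initial_copies, replicates, seed=0):
    """Haploid Wright-Fisher drift: fraction of replicates in which the lineage fixes."""
    rng = random.Random(seed)
    fixed = 0
    for _replicate in range(replicates):
        copies = initial_copies
        while 0 < copies < pop_size:
            p = copies / pop_size
            # Each of the pop_size offspring picks a parent lineage at random.
            copies = sum(1 for _child in range(pop_size) if rng.random() < p)
        if copies == pop_size:
            fixed += 1
    return fixed / replicates

p_both = joint_fixation_probability(0.05, 0.05)
print(f"both fix by drift alone: {p_both:.4f} (about 1 in {round(1 / p_both)})")

# Sanity check of the single-lineage rule: 2 copies out of 40 is a 5% start.
print(f"simulated single-lineage fixation: {simulate_fixation_fraction(40, 2, 2000):.3f}")
```

A joint probability of about 1 in 400 under neutrality is what motivates the alternatives mentioned above: selection favoring the incoming lineages, or social structure that passed them on preferentially.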

Reich’s alternative model for Neanderthal origins

Reich proposes a radical alternative: what if Neanderthals were, in a cultural sense, modern humans?
- A population in the Caucasus or Northeast Africa invented Levallois technology ~300,000–400,000 years ago and expanded in multiple directions.
- Into Europe: they mixed with local archaic humans, were ~95% genetically replaced by local DNA, but retained their cultural toolkit (and possibly their mitochondrial DNA and Y chromosome through matrilineal or patrilineal social structure).
- Into Africa: the same population mixed with local archaic groups, but here they were only ~20% replaced — because the local African archaics were much more diverged (~1.5 million years vs. ~700,000 years), creating greater biological incompatibilities and barriers to gene flow.
- In this model, Neanderthals and modern humans are both products of the same cultural/genetic revolution ~300,000 years ago — making them “close cousins” in a meaningful sense, even though genome-wide they cluster with Denisovans.
Evidence supporting this:
- The Sima de los Huesos site in Spain (~400,000 years old) has nuclear DNA that looks Neanderthal-like but mitochondrial DNA that looks Denisovan-like — suggesting a population replacement event where the incoming group’s mitochondrial DNA displaced the local one.
- The timing lines up: anatomically modern humans, recognizable Neanderthals, and the Middle Stone Age revolution all appear around the same time (~300,000 years ago).
Reich compares the current standard model to Ptolemy’s epicycles — a convoluted model that has been patched repeatedly to accommodate contradictory data. His alternative is simpler and explains more, though he acknowledges it’s probably wrong in details.

Why the alternative model faces resistance

The main barrier is that the genetic and archaeological communities have never been integrated in this way — the work on African substructure (from modern DNA) and the work on archaic human relationships (from ancient DNA) have remained separate fields.
- When you put them together, the timing of substructuring events lines up remarkably well.
The “implausible” implication of Reich’s model is that cultural and genetic transformations in Africa and Eurasia at ~300,000 years ago are linked — that the same revolutionary event produced both Neanderthals (culturally modern, mostly archaic genetically) and the ancestors of all living modern humans.
- This is analogous to the heliocentric model requiring acceptance that the stars are unimaginably far away — the implication seems implausible, but may simply be true.

How ancient DNA technology has transformed the field

The number of published ancient human genomes has gone from ~10 in 2010 to over 20,000 today, an increase of more than three orders of magnitude.
- Key drivers: millionfold drop in sequencing cost since the late 2000s; in-solution enrichment techniques that target informative positions in the genome, making it economically viable to sequence samples where human DNA is <1% (the rest being microbial); roboticization and industrialization of the lab process.
- Reich’s lab alone now generates genome-scale data from more than 5,000 individuals per year.
This explosion in data has made it possible to ask entirely new questions — like tracking allele frequency changes over time with enough resolution to detect natural selection — that were simply impossible a decade ago.
