Johnny Harris Reveals How He Writes YouTube Videos

How I Write 1h26 7 min #61

Summary

  • Johnny Harris is one of YouTube’s most successful visual storytellers, known for making complex, wonky topics—like macroeconomics, geopolitics, and the history of Doritos—into beautiful, viral videos. In this episode, he walks through his entire creative process, from idea to publication, revealing how he balances craftsmanship with marketability, writing with visuals, and intellectual rigor with emotional resonance. The central tension he explores is how to make something both artistically excellent and widely popular—and his answer is that genuine enthusiasm, visual thinking, and deep empathy for the viewer are the foundation of everything.

The four-month production cycle

  • Each video takes roughly four months from idea to publication, broken into distinct phases:
    • Story development: Johnny and his team maintain a running list of ideas. When it’s time to greenlight a video, he writes a “reporting brief”—a set of questions he wants answered—often done informally, like on an airplane.
    • Research phase (3–4 weeks): A dedicated researcher becomes an expert on the topic, producing a 60–80 page “info doc”—a digest of research, visuals, and key facts. Johnny then processes this, wrestles with it, and begins outlining the story.
    • Scripting week (3–4 days): This is the core of the process. Johnny spends three to four days in deep, uninterrupted writing blocks (9 a.m. to 2 p.m.), during which he writes the script and visually directs the video simultaneously.
    • Post-production: After scripting, the video goes to a production kickoff, then through a series of cuts—“shame cut” (radio edit), rough cut, fine cut, audio lock, picture lock—with 150+ notes at each stage. Thumbnails and sponsorships are developed in parallel.
    • The team produces 30 videos a year, each averaging 28 minutes, with premium animation and custom music.

Writing and visual direction happen simultaneously

  • Johnny has never been a “text writer”—he has always written for motion on a screen. His script is a two-column document:
    • Left column: The words he will say (voiceover or on-camera).
    • Right column: Visual direction—what the viewer will see, including animation references, color codes, and notes for the animator.
  • He calls this process “coding” rather than writing, because he uses keyboard shortcuts and color macros (e.g., pressing Alt+A turns text red and creates a checkbox) to tag visual assets that need to be collected, all without touching a mouse.
  • Every sentence is paired with a specific visual action or animation. The two are inseparable—“they must dance together.”

The philosophy of “classic style” writing

  • Johnny’s writing is deeply influenced by Steven Pinker’s “classic style” (from The Sense of Style), which emphasizes:
    • Active language: “Who did what to whom”—favoring agents and verbs over abstract nouns. Example: “They fled” instead of “This led to the migration of 6,000 Mormons.”
    • Visual, concrete language: Avoiding jargon and conceptual vagueness in favor of things the viewer can mentally see.
    • Plain, conversational tone: Writing as if speaking to a smart friend, not performing intelligence for peers.
  • He also draws inspiration from The Economist (for quippy, concrete details), Yuval Noah Harari (for making wonky anthropology accessible), and John Green (for finding beauty in mundane things).

Empathy for the viewer drives every decision

  • Johnny constantly asks: “Who is watching this, and what is happening in their minds as this information hits them?”
    • Example: When deciding whether to mention Joseph Smith’s 40 wives, he worried about alienating Mormon viewers but ultimately included the detail because most of the audience would find it fascinating and it would hook them.
    • He avoids language that makes viewers feel defensive or excluded. Instead of performing expertise, he invites the viewer in: “Let me show you what this map says” rather than lecturing from above.
    • He references Cleo Abram’s principle: “Most people overestimate how much context the audience has and underestimate their intelligence.” Give them context, but treat them as smart.

Every story is a promise

  • The thumbnail and title are a promise to the viewer—a question the video will answer. Example: “Why is Saudi Arabia building a $1 trillion city in the desert?”
    • This promise must be reinforced within the first minute of the video and fulfilled by the end.
    • But Johnny also believes the promise is an entry point—once someone clicks, you can take them into a much bigger world they didn’t know they were curious about (“folding in the vegetables”).
  • He typically writes seven or more title options per video, testing for marketability. Most start with “how” or “why” because the channel is in the business of answering questions.

Balancing scripted and unscripted storytelling

  • Johnny’s approach to field work has evolved:
    • Early career: He “over-engineered” scripts, trying to record all explanations and facts on camera while in the field (e.g., Hong Kong). This was a nightmare—it prevented him from being present and wrestling with the story.
    • Now: Field work is experiential and reactive—character interactions, moments, emotions. He writes the prose after returning, in the voiceover booth, where he can craft elegant explanations without the pressure of performing on location.
    • He’s intentionally swinging back toward letting the story come to him in the field, rather than shoehorning it into a pre-built scaffold.

The two voices: explainer and poetic

  • Johnny distinguishes between two modes of writing:
    • Explainer voice: Active, precise, “who did what to whom”—used for the bulk of the video to move the story forward.
    • Poetic/contemplative voice: Less precise, more emotional, used in conclusions and “soul moments” (non-evidence visual interludes) to zoom out, reflect, and acknowledge complexity.
      • Example: “The people who live among the undetonated bombs and the abandoned bones that have little hope of ever being recognized.”
      • This voice uses concrete, evocative imagery (bombs next to bones) and juxtaposition (a beautiful valley / a sugarcoated history) to hold tension and gray area.
  • Switching between these “thinky” and “feely” tones—supported by music and visuals—makes the presentation more engaging and memorable.

Surprise and fresh language

  • Johnny believes good writing must be visual, surprising, and naturally interesting.
    • He actively avoids predictable, “canon” language—the way everyone else has described a topic. Instead, he climbs “up the wall of the canal” to find fresh, unorthodox ways to describe the same thing.
    • Example: The Antarctica video describes mapping through the lens of “pure beauty and awe,” as if an alien were arriving and marveling at how humans achieved something impossible.
    • This is harder than falling back on familiar phrasing, but it’s what makes a topic that could be boring feel thrilling.

Art direction is bespoke for every video

  • The channel has no fixed branding package. Each video gets its own art direction, defined in an “art ingredients” page (a Google Doc tab) that specifies:
    • How archival photos and video will be treated (e.g., “mysterious cryptic symbols,” specific framing).
    • Camera setup (A-cam, B-cam), lighting vibe (“cozy, warm”).
    • Animation style (e.g., “Chernobyl fuzz”—a Soviet 1980s TV look inspired by the HBO show).
    • Color palette, typography, set design, and mood board references.
  • Johnny writes the vibe in words (e.g., “Moody, some scientific posters in the background, think brains, practicals, analog tech, Natural History Museum-like clutter”), and a visual producer translates it into a mood board.

The writing block is sacred

  • Johnny’s entire life is structured around writing blocks (9 a.m. to 2 p.m., 3–4 days a week). During these blocks:
    • No meetings, no Slack, no email.
    • He listens to music (often from his composer’s library or synthwave mixes) and enters a manic flow state in which he talks to himself and wrestles with the material.
    • He’s experimented with the schedule but found this window is non-negotiable—“it’s a little voodoo magic.”
  • The constraint of 3–4 days per script forces focus and prevents overthinking. The team used to make 48 videos a year on 3-day sprints but has scaled back to 30 to preserve quality and sanity.

Fact-checking is rigorous

  • Every assertion in every video is fact-checked and tagged with citations (academic articles, census data, etc.). Some claims have three or four independent sources.
  • The team publishes a source document for each video so viewers can scrutinize the research.
  • This process was learned “the hard way”—as production scaled, things started falling through the cracks.

Constraints sharpen creativity

  • Johnny credits the team’s constraints (tight timelines, limited resources) for sharpening their skills. “We can’t be precious, and so the constraints have added way more value than they’ve created frustration.”
  • The team is 25 people—mostly self-taught, scrappy editors, animators, and researchers. Johnny doesn’t hire from traditional TV; he hires people who are “up for anything.”
  • He sees himself as a bottleneck by design: if he’s not obsessed with a video, it won’t have magic. This puts a ceiling on output but ensures quality.

The frontier: inviting in more human stories

  • Johnny identifies his biggest growth area as inviting in characters and human stories rather than funneling everything through his own conceptual framework.
    • He’s historically emphasized understanding how things work over the people living through them, but he’s starting to crave more empathy-driven storytelling.
    • He wants to go into the field and ask: “What is it really like to be you?”—not just collect sound bites, but build stories around other people’s experiences.

Craftsmanship over optimization

  • The team plays the “YouTube game” (thumbnail A/B testing, title optimization, data analysis), but craftsmanship always comes first.
    • They’ve gone down data-driven rabbit holes and extracted a few lessons, but it’s never become the heart of the strategy.
    • The strategy is: spread high-quality curiosity. There’s an audience for authentic, handmade work driven by a creator’s genuine passion.
  • Johnny believes human enthusiasm is the antidote to AI: just as humans still want to watch humans play chess even though computers are better, humans will always want to hear another human tell stories. “That’s the oldest human ritual, and I don’t think it’s going to go away.”

Teaching the next generation

  • If he were teaching a semester-long curriculum, Johnny would emphasize:
    • Spending time in the trenches learning visual technical skills (animation, Photoshop, Illustrator)—because learning to express something visually was the gateway to his writing.
    • Finding what you want to understand deeply, cultivating that, and trying to communicate it to someone else.
    • Putting in the reps: the work will be terrible at first, but pushing through that is the only way to reach a product that matches your vision.
    • The meta-lesson: care deeply. Find the thing you care about as much as Johnny cares about his stories, and let that drive the work.