· 10 min read · MDisBetter

How to Get Subtitles from Any YouTube Video (Free Methods)

SRT and VTT subtitles are useful for accessibility overlays, video editors, multi-language workflows, and offline viewing. They are also notoriously fiddly to extract from YouTube: the official path is hidden, third-party tools come and go, and even when you get the file, the subtitle format may not be what you actually need. Here are the free methods that work in 2026, plus an honest discussion of when what you want is not subtitles but a real transcript.

Subtitles vs. transcript — pick the right thing

Before downloading anything, decide which artifact you actually need.

Subtitles are for displaying caption overlays. Transcripts are for reading, searching, AI input, and any work that treats spoken content as text. If you are downloading "subtitles" because you want to read what was said, you almost certainly want a transcript instead.
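To make the distinction concrete, here is a minimal sketch of what separates the two artifacts: an SRT file is cue numbers, timing lines, and short text fragments, and the bare text is what remains once the first two are stripped out. The function name `srt_to_plain_text` and the sample cues are hypothetical, and the code assumes a well-formed SRT with blank lines between cues.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def srt_to_plain_text(srt_text: str) -> str:
    """Drop cue numbers and timing lines, then join the caption text."""
    kept = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        # Skip the numeric cue index if present.
        if lines and lines[0].isdigit():
            lines = lines[1:]
        # Skip the timing line.
        if lines and TIMING.match(lines[0]):
            lines = lines[1:]
        kept.extend(lines)
    return " ".join(kept)


sample = """1
00:00:00,000 --> 00:00:02,000
welcome back to

2
00:00:02,000 --> 00:00:04,500
the channel
"""

print(srt_to_plain_text(sample))  # welcome back to the channel
```

The result is readable, but it is still a run of caption fragments with no paragraphs, headings, or speaker structure, which is the gap a real transcript fills.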

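When you genuinely want SRT/VTT

Accessibility overlays, video editor input, multi-language caption workflows, and offline viewing with captions.

When you actually want a transcript

Reading, searching, AI input, study notes, and content repurposing.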

The methods below cover both — start with what you need.

Method 1: YouTube built-in subtitle download

The official path is convoluted and not exposed in the standard viewer interface. The fastest route through it:

  1. Open the video on desktop YouTube.
  2. Click Show transcript from the three-dot menu under the video (in some layouts the button sits in the expanded description instead).
  3. The transcript panel does not give you a direct SRT download. To get the file format, you need a workaround.
  4. If you uploaded the video, YouTube Studio → Subtitles → click the language → Download SRT or VTT. (This only works for videos you own.)
  5. For videos you do not own, use the URL trick: open https://video.google.com/timedtext?lang=en&v=VIDEO_ID in a browser to fetch the raw caption XML, then convert to SRT with a small script.
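The "small script" in step 5 could look roughly like the sketch below. It assumes the endpoint returns the legacy caption XML shape, a root element containing `<text start="..." dur="...">` children with times in seconds; that shape, the function names, and the sample data are assumptions for illustration, so check them against what your request actually returns.

```python
import html
import xml.etree.ElementTree as ET


def seconds_to_srt_time(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def timedtext_xml_to_srt(xml_text: str) -> str:
    """Convert <text start=... dur=...> elements into numbered SRT cues."""
    root = ET.fromstring(xml_text)
    cues = []
    for index, node in enumerate(root.iter("text"), start=1):
        start = float(node.attrib["start"])
        duration = float(node.attrib.get("dur", "0"))
        # Assumption: caption text may be entity-escaped a second time inside the XML.
        text = html.unescape(node.text or "").strip()
        cues.append(
            f"{index}\n"
            f"{seconds_to_srt_time(start)} --> {seconds_to_srt_time(start + duration)}\n"
            f"{text}\n"
        )
    return "\n".join(cues)


# Hypothetical sample in the assumed format.
sample_xml = (
    "<transcript>"
    '<text start="0.5" dur="2.0">Hello &amp;amp; welcome</text>'
    '<text start="2.5" dur="1.25">to the video</text>'
    "</transcript>"
)
print(timedtext_xml_to_srt(sample_xml))
```

Save the printed output with a `.srt` extension and it should load in most players. If the response is empty or in a different format, this method has likely stopped working for that video.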

This path is increasingly fragile: YouTube has been deprecating the public timedtext endpoints, and in 2026 the timedtext URL works inconsistently across regions and video types. Most people skip this method and use yt-dlp instead.

Method 2: yt-dlp (the reliable open-source option)

yt-dlp is the actively maintained successor to youtube-dl and the most reliable way to download YouTube subtitles in any format. It is free, open source, and runs locally.

Install

# Python
pip install -U yt-dlp

# Or via package manager
brew install yt-dlp        # macOS
sudo apt install yt-dlp    # Debian/Ubuntu
winget install yt-dlp      # Windows

Download manual subtitles (if creator-uploaded)

# List available subtitle languages
yt-dlp --list-subs "https://www.youtube.com/watch?v=VIDEO_ID"

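# Download English subtitles and convert them to SRT
# (YouTube serves VTT and similar formats, so SRT comes from --convert-subs, which needs FFmpeg)
yt-dlp --write-subs --sub-langs en --skip-download \
  --convert-subs srt \
  "https://www.youtube.com/watch?v=VIDEO_ID"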

Download auto-generated subtitles

# Auto-generated captions only, converted to SRT (needs FFmpeg)
yt-dlp --write-auto-subs --sub-langs en --skip-download \
  --convert-subs srt \
  "https://www.youtube.com/watch?v=VIDEO_ID"

Download both manual and auto, all languages

# "all" combined with auto captions also pulls auto-translated tracks, so expect many files
yt-dlp --write-subs --write-auto-subs --sub-langs "all" \
  --skip-download --sub-format "srt/vtt" \
  "https://www.youtube.com/watch?v=VIDEO_ID"

Convert VTT to SRT after download

If yt-dlp gives you VTT and you want SRT, FFmpeg handles it:

ffmpeg -i input.vtt output.srt
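On a machine without FFmpeg, the same conversion can be approximated in a few lines of Python. This is a minimal sketch, not a full WebVTT parser: it keeps cue text and timing, drops the header, NOTE/STYLE blocks, cue settings, and inline tags, and does not attempt to de-duplicate the repeated rolling lines that auto-generated tracks often contain. The function names and sample data are hypothetical.

```python
import re

# WebVTT timestamps are [HH:]MM:SS.mmm; the hours part is optional.
TIMESTAMP = re.compile(r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})")
TAG = re.compile(r"<[^>]+>")


def vtt_time_to_srt(stamp: str) -> str:
    """Turn a WebVTT timestamp into HH:MM:SS,mmm."""
    hours, minutes, seconds, millis = TIMESTAMP.fullmatch(stamp).groups()
    return f"{int(hours or 0):02d}:{minutes}:{seconds},{millis}"


def vtt_to_srt(vtt_text: str) -> str:
    """Convert WebVTT cues to numbered SRT cues."""
    cues = []
    for block in re.split(r"\n\s*\n", vtt_text.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        # Blocks without a timing line (WEBVTT header, NOTE, STYLE) are skipped.
        timing_index = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if timing_index is None:
            continue
        start, _, rest = lines[timing_index].partition("-->")
        end = rest.split()[0]  # anything after the end time is a cue setting
        text = [TAG.sub("", line).strip() for line in lines[timing_index + 1:]]
        text = [line for line in text if line]
        if not text:
            continue
        cues.append(
            f"{len(cues) + 1}\n"
            f"{vtt_time_to_srt(start.strip())} --> {vtt_time_to_srt(end)}\n"
            + "\n".join(text)
            + "\n"
        )
    return "\n".join(cues)


sample_vtt = """WEBVTT
Kind: captions
Language: en

00:00:01.000 --> 00:00:03.500 align:start position:0%
hello <c>and welcome</c>

00:03.500 --> 00:05.000
to the video
"""
print(vtt_to_srt(sample_vtt))
```

For anything beyond plain cues, FFmpeg remains the safer choice.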

Pros

Cons

Method 3: Third-party subtitle download sites

downsub.com, savesubs.com, downloadyoutubesubtitles.com and similar tools wrap YouTube's caption API in a one-click web interface.

How to use

  1. Paste the YouTube URL into the tool.
  2. Click Download.
  3. Pick SRT or VTT format.
  4. Download the file.

Pros

Cons

Method 4: mdisbetter for full transcript (not just captions)

If what you actually want is a readable, accurate, structured transcript — not a caption overlay file — the right tool is /convert/video-to-markdown (or, for YouTube specifically, /convert/youtube-video-to-markdown).

How to use

  1. Paste the YouTube URL.
  2. Click Convert.
  3. Wait 60-120 seconds.
  4. Download the .md file.

What you get vs. SRT

What you do not get: a video-player-compatible SRT/VTT file. If you need the timing-coded overlay format, use yt-dlp; if you need readable text, use mdisbetter.

Quick comparison table

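| Method | Format | Accuracy | Setup | Speed |
| --- | --- | --- | --- | --- |
| YouTube built-in (own videos) | SRT/VTT | YouTube ASR | None | Instant |
| yt-dlp | SRT/VTT | YouTube ASR | CLI install | ~10s |
| Third-party sites | SRT/VTT/TXT | YouTube ASR | None | ~10s |
| mdisbetter | Markdown transcript | 96-98% | None | 60-120s |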

Multi-language workflows

If you need subtitles in multiple languages, the cleanest workflow:

  1. Download the original-language subtitles via yt-dlp (most accurate timing).
  2. Translate the SRT file to your target languages using a tool that preserves SRT timing: Subtitle Edit (free, Windows) or a pass through Claude/GPT with the SRT format preserved. If the target language is English, Whisper's translate mode is another option, though it works from the audio rather than from the SRT file.
  3. Verify the timing on a sample chunk before bulk-translating.
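Steps 2 and 3 can be sketched as follows: translate only the text of each cue, leave the index and timing lines untouched, then confirm the timing is identical before trusting a bulk run. The `translate` argument is a placeholder for whatever tool you use (here a hypothetical dictionary lookup stands in for it), and all function names are illustrative.

```python
import re
from typing import Callable, List, Tuple

Cue = Tuple[str, str, str]  # (index, timing line, text)


def parse_srt(srt_text: str) -> List[Cue]:
    """Split SRT text into (index, timing, text) tuples."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) >= 3 and "-->" in lines[1]:
            cues.append((lines[0].strip(), lines[1].strip(), "\n".join(lines[2:])))
    return cues


def translate_srt(srt_text: str, translate: Callable[[str], str]) -> str:
    """Translate cue text only; index and timing lines pass through unchanged."""
    out = []
    for index, timing, text in parse_srt(srt_text):
        out.append(f"{index}\n{timing}\n{translate(text)}\n")
    return "\n".join(out)


def timing_matches(original: str, translated: str) -> bool:
    """True when both files have the same cue indexes and timing lines."""
    def skeleton(srt_text: str) -> List[Tuple[str, str]]:
        return [(index, timing) for index, timing, _ in parse_srt(srt_text)]
    return skeleton(original) == skeleton(translated)


sample_srt = (
    "1\n00:00:00,000 --> 00:00:02,000\nhello\n\n"
    "2\n00:00:02,000 --> 00:00:04,000\nthank you\n"
)
lookup = {"hello": "hola", "thank you": "gracias"}  # stand-in for a real translator
translated = translate_srt(sample_srt, lambda text: lookup.get(text, text))
print(translated)
print(timing_matches(sample_srt, translated))
```

Translating cue by cue keeps timing safe but gives the translator little context, since sentences are often split across cues. Sending several cues at once and checking the result with `timing_matches` is a reasonable compromise.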

For a transcript-first workflow (read in original language, then translate the prose), pull the structured Markdown from mdisbetter and feed it to your AI of choice with a translation prompt. The H2 structure and timestamps survive the translation cleanly.

What about region-locked or age-gated videos?

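yt-dlp handles most age gates and login-restricted videos with the --cookies-from-browser flag, which uses your local browser's cookies to authenticate the request. Region locks are a partial exception: cookies help only if your logged-in session can actually play the video from where you are.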

yt-dlp --cookies-from-browser chrome \
  --write-subs --sub-langs en --skip-download \
  "https://www.youtube.com/watch?v=VIDEO_ID"

This works because yt-dlp is just acting as your browser would. If you can view the video while logged in, yt-dlp can fetch its captions while authenticated as you.
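If you fetch captions for many videos, it can help to build the command in a script instead of retyping it. The sketch below only assembles the argument list; the helper name `build_subtitle_command` and its parameters are hypothetical, and the flag spellings follow yt-dlp's documented option names.

```python
import shlex
from typing import List, Optional


def build_subtitle_command(
    url: str,
    langs: str = "en",
    auto: bool = False,
    cookies_browser: Optional[str] = None,
) -> List[str]:
    """Return a yt-dlp argument list for a subtitles-only download."""
    command = ["yt-dlp", "--skip-download", "--write-subs", "--sub-langs", langs]
    if auto:
        # Also request auto-generated captions.
        command.append("--write-auto-subs")
    if cookies_browser:
        # Authenticate with cookies from the named local browser.
        command += ["--cookies-from-browser", cookies_browser]
    command.append(url)
    return command


cmd = build_subtitle_command(
    "https://www.youtube.com/watch?v=VIDEO_ID",
    cookies_browser="chrome",
)
# Print a copy-pasteable shell command; nothing is executed here.
print(shlex.join(cmd))
```

To run it for real, pass the list to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string avoids shell-quoting problems with URLs that contain `&` or `?`.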

Subtitles vs. transcripts: the honest summary

For accessibility overlays and video editor input, SRT/VTT subtitles are the right format and yt-dlp is the right tool. For everything else — reading, searching, AI input, study notes, content repurposing — a structured Markdown transcript is the right format and mdisbetter is the right tool. The two are complementary, not competitive. Most people who initially ask "how do I download YouTube subtitles" actually want the second category once they realize the first is just timing-coded fragments of low-accuracy text.

For deeper context on the transcript side, see how to download a YouTube transcript. For the AI workflow specifically, see ChatGPT can't watch your YouTube video. For the broader "my video content is invisible to AI" pattern, see your YouTube videos are invisible to AI. For non-YouTube video sources, see how to get a transcript from Vimeo and how to transcribe a TikTok video.

SRT format basics for video editors

If you are working with the downloaded SRT in a video editor (Premiere, DaVinci Resolve, Final Cut, CapCut, Descript), the file imports as a captions track. Each cue's timing is preserved automatically. You can edit the text inline in the editor's caption panel: fixing mishearings, adjusting line breaks for readability, restyling the on-screen presentation. For burnt-in caption styling (matching the MrBeast / TikTok visual treatment), most editors let you apply caption styles globally to the imported SRT track.

Common SRT pitfalls

Quick UTF-8 conversion

# Re-save an SRT as UTF-8
iconv -f WINDOWS-1252 -t UTF-8 input.srt > output.srt

# Or in Python
with open("input.srt", encoding="cp1252") as src:
    data = src.read()
with open("output.srt", "w", encoding="utf-8") as dst:
    dst.write(data)

For most modern workflows the UTF-8 default is fine. The conversion above only matters when consuming legacy files.
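When it is unclear whether a file is legacy or already UTF-8, a try-then-fall-back decode avoids mangling files that were fine to begin with. This is a minimal sketch that assumes any non-UTF-8 file is Windows-1252, as in the conversion above; the function name `decode_srt_bytes` is hypothetical.

```python
def decode_srt_bytes(raw: bytes) -> str:
    """Decode as UTF-8 (a leading BOM is tolerated), else fall back to cp1252."""
    try:
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is Windows-1252.
        return raw.decode("cp1252")


modern = "café".encode("utf-8")
legacy = "café".encode("cp1252")

print(decode_srt_bytes(modern))  # café
print(decode_srt_bytes(legacy))  # café
```

Read the file with `open(path, "rb")`, pass the bytes through the function, and write the result back with `encoding="utf-8"`. Files in other legacy encodings will still decode incorrectly, so spot-check accented characters afterwards.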

One last note on the difference between caption tracks and transcripts

Three months from now, when you have a folder of SRT files and a folder of Markdown transcripts of the same videos, you will use them differently. The SRT files will sit in your video editor's media bin and only get touched when you are working on a specific video edit. The Markdown transcripts will be in your daily knowledge base, getting searched, quoted, and AI-queried regularly. The two artifacts serve fundamentally different needs and the right answer is to have both for any video that is important enough to invest in.

Frequently asked questions

Can I get subtitles from a YouTube Short?

Sometimes. YouTube Shorts often have auto-generated captions, but the public 'Show transcript' panel does not always render for Shorts. yt-dlp works on Shorts URLs the same way as regular videos. For the most reliable result on a Short with no caption track, transcribe the audio directly via mdisbetter, whose audio extraction handles Shorts URLs identically to regular videos.

Will downloaded YouTube subtitles include speaker labels?

No. YouTube's caption track does not contain speaker information; it is a single stream of text regardless of how many people are speaking. To get speaker labels, you need to re-transcribe the audio with a diarization-capable tool. The mdisbetter video-to-markdown output includes speaker labels where multiple voices are detected; the local WhisperX route also produces them.

What's the difference between SRT and VTT?

Both are caption file formats with similar timing structure. SRT (SubRip) is older, simpler, and supported by virtually every video editor and player. VTT (WebVTT) is newer, supports styling and positioning metadata, and is the native format for HTML5 video. For most workflows the formats are interchangeable: yt-dlp can deliver either, and ffmpeg converts between them with text and timing intact (VTT-only styling and positioning are dropped when going to SRT): 'ffmpeg -i input.vtt output.srt'.