How to Get Subtitles from Any YouTube Video (Free Methods)
SRT and VTT subtitles are useful for accessibility overlays, video editors, multi-language workflows, and offline viewing. They are also notoriously fiddly to extract from YouTube: the official path is hidden, third-party tools come and go, and even when you get the file, the subtitle format may not be what you actually need. Here are the working free methods in 2026, plus an honest discussion of when what you want is not subtitles but a real transcript.
Subtitles vs. transcript — pick the right thing
Before downloading anything, decide which artifact you actually need.
- SRT / VTT subtitles are time-coded chunks of 1-3 second display text designed to overlay on video playback. Each cue has a start and end timestamp and 1-3 lines of caption text. Format example:
00:01:23,400 --> 00:01:25,800
This is what the speaker says
across two short lines.
- A transcript is the full prose text of what was spoken, usually with paragraph breaks, sentence-level punctuation, and (in good versions) speaker labels and section headings. Format example:
**Sarah Chen**: This is what the speaker says, in a complete punctuated sentence with a paragraph break at the end of the thought.
Subtitles are for displaying caption overlays. Transcripts are for reading, searching, AI input, and any work that treats spoken content as text. If you are downloading "subtitles" because you want to read what was said, you almost certainly want a transcript instead.
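To make the difference concrete, here is a minimal Python sketch that parses SRT cues and then flattens them into plain text. The names `Cue`, `parse_srt`, and `to_plain_text` are made up for this illustration, and the parser only handles well-formed files. Notice what the flattened output lacks: no paragraphs, no speaker labels, and only whatever punctuation the caption track happened to carry.

```python
import re
from dataclasses import dataclass

# Matches an SRT timing line such as "00:01:23,400 --> 00:01:25,800".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    lines: list[str]


def _seconds(h, m, s, ms):
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(text):
    """Parse SRT text into Cue objects (illustrative, not a full parser)."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        rows = block.splitlines()
        for i, row in enumerate(rows):
            match = TIMING.search(row)
            if match:
                g = match.groups()
                # Everything after the timing line is caption text.
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), rows[i + 1:]))
                break
    return cues


def to_plain_text(cues):
    """Join cue text into one run of words, dropping all timing."""
    return " ".join(line.strip() for cue in cues for line in cue.lines)


SAMPLE = """1
00:01:23,400 --> 00:01:25,800
This is what the speaker says
across two short lines.
"""

if __name__ == "__main__":
    print(to_plain_text(parse_srt(SAMPLE)))
```

Flattening like this is fine for a quick skim, but it is not a substitute for a real transcript.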
When you genuinely want SRT/VTT
- Adding captions to your own video edit (your video editor wants SRT/VTT input).
- Generating multi-language subtitle tracks for accessibility.
- Building a video player overlay where caption timing matters.
- Re-uploading a video to a platform that wants caption files separately.
When you actually want a transcript
- Reading what was said.
- Pasting into ChatGPT or Claude.
- Building searchable notes.
- Repurposing a video into blog/social/newsletter.
- Studying from a lecture.
The methods below cover both — start with what you need.
Method 1: YouTube built-in subtitle download
The official path is convoluted and not exposed in the standard viewer interface. The fastest route through it:
- Open the video on desktop YouTube.
- Click Show transcript from the three-dot menu under the video.
- The transcript panel does not give you a direct SRT download. To get the file format, you need a workaround.
- If you uploaded the video, YouTube Studio → Subtitles → click the language → Download SRT or VTT. (This only works for videos you own.)
- For videos you do not own, use the URL trick: open https://video.google.com/timedtext?lang=en&v=VIDEO_ID in a browser to fetch the raw caption XML, then convert to SRT with a small script.
This path is increasingly fragile: YouTube has been deprecating the public timedtext endpoints. As of 2026 the timedtext URL works inconsistently across regions and video types, and may return nothing at all. Most people skip this method and use yt-dlp instead.
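If the endpoint does return XML, the "small script" can look roughly like the sketch below. It assumes the legacy response shape, a `<transcript>` root containing `<text start="..." dur="...">` elements with times in seconds. That shape is an assumption here, and other caption formats will need different parsing. The function names are illustrative.

```python
import html
import xml.etree.ElementTree as ET


def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm timestamp that SRT expects."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def timedtext_to_srt(xml_text):
    """Convert caption XML of the assumed <text start= dur=> shape to SRT."""
    root = ET.fromstring(xml_text)
    cues = []
    for index, node in enumerate(root.iter("text"), start=1):
        start = float(node.get("start", "0"))
        end = start + float(node.get("dur", "0"))
        # Caption text is often HTML-escaped inside the XML, so unescape it.
        body = html.unescape(node.text or "").strip()
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{body}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)


SAMPLE_XML = (
    "<transcript>"
    '<text start="83.4" dur="2.4">This is what the speaker says</text>'
    '<text start="85.8" dur="1.5">and it&amp;#39;s captioned.</text>'
    "</transcript>"
)

if __name__ == "__main__":
    print(timedtext_to_srt(SAMPLE_XML))
```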
Method 2: yt-dlp (the reliable open-source option)
yt-dlp is the actively-maintained successor to youtube-dl and the most reliable way to download YouTube subtitles in any format. Free, open source, runs locally.
Install
# Python
pip install -U yt-dlp
# Or via package manager
brew install yt-dlp # macOS
sudo apt install yt-dlp # Debian/Ubuntu
winget install yt-dlp # Windows
Download manual subtitles (if creator-uploaded)
# List available subtitle languages
yt-dlp --list-subs "https://www.youtube.com/watch?v=VIDEO_ID"
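# Download English subtitles as SRT
# (--convert-subs needs FFmpeg; it converts the file if YouTube serves another format)
yt-dlp --write-subs --sub-langs en --skip-download \
--sub-format "srt/best" --convert-subs srt \
"https://www.youtube.com/watch?v=VIDEO_ID"
Download auto-generated subtitles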
# Auto-generated captions only
yt-dlp --write-auto-subs --sub-langs en --skip-download \
--sub-format "srt/best" --convert-subs srt \
"https://www.youtube.com/watch?v=VIDEO_ID"
Download both manual and auto, all languages
# "all" can include many auto-translated tracks, so expect a lot of files
yt-dlp --write-subs --write-auto-subs --sub-langs "all" \
--skip-download --sub-format "srt/vtt" \
"https://www.youtube.com/watch?v=VIDEO_ID"
Convert VTT to SRT after download
If yt-dlp gives you VTT and you want SRT, FFmpeg handles it:
ffmpeg -i input.vtt output.srt
Pros
- Free, open source, actively maintained.
- Reliable — survives most YouTube interface changes.
- Handles every video that has a caption track.
- Can pull subtitles for an entire playlist or channel in one command.
Cons
- CLI tool — not friendly for non-technical users.
- Inherits YouTube's auto-caption accuracy problems (15-20% word error rate, or WER, on technical content).
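For the VTT-to-SRT conversion step in this method, FFmpeg is the robust choice. If it is not available, a minimal Python sketch like the one below covers simple files: it renumbers cues, swaps the millisecond separator, adds a missing hours field, and strips inline tags. The function name `vtt_to_srt` is illustrative. The sketch drops styling and positioning, and it does not de-duplicate the rolling repeated lines that auto-generated tracks often contain.

```python
import re

# VTT timing line; the hours field is optional and cue settings may follow.
VTT_TIMING = re.compile(
    r"^((?:\d{2,}:)?\d{2}:\d{2})\.(\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2})\.(\d{3})"
)
INLINE_TAG = re.compile(r"<[^>]+>")  # e.g. <c>, </c>, <00:00:01.000>


def _with_hours(stamp):
    """SRT always carries an hours field; VTT may omit it."""
    return stamp if stamp.count(":") == 2 else "00:" + stamp


def vtt_to_srt(vtt_text):
    """Convert simple WebVTT text to SRT. Styling and positioning are dropped."""
    cues = []
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        rows = block.splitlines()
        for i, row in enumerate(rows):
            match = VTT_TIMING.match(row.strip())
            if not match:
                continue  # header, NOTE/STYLE block, or cue identifier
            start, start_ms, end, end_ms = match.groups()
            text = [INLINE_TAG.sub("", r).strip() for r in rows[i + 1:]]
            text = [r for r in text if r]
            if text:
                timing = (
                    f"{_with_hours(start)},{start_ms} --> "
                    f"{_with_hours(end)},{end_ms}"
                )
                cues.append((timing, text))
            break
    return "\n".join(
        f"{n}\n{timing}\n" + "\n".join(text) + "\n"
        for n, (timing, text) in enumerate(cues, start=1)
    )


SAMPLE_VTT = """WEBVTT
Kind: captions
Language: en

00:01:23.400 --> 00:01:25.800 align:start position:0%
This is what the <c>speaker</c> says

01:25.800 --> 01:27.300
across two cues.
"""

if __name__ == "__main__":
    print(vtt_to_srt(SAMPLE_VTT))
```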
Method 3: Third-party subtitle download sites
downsub.com, savesubs.com, downloadyoutubesubtitles.com and similar tools wrap YouTube's caption API in a one-click web interface.
How to use
- Paste the YouTube URL into the tool.
- Click Download.
- Pick SRT or VTT format.
- Download the file.
Pros
- No install, no CLI, works on any device.
- Often includes one-click downloads for multiple languages.
- Some offer auto-translation to other languages.
Cons
- Reliability comes and goes — sites break when YouTube changes its caption API.
- Privacy varies — some log all submissions.
- Heavy ads.
- Same accuracy ceiling as the underlying YouTube caption track.
Method 4: mdisbetter for full transcript (not just captions)
If what you actually want is a readable, accurate, structured transcript — not a caption overlay file — the right tool is /convert/video-to-markdown (or, for YouTube specifically, /convert/youtube-video-to-markdown).
How to use
- Paste the YouTube URL.
- Click Convert.
- Wait 60-120 seconds.
- Download the .md file.
What you get vs. SRT
- 96-98% word accuracy on the audio (vs. 84-86% in YouTube auto-captions). Re-transcribed from the audio, not pulled from YouTube's existing caption track.
- H2 section breaks at topic shifts, with timestamps in the heading:
## [12:34] Pricing strategy
- Speaker labels where multiple voices are detected.
- Real punctuation and proper sentence boundaries.
- Markdown format — readable in any text editor, indexable by Spotlight/Obsidian, AI-ready out of the box.
What you do not get: a video-player-compatible SRT/VTT file. If you need the timing-coded overlay format, use yt-dlp; if you need readable text, use mdisbetter.
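If you want to check accuracy figures like the ones above against your own videos, word error rate (WER) is the usual yardstick. The sketch below assumes "word accuracy" means one minus WER, which this article does not state explicitly. It computes a word-level edit distance against a reference you have corrected by hand. The function name and sample sentences are illustrative.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


REFERENCE = "we deploy the new service with kubernetes and terraform today"
HYPOTHESIS = "we deploy the new service with cooper netties and terraform today"

if __name__ == "__main__":
    # One substitution plus one insertion over ten reference words.
    wer = word_error_rate(REFERENCE, HYPOTHESIS)
    print(f"WER: {wer:.0%}, word accuracy: {1 - wer:.0%}")
```

A realistic comparison would also normalize punctuation and numerals before splitting into words. This sketch only lowercases.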
Quick comparison table
| Method | Format | Accuracy | Setup | Speed |
|---|---|---|---|---|
| YouTube built-in (own videos) | SRT/VTT | YouTube ASR | None | Instant |
| yt-dlp | SRT/VTT | YouTube ASR | CLI install | ~10s |
| Third-party sites | SRT/VTT/TXT | YouTube ASR | None | ~10s |
| mdisbetter | Markdown transcript | 96-98% | None | 60-120s |
Multi-language workflows
If you need subtitles in multiple languages, the cleanest workflow:
- Download the original-language subtitles via yt-dlp (most accurate timing).
- Translate the SRT file to your target languages using a translation tool that preserves SRT timing — Subtitle Edit (free, Windows), the Whisper translate-to-English mode, or a pass through Claude/GPT with the SRT format preserved.
- Verify the timing on a sample chunk before bulk-translating.
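The "preserve SRT timing" requirement in the steps above boils down to translating only the caption text and passing the index and timing lines through untouched. Here is a minimal sketch of that idea. The `translate` argument is a placeholder for whatever translation call you use, and `translate_srt` and `timing_lines` are illustrative names. Each cue's text is joined into a single line before translation, so re-wrapping long lines is left to you.

```python
import re

TIMING_LINE = re.compile(
    r"^\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2}:\d{2}:\d{2},\d{3}"
)


def translate_srt(srt_text, translate):
    """Apply `translate` to caption text only; index and timing pass through."""
    out_blocks = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        rows = block.splitlines()
        # Find the timing line; everything after it is caption text.
        timing_at = next(
            (i for i, r in enumerate(rows) if TIMING_LINE.match(r.strip())), None
        )
        if timing_at is None:
            out_blocks.append(block)  # not a cue, keep unchanged
            continue
        header = rows[: timing_at + 1]
        caption = " ".join(r.strip() for r in rows[timing_at + 1:])
        out_blocks.append("\n".join(header + [translate(caption)]))
    return "\n\n".join(out_blocks) + "\n"


def timing_lines(srt_text):
    """Return just the timing lines, for a before/after comparison."""
    return [r for r in srt_text.splitlines() if TIMING_LINE.match(r.strip())]


SOURCE = (
    "1\n00:00:01,000 --> 00:00:03,000\nhello there\nfriend\n\n"
    "2\n00:00:03,000 --> 00:00:05,000\ngoodbye\n"
)

if __name__ == "__main__":
    # str.upper stands in for a real translation function.
    result = translate_srt(SOURCE, str.upper)
    assert timing_lines(result) == timing_lines(SOURCE)
    print(result)
```

Comparing `timing_lines` before and after is a cheap way to do the sample check from the last step on a whole file.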
For a transcript-first workflow (read in original language, then translate the prose), pull the structured Markdown from mdisbetter and feed it to your AI of choice with a translation prompt. The H2 structure and timestamps survive the translation cleanly.
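It is worth confirming that the structure really did survive before you trust a translated file. The sketch below assumes headings follow the `## [12:34] Title` pattern described earlier and simply compares the heading timestamps of the original and the translation. The function names are illustrative.

```python
import re

# Matches headings such as "## [12:34] Pricing strategy" or "## [1:02:03] Q&A".
HEADING = re.compile(r"^##[ \t]+\[(\d{1,2}(?::\d{2}){1,2})\][ \t]*(.*)$", re.MULTILINE)


def heading_timestamps(markdown):
    """Return the timestamps of all timestamped H2 headings, in order."""
    return [m.group(1) for m in HEADING.finditer(markdown)]


def structure_survived(original, translated):
    """True if both texts carry the same heading timestamps in the same order."""
    return heading_timestamps(original) == heading_timestamps(translated)


ORIGINAL = "## [00:00] Intro\n\nHello.\n\n## [12:34] Pricing strategy\n\nText.\n"
TRANSLATED = "## [00:00] Einleitung\n\nHallo.\n\n## [12:34] Preisstrategie\n\nText.\n"

if __name__ == "__main__":
    print(heading_timestamps(ORIGINAL))
    print(structure_survived(ORIGINAL, TRANSLATED))
```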
What about region-locked or age-gated videos?
yt-dlp handles most age gates, and region locks that your own logged-in browser session can already get past, with the --cookies-from-browser flag, which uses your local browser's cookies to authenticate the request:
yt-dlp --cookies-from-browser chrome \
--write-subs --sub-langs en --skip-download \
"https://www.youtube.com/watch?v=VIDEO_ID"
This works because yt-dlp is just acting as your browser would. If you can view the video while logged in, yt-dlp can fetch its captions while authenticated as you.
Subtitles vs. transcripts: the honest summary
For accessibility overlays and video editor input, SRT/VTT subtitles are the right format and yt-dlp is the right tool. For everything else — reading, searching, AI input, study notes, content repurposing — a structured Markdown transcript is the right format and mdisbetter is the right tool. The two are complementary, not competitive. Most people who initially ask "how do I download YouTube subtitles" actually want the second category once they realize the first is just timing-coded fragments of low-accuracy text.
For deeper context on the transcript side, see how to download a YouTube transcript. For the AI workflow specifically, see ChatGPT can't watch your YouTube video. For the broader "my video content is invisible to AI" pattern, see your YouTube videos are invisible to AI. For non-YouTube video sources, see how to get a transcript from Vimeo and how to transcribe a TikTok video.
SRT format basics for video editors
If you are working with the downloaded SRT in a video editor (Premiere, DaVinci Resolve, Final Cut, CapCut, Descript), the file imports as a captions track. Each cue's timing is preserved automatically. You can edit the text inline in the editor's caption panel: fixing mishearings, adjusting line breaks for readability, restyling the on-screen presentation. For burnt-in caption styling (matching the MrBeast / TikTok visual treatment), most editors let you apply caption styles globally to the imported SRT track.
Common SRT pitfalls
- Encoding. SRT files should be UTF-8 to handle non-ASCII characters (em-dashes, smart quotes, accented letters). Some older Windows tools default to Windows-1252 which corrupts these characters silently.
- Line break length. Subtitle best practice is max 42 characters per line, max 2 lines per cue. Auto-generated YouTube SRTs often violate this.
- Cue duration. Subtitles should display for at least 1 second and ideally not more than 6 seconds. Auto-generated cues are usually fine; manually-edited ones can drift.
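The line-length and duration guidelines above are easy to check mechanically. The sketch below flags cues that break them, using the thresholds from this list (42 characters, 2 lines, 1 to 6 seconds). The function name `lint_srt` is illustrative, and cues are numbered by their position in the file rather than by the index written in the SRT.

```python
import re

TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)
MAX_CHARS, MAX_LINES = 42, 2
MIN_SECONDS, MAX_SECONDS = 1.0, 6.0


def _seconds(h, m, s, ms):
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def lint_srt(srt_text):
    """Return (cue_number, message) pairs for cues that break the guidelines."""
    problems = []
    cue_number = 0
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        rows = block.splitlines()
        for i, row in enumerate(rows):
            match = TIMING.search(row)
            if not match:
                continue
            cue_number += 1
            g = match.groups()
            duration = _seconds(*g[4:]) - _seconds(*g[:4])
            text = rows[i + 1:]
            if len(text) > MAX_LINES:
                problems.append((cue_number, f"{len(text)} lines (max {MAX_LINES})"))
            for line in text:
                if len(line) > MAX_CHARS:
                    problems.append(
                        (cue_number, f"line has {len(line)} chars (max {MAX_CHARS})")
                    )
            if duration < MIN_SECONDS:
                problems.append((cue_number, f"shown for {duration:.2f}s (min {MIN_SECONDS}s)"))
            elif duration > MAX_SECONDS:
                problems.append((cue_number, f"shown for {duration:.2f}s (max {MAX_SECONDS}s)"))
            break
    return problems


SAMPLE = """1
00:00:01,000 --> 00:00:03,000
Short and fine.

2
00:00:03,000 --> 00:00:03,400
This single line is deliberately far too long for one caption row.

3
00:00:04,000 --> 00:00:12,000
one
two
three
"""

if __name__ == "__main__":
    for number, message in lint_srt(SAMPLE):
        print(f"cue {number}: {message}")
```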
Quick UTF-8 conversion
# Re-save an SRT as UTF-8
iconv -f WINDOWS-1252 -t UTF-8 input.srt > output.srt
# Or in Python
with open("input.srt", encoding="cp1252") as f:
    data = f.read()
with open("output.srt", "w", encoding="utf-8") as f:
    f.write(data)
For most modern workflows the UTF-8 default is fine. The conversion above only matters when consuming legacy files.
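If you are not sure whether a given file is a legacy one, a slightly safer variant tries UTF-8 first and only falls back to Windows-1252 when decoding fails. This is a heuristic sketch with illustrative function names. It assumes every non-UTF-8 file is Windows-1252, which will not hold for files in other legacy encodings.

```python
from pathlib import Path


def read_srt_text(path):
    """Decode an SRT file as UTF-8, falling back to Windows-1252."""
    raw = Path(path).read_bytes()
    try:
        # utf-8-sig also strips a leading byte-order mark if one is present.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is Windows-1252.
        return raw.decode("cp1252")


def resave_as_utf8(src, dst):
    """Write `src` to `dst` as UTF-8 regardless of the source encoding."""
    text = read_srt_text(src)
    # newline="" keeps the file's original line endings untouched.
    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(text)
```

Running `resave_as_utf8("input.srt", "output.srt")` on a file that is already UTF-8 leaves its text unchanged, so it is safe to apply across a whole folder.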
One last note on the difference between caption tracks and transcripts
Three months from now, when you have a folder of SRT files and a folder of Markdown transcripts of the same videos, you will use them differently. The SRT files will sit in your video editor's media bin and only get touched when you are working on a specific video edit. The Markdown transcripts will be in your daily knowledge base, getting searched, quoted, and AI-queried regularly. The two artifacts serve fundamentally different needs and the right answer is to have both for any video that is important enough to invest in.