Video to Markdown for Journalists — Video Sources as Text
Press conferences livestream on YouTube and disappear into the algorithm. Video interviews on Zoom take 6-10 hours to transcribe manually. Broadcast footage you need to quote sits as MP4 with no searchable text. None of this works on a deadline. Paste the URL or upload the video to mdisbetter and the structured Markdown is back in minutes: each speaker labelled, every quote timestamped to the video for verification, the whole thing greppable across your source archive.
Why this is hard without the right tool
- Press conferences need transcription within the hour, not the next day
- Video interviews need verbatim quotes you can attribute and verify
- Broadcast footage needs a documented, quotable text record
- All of it arrives under deadline pressure
Recommended workflow
- For press conferences livestreamed on YouTube / Twitch / Twitter: paste the URL into /convert/video-to-markdown as soon as the stream ends
- For Zoom interviews you recorded: download the recording, upload the MP4
- For broadcast footage, leaked video, social-media video evidence: upload the file directly
- Convert — minutes per hour of video, not hours per hour
- Download the Markdown: speakers labelled, quotes timestamped, structured as **Reporter:**/**Source:** exchanges
- Use ctrl-F to find quotes by keyword; jump to the timestamp in your video player to verify the verbatim wording AND the speaker's non-verbal cues before publication
- Build a personal source-archive folder of .md transcripts — searchable across every video source you've ever logged
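The structure is easiest to see in a snippet. This is an illustrative sketch of what a converted transcript could look like; the speaker turns, timestamps, and wording are all invented:

```markdown
**Reporter:** [12:34] Will the council revisit the rezoning decision?

**Source:** [12:41] We have no plans to reopen that vote this year.

**Reporter:** [12:52] Even if the court challenge succeeds?

**Source:** [12:58] We'll respond to the ruling when it comes.
```

Each `[mm:ss]` timestamp maps back to the original video, which is what makes the verification step below possible.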
Deadline workflow: from livestream end to filed story
Press conference ends at 3pm, deadline is 6pm. Old workflow: re-watch the recording at 1.5x for two hours pulling quotes manually, hit deadline with maybe four usable quotes from a 90-minute event. New workflow: paste the YouTube livestream URL into mdisbetter as the conference ends, get the structured transcript back in 5-10 minutes, ctrl-F for the topics relevant to your beat, pull twenty quotes with timestamps, verify each against the video before filing, hit deadline with depth. The transcription speed-up is the difference between covering one angle and covering five.
Verification discipline: never publish a quote you can't play back
The timestamps in the Markdown output ([12:34] next to each speaker turn) map back to the original video. Before any quote ships, jump to the timestamp in your video player and confirm both the verbatim wording AND the speaker's tone, expression, context. The transcript is a draft; the video is the source of truth. Treat the Markdown as a fast index into your video, not a replacement for it. This is the same discipline pre-AI tools required, just faster — you can verify 30 quotes in the time it used to take to transcribe one.
Multi-source video story workflow
When a story pulls from a press conference, three video interviews, and broadcast footage — six video sources across two weeks — an Obsidian vault of .md transcripts becomes a research workspace. Cross-reference quotes from different sources by topic. Build a timeline of who said what when. Use ripgrep across the whole vault to find every video source that mentioned a particular policy. None of this is possible with video files in a folder; all of it falls out for free once the transcripts are Markdown.
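The "find every video source that mentioned a particular policy" step is a one-liner once the vault is plain Markdown. A minimal sketch, with invented filenames and quotes:

```shell
# Build a tiny illustrative vault (filenames and quotes are invented)
mkdir -p vault
printf '**Source:** [14:02] The rezoning policy passed in March.\n' > vault/2024-03-01-presser.md
printf '**Source:** [03:11] I was never consulted on rezoning.\n' > vault/2024-03-12-interview.md

# List every transcript that mentions the topic, case-insensitively
grep -rli 'rezoning' vault/
# ripgrep equivalent: rg -li 'rezoning' vault/
```

Both files match, so both filenames print. The same pattern scales to hundreds of transcripts, which is the whole point of keeping them as text.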
Privacy note for protected sources
For genuinely sensitive video sources — whistleblower video, off-the-record video interviews, leaked footage where any cloud upload is a serious risk — DO NOT use the mdisbetter web tool. Run whisper or faster-whisper entirely offline on your laptop after extracting audio from the video with ffmpeg locally. Same accuracy, zero network egress, no cloud-side processing. The web tool is the right speed/convenience tradeoff for the 90% of video sources where the source isn't at risk; for the 10% where it is, the OSS path keeps everything on your own hardware. This matters more for video than audio because video can identify a source's appearance, location, surroundings — much harder to anonymise than voice.
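The offline path can be sketched in two commands. Filenames here are placeholders, and the model choice is a judgment call (larger models are slower but more accurate):

```shell
# 1. Strip the audio track locally; nothing leaves the machine
ffmpeg -i interview.mp4 -vn -ar 16000 -ac 1 audio.wav

# 2. Transcribe on your own hardware with the openai-whisper CLI
whisper audio.wav --model medium --output_format txt
```

The same audio file also works with faster-whisper if you prefer its speed; either way, delete the intermediate files from disk once the transcript is verified.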
Cross-link to PDF source documents and webpages
Most investigative stories pull from PDFs (court filings, leaked memos) and webpages (press releases, archived posts) alongside video sources. Convert PDFs with /convert/pdf-to-markdown and store alongside video transcripts. Same vault, same searchable corpus, video quotes and document quotes side by side, all greppable. Format consistency across source types is what makes long investigations actually navigable.