What mdisbetter outputs today
Audio to Markdown produces structured transcripts with inline timestamps: typically [12:34] markers next to each speaker turn or topic shift, plus the text itself. This is far more useful than flat text for most workflows (show notes, content repurposing, search), but it is not SRT. For SRT specifically, you need a conversion step from our Markdown output.
From mdisbetter Markdown to SRT
The conversion is mechanical: each timestamp plus the text that follows it becomes one SRT cue. A short Python script (roughly 10-15 lines that extract the timestamp/text pairs and emit SRT blocks) handles it cleanly. Alternatively, paste the Markdown into ChatGPT or Claude with a prompt like "convert this timestamped Markdown transcript to SRT subtitle format with 3-second cues"; this works for most files in one pass.
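A minimal sketch of that script, assuming the transcript uses [mm:ss] or [hh:mm:ss] markers at the start of each line (the sample transcript below is illustrative, not actual mdisbetter output). Each cue runs until the next marker; the last cue gets a fixed fallback duration:

```python
import re

# Hypothetical sample of timestamped Markdown output; the real
# marker format from mdisbetter may differ slightly.
MARKDOWN = """\
[00:05] Welcome to the show.
[00:12] Today we're talking about subtitles.
[01:02] Let's get started.
"""

# Matches [mm:ss] or [hh:mm:ss] at the start of a line, then the text.
TIMESTAMP_RE = re.compile(r"\[(\d{1,2}):(\d{2})(?::(\d{2}))?\]\s*(.*)")

def parse_cues(markdown):
    """Extract (start_seconds, text) pairs from timestamp markers."""
    cues = []
    for line in markdown.splitlines():
        m = TIMESTAMP_RE.match(line.strip())
        if not m:
            continue  # heading, blank line, or untimestamped prose
        a, b, c, text = m.groups()
        if c is None:                       # [mm:ss]
            start = int(a) * 60 + int(b)
        else:                               # [hh:mm:ss]
            start = int(a) * 3600 + int(b) * 60 + int(c)
        cues.append((start, text))
    return cues

def srt_time(seconds):
    """Format whole seconds as an SRT timestamp, HH:MM:SS,mmm."""
    h, rem = divmod(int(seconds), 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d},000"

def to_srt(cues, last_cue_duration=3):
    """Each cue ends where the next begins; the last uses a fixed duration."""
    blocks = []
    for i, (start, text) in enumerate(cues):
        end = cues[i + 1][0] if i + 1 < len(cues) else start + last_cue_duration
        blocks.append(f"{i + 1}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(blocks)

print(to_srt(parse_cues(MARKDOWN)))
```

Deriving each cue's end time from the next cue's start keeps subtitles on screen for the full speaking turn, which usually reads better than a fixed per-cue duration.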
For direct SRT generation, use OSS Whisper
OpenAI's open-source whisper command-line tool generates SRT directly: whisper input.mp3 --output_format srt writes the SRT file as part of normal operation. It runs the same class of model that many commercial transcription services use, is MIT-licensed, and runs locally. For one-off SRT generation, this is the simplest path. For larger batch SRT work, faster-whisper is the speed-optimised reimplementation of the same models.
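A typical invocation looks like the following; the model choice and output directory are illustrative, and smaller models trade accuracy for speed:

```shell
# Install the open-source Whisper CLI (requires ffmpeg on PATH)
pip install openai-whisper

# Transcribe and write input.srt into ./subs; --model and
# --output_dir are optional, defaults are "turbo"-class and "."
whisper input.mp3 --output_format srt --model small --output_dir ./subs
```

faster-whisper is distributed as a Python library rather than a drop-in replacement for this CLI, so batch pipelines usually call it from a small script instead.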
SRT export coming to mdisbetter
Direct SRT output is on our roadmap. For now, the Markdown-with-timestamps output covers the underlying need for most users (transcription with timing data), and the conversion to SRT is mechanical for those who need it. As demand justifies it, we'll add a one-click SRT export option to the audio converter UI.