Coming soon — what we're building
Markdown to Audio applies text-to-speech (TTS) to Markdown documents: it converts your .md files into MP3 or WAV audio you can listen to. Use cases include listening to your Obsidian or Notion notes during a commute, producing audio versions of long-form blog posts drafted in Markdown, accessibility (audio renderings of written documentation), reviewing written research by ear while doing other things, and generating audiobook-style content from Markdown source material. We're building this. Today we support only the inverse direction: transcription via Audio to Markdown.
What's special about Markdown→Audio vs generic TTS
The structural cues in Markdown should drive the audio output. H1/H2 headings should be followed by a slightly longer pause than paragraph breaks. Code blocks should be either skipped or read with prefix/suffix audio cues ("code block: ... end code block"). Tables should be summarised rather than read row by row, which would be unbearable. Inline emphasis (**bold**, *italic*) should be reflected in vocal stress when the TTS engine supports it. Generic TTS doesn't know about Markdown structure; a Markdown-specific tool does.
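To make the idea concrete, here is a minimal Python sketch of how structure could be turned into a speech plan before any TTS engine is involved. It is an illustration, not our implementation: the `Segment` type, the function names, and the pause lengths are all assumptions chosen for the example, every heading level gets the longer pause, and emphasis markers are simply dropped rather than mapped to vocal stress.

```python
import re
from dataclasses import dataclass

HEADING_PAUSE = 1.0    # seconds of silence after a heading (assumed value)
PARAGRAPH_PAUSE = 0.5  # seconds of silence after other blocks (assumed value)


@dataclass
class Segment:
    text: str           # what the TTS engine should say
    pause_after: float  # silence to insert after this segment


def strip_inline(text: str) -> str:
    """Drop emphasis and inline-code markers, keeping the words."""
    text = re.sub(r"\*{1,3}([^*]+)\*{1,3}", r"\1", text)
    return text.replace("`", "")


def summarise_table(rows: list[str]) -> str:
    """Describe a pipe table instead of reading it row by row."""
    header = [cell.strip() for cell in rows[0].strip("|").split("|")]
    # Rows made only of pipes, dashes, colons and spaces are separators.
    body = [r for r in rows[1:] if not re.fullmatch(r"[|\s:-]+", r)]
    return f"Table with {len(body)} rows. Columns: {', '.join(header)}."


def plan_segments(markdown: str) -> list[Segment]:
    segments: list[Segment] = []
    paragraph: list[str] = []
    table: list[str] = []
    in_code = False

    def flush() -> None:
        # Emit whichever block is currently being collected.
        if paragraph:
            text = strip_inline(" ".join(paragraph))
            segments.append(Segment(text, PARAGRAPH_PAUSE))
            paragraph.clear()
        if table:
            segments.append(Segment(summarise_table(table), PARAGRAPH_PAUSE))
            table.clear()

    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("```"):
            flush()
            in_code = not in_code
            cue = "Code block." if in_code else "End code block."
            segments.append(Segment(cue, PARAGRAPH_PAUSE))
        elif in_code:
            continue  # the code body is skipped; only the cues are spoken
        elif line.startswith("#"):
            flush()
            title = strip_inline(line.lstrip("#").strip())
            segments.append(Segment(title, HEADING_PAUSE))
        elif line.startswith("|"):
            if paragraph:
                flush()
            table.append(line)
        elif not line:
            flush()
        else:
            if table:
                flush()
            paragraph.append(line)
    flush()
    return segments
```

Each `Segment` would then be synthesised separately and joined with the requested silence. Keeping the plan apart from the synthesis step means the same plan can be fed to any engine.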
What's the timeline?
Honest answer: this is downstream of our base text-to-speech feature, which is itself on the roadmap but secondary to the transcription direction we're focused on. If you have a strong use case for Markdown→audio specifically, let us know; concrete demand moves it up the roadmap. In the meantime, the alternatives listed below cover the gap.
Alternatives if you need this today
- Coqui TTS + a small script to strip Markdown markers before feeding to TTS — DIY but free and local.
- Real-Time Voice Cloning for custom voices on top of TTS.
- Commercial routes: pipe Markdown through pandoc to plain text, then through ElevenLabs or OpenAI TTS API for high-quality output.
- Natural Reader: a consumer TTS app that handles documents and has a free tier; it works for casual document-to-audio needs.
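The "small script to strip Markdown markers" mentioned in the first option could look roughly like the sketch below. It uses only the Python standard library and a handful of regular expressions, so it covers common syntax (headings, bullets, blockquotes, links, images, emphasis, fenced code) and not the full Markdown grammar. The function name `markdown_to_plain` is made up for this example.

```python
import re


def markdown_to_plain(md: str) -> str:
    """Reduce common Markdown syntax to plain text suitable for TTS input."""
    # Drop fenced code blocks entirely.
    text = re.sub(r"```.*?```", "", md, flags=re.DOTALL)
    # Images keep their alt text; links keep their link text.
    text = re.sub(r"!\[([^\]]*)\]\([^)]*\)", r"\1", text)
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)
    # Remove leading heading, blockquote and bullet markers.
    text = re.sub(r"^[ \t]{0,3}(#{1,6}|>|[-*+])[ \t]+", "", text, flags=re.MULTILINE)
    # Remove emphasis markers and inline-code backticks.
    text = re.sub(r"\*{1,3}([^*\n]+)\*{1,3}", r"\1", text)
    text = text.replace("`", "")
    # Collapse runs of blank lines left behind by removed blocks.
    return re.sub(r"\n{3,}", "\n\n", text).strip()


if __name__ == "__main__":
    sample = "# Notes\n\nSee [the docs](https://example.com) for **details**.\n"
    print(markdown_to_plain(sample))
```

The returned string is what you would hand to whichever TTS engine you use; the engine call is left out here because it depends on the engine and model you have installed. For anything beyond simple notes, pandoc's plain-text writer (`pandoc -t plain`) handles far more of the syntax than these regular expressions do.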
For the inverse direction (which we DO support)
If you have audio and want the Markdown text out of it (transcription), that's our core tool: Audio to Markdown. Upload any audio file and get structured Markdown back, with speakers labelled, topics as H2 sections, and timestamps inline. It's the full transcription workflow we've built for podcasters, journalists, researchers, and many others.