May 10, 2026 · 9 min read · MDisBetter

How to Transcribe a Zoom Meeting Recording (Free Guide)

Zoom built transcription into the platform, but only on paid Business and Enterprise tiers. For everyone on Pro or Free, the official path is "upgrade your plan or live without transcripts." There is a free path that produces equally clean (often better) transcripts: record locally or to the cloud as you already do, download the file, run it through MDisBetter or local Whisper, and get a structured Markdown transcript with speaker labels in 2-3 minutes. Here is exactly how, plus the privacy alternative for sensitive meetings.

Zoom's built-in options (and their limits)

For context, here's what Zoom natively offers in 2026:

Plan	Live transcription / closed captions	Post-meeting transcript	AI Companion summary
Free	No	No	No
Pro	No	No	Limited
Business	Yes	Yes (with cloud recording)	Yes
Enterprise	Yes	Yes	Yes

If you're on Pro or Free, the official transcription path is closed. The workaround in this guide opens it up at no per-meeting cost.

Step 1: Get the recording file

Two paths depending on your Zoom setup.

Cloud recording (Pro and above)

Sign into the Zoom web portal at zoom.us
Go to Recordings → Cloud Recordings
Find the meeting, click the meeting name
Download the Shared screen with speaker view file (MP4) — this is the file with audio you want

Local recording (any plan, including Free)

If recording was set to Local during the meeting, the files were saved on the host's machine. Default location:

Mac: ~/Documents/Zoom/[Meeting folder]/
Windows: C:\Users\[user]\Documents\Zoom\[Meeting folder]\

Inside the folder, you'll find:

zoom_0.mp4 — the full video recording
audio_only.m4a — audio-only version (smaller, faster to upload)
chat.txt — chat log if anyone typed in the chat

For transcription, the M4A audio-only file is what you want — no point uploading 500MB of video when 30MB of audio gives you the same transcript.
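If you record often, locating the newest audio file can be scripted. This is a minimal sketch that assumes the default Documents/Zoom layout and the audio_only.m4a file name described above; find_latest_audio is a hypothetical helper, not part of any Zoom tooling.

```python
from pathlib import Path
from typing import Optional


def find_latest_audio(zoom_dir: Path) -> Optional[Path]:
    """Return the most recently modified audio_only.m4a one folder below zoom_dir."""
    # Assumption: each meeting has its own folder directly under zoom_dir.
    candidates = list(zoom_dir.glob("*/audio_only.m4a"))
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

A typical call would be find_latest_audio(Path.home() / "Documents" / "Zoom"). If your Zoom version names the audio file differently, adjust the glob pattern.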

Step 2: Upload to MDisBetter

Open video to Markdown. Upload the M4A (or MP4 if you only have the video). Click Convert. Wait 1-3 minutes for a 30-60 minute meeting.

What you get:

# Meeting Recording 2026-05-10

**Duration:** 47:18

## [00:00] Opening / agenda check

**Speaker 1:** Okay, let's get started. Today we're covering the Q3 launch
and the hiring update.

**Speaker 2:** Sounds good. Should we add the budget review at the end?

**Speaker 1:** Yes, let's bookend with that.

## [03:14] Q3 launch plan

**Speaker 1:** Where are we on the launch landing page?

**Speaker 2:** It's about 70% there. I'll have a draft by Wednesday.

## [12:42] Hiring update

...

Speaker labels are auto-detected (diarization). For 2-4 speakers with reasonable mic separation (each person on their own laptop/headset), accuracy is high. For 5+ speakers or shared-mic setups (multiple people in a conference room on one device), diarization degrades — speakers blur together.
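One rough way to sanity-check diarization is to count words per speaker label: a label with only a handful of words in a long meeting may indicate that two voices were merged or one was split. The sketch below assumes the transcript follows the layout shown in the sample above (turns written as **Speaker N:** under ## headings); words_per_speaker is a hypothetical helper.

```python
import re
from collections import Counter

# Matches a turn such as "**Speaker 1:** Okay, let's get started."
TURN = re.compile(r"^\*\*(?P<speaker>[^*]+):\*\*\s*(?P<text>.*)$")


def words_per_speaker(markdown: str) -> Counter:
    """Count words per speaker label, including wrapped continuation lines."""
    counts: Counter = Counter()
    in_section = False  # skip header fields such as **Duration:** before the first ## heading
    current = None
    for line in markdown.splitlines():
        if line.startswith("## "):
            in_section, current = True, None
            continue
        if not in_section:
            continue
        match = TURN.match(line)
        if match:
            current = match.group("speaker")
            counts[current] += len(match.group("text").split())
        elif not line.strip():
            current = None  # a blank line ends the current turn
        elif current:
            counts[current] += len(line.split())  # wrapped line of the same turn
    return counts
```

The counts are only a hint. Listening to a few timestamps is still the reliable check.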

Step 3: Use the transcript

Three immediate uses, depending on what the meeting was for.

Action items

Run the prompt from our action items guide. The structured transcript is exactly what the prompt needs — speaker labels and timestamps included. Output is a per-owner action list ready to drop into Slack.

Searchable archive

Save the .md file in your team's knowledge base (Notion, Obsidian, Google Drive, Confluence). Six months later, when someone asks "what did we decide about X," the answer is searchable.
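If the transcripts live in a plain folder rather than a tool with built-in search, a few lines of Python cover the basic case. This is a sketch assuming one .md file per meeting somewhere under a single root folder; search_transcripts is a hypothetical name.

```python
from pathlib import Path


def search_transcripts(root: Path, query: str) -> list[tuple[str, int, str]]:
    """Return (file name, line number, line) for each line containing query, ignoring case."""
    needle = query.lower()
    hits = []
    for md_file in sorted(root.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((md_file.name, number, line.strip()))
    return hits
```

Because each transcript line carries a speaker label, a hit shows who said it, and the nearest heading above it gives the timestamp.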

Meeting recap email

Run a separate prompt:

Below is a meeting transcript. Write a meeting recap email:
- Subject: Recap — [meeting name]
- 1-paragraph summary of what was discussed
- Decisions made (bulleted, with who decided)
- Action items (per owner, with deadlines if mentioned)
- Open questions / parked items
- Next meeting date if mentioned

Transcript: [PASTE]

Send to all attendees. Done.
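If you reuse the recap prompt for every meeting, filling it in can be scripted. The sketch below assumes the prompt text above is kept in a string with its two placeholders, [meeting name] and [PASTE], left intact; fill_recap_prompt is a hypothetical helper.

```python
def fill_recap_prompt(template: str, meeting_name: str, transcript: str) -> str:
    """Fill the recap prompt's placeholders with a meeting name and a transcript."""
    # Substitute the name first so the transcript body is never scanned for
    # the "[meeting name]" placeholder.
    prompt = template.replace("[meeting name]", meeting_name)
    return prompt.replace("[PASTE]", transcript)
```

The result is plain text that can be pasted into whichever assistant you use.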

Privacy alternative: Whisper local

For confidential meetings (legal, HR, financial, M&A discussions), uploading to a third-party service may not be acceptable. The local Whisper path keeps everything on your machine.

Setup

pip install faster-whisper

Transcribe a Zoom recording locally

from faster_whisper import WhisperModel
from pathlib import Path

# 'large-v3' is the most accurate but slow without a GPU;
# on a CPU-only machine, swap in 'medium' instead.
model = WhisperModel('large-v3', device='auto', compute_type='int8')

audio_path = 'audio_only.m4a'
segments, info = model.transcribe(audio_path, beam_size=5)

lines = ['# Meeting Recording', '']
for s in segments:
    mm, ss = divmod(int(s.start), 60)
    lines.append(f'[{mm:02d}:{ss:02d}] {s.text.strip()}')

Path('transcript.md').write_text('\n'.join(lines), encoding='utf-8')

This produces a flat transcript with timestamps. For speaker labels (diarization), add WhisperX as in our batch guide. For the action-item extraction, use a local LLM (Ollama running Llama 3 or Mistral) instead of cloud Claude/ChatGPT — keeps the entire pipeline private.
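One detail of the script above: it prints minutes and seconds only, so a meeting longer than an hour shows minute values past 59. A small helper can switch to H:MM:SS once needed. This is a sketch; format_timestamp is a hypothetical name and not part of faster-whisper.

```python
def format_timestamp(seconds: float) -> str:
    """Format a segment start time as MM:SS, or H:MM:SS from one hour onward."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"
```

In the transcription loop, the divmod line and the f-string would then become lines.append(f'[{format_timestamp(s.start)}] {s.text.strip()}').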

Comparison: 5 ways to transcribe a Zoom meeting

Method	Cost	Privacy	Quality	Setup
Zoom built-in (Business+)	~$20/host/mo	Zoom servers	Good	Already on plan
MDisBetter web tool	Free tier	Server-side processing	Excellent (diarized)	None
Otter / Fireflies bot	~$10-30/seat/mo	Vendor servers	Excellent	Calendar integration
Local Whisper	Free	Fully local	Excellent (with WhisperX)	Python + GPU pref
Manual typing	Free + 4-6 hrs	Fully local	Variable	None

The right choice depends on your situation. For most one-off post-meeting transcription, MDisBetter is the path of least resistance. For recurring meeting workflows with team distribution, Otter/Fireflies are purpose-built. For privacy-critical meetings, use local Whisper. Zoom's built-in transcription is good if you're already on a Business or Enterprise plan, but this guide is written for the users on Pro and Free who aren't.

Common pitfalls

Uploading the video instead of the audio

The MP4 from Zoom can be hundreds of MB. The M4A audio-only file is 5-10x smaller and gives identical transcription quality (the audio is the same; the video isn't transcribed). Always grab the audio_only file when both are available.
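When only the MP4 is available, the audio track can be pulled out locally before uploading. The sketch below assumes ffmpeg is installed and on the PATH, and that the recording's audio track is AAC so it can be copied into an .m4a container without re-encoding; if the copy fails, replacing "copy" with "aac" re-encodes instead. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def extract_audio_command(video: Path) -> list[str]:
    """Build the ffmpeg command that copies the audio track into an .m4a next to the video."""
    audio = video.with_suffix(".m4a")
    # -vn drops the video stream; -c:a copy keeps the audio bytes untouched.
    return ["ffmpeg", "-i", str(video), "-vn", "-c:a", "copy", str(audio)]


def extract_audio(video: Path) -> Path:
    """Run ffmpeg and return the path of the extracted audio file."""
    subprocess.run(extract_audio_command(video), check=True)
    return video.with_suffix(".m4a")
```

Calling extract_audio(Path("zoom_0.mp4")) would write zoom_0.m4a in the same folder.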

Forgetting to enable recording

The most common failure mode. Build a habit: at the start of any meeting that matters, the host clicks Record before any substantive content starts. Some teams script this with a Slack reminder bot.

Bad audio quality kills accuracy

The biggest quality factor isn't the transcription tool, it's the audio source. Tips:

Each participant on their own device with their own headset/mic — best
Conference room with multiple participants on one device — accuracy drops, speaker labels blur
Outdoor / cafe participants on AirPods — workable but noticeable
Speakerphone in a noisy room — accuracy can drop to 80%

Sensitive content uploaded to cloud anyway

If your team policy or industry regulation forbids cloud transcription of meeting content (HIPAA, attorney-client privilege, M&A discussions), use the local Whisper path even though it's slower to set up. The compliance posture is binary, not gradual.

What about Microsoft Teams and Google Meet?

Same workflow with different file paths.

Microsoft Teams: recording goes to OneDrive/SharePoint. Download the MP4 from there. Same upload-to-converter step.

Google Meet: recording (paid Workspace tiers only) goes to the host's Google Drive. Download the MP4. Same upload step.

Both produce MP4 files compatible with the converter. The transcript output and downstream prompts are identical.

For real-time / live captioning

None of the methods above produce live captions during the meeting — they're all post-meeting workflows. For live in-meeting captioning, the right tools are: Zoom's built-in (Business+), Otter Live, Tactiq's Chrome extension for Meet/Teams (for accessibility), or assistive tech specifically built for live speech-to-text. The post-meeting upload pattern trades real-time for cost and privacy.

Storage and lifecycle

After transcription, decide what to keep:

Always keep: the .md transcript (small, searchable, the canonical artifact going forward)
Consider keeping: the audio file (in case you need to re-transcribe with a better model later, or verify a disputed quote)
Probably delete: the video file (huge, rarely re-watched)

For sensitive meetings, schedule auto-deletion of recordings after the transcript is verified.
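For local recordings, the cleanup can start as a script that lists old video files for review. This is a minimal sketch assuming recordings sit under a single folder; videos_older_than is a hypothetical helper and the 30-day window in the usage note is an arbitrary example. It only lists candidates and deletes nothing itself.

```python
import time
from pathlib import Path
from typing import Optional


def videos_older_than(
    zoom_dir: Path, max_age_days: float, now: Optional[float] = None
) -> list[Path]:
    """List .mp4 files under zoom_dir last modified more than max_age_days ago."""
    reference = time.time() if now is None else now
    cutoff = reference - max_age_days * 86400  # seconds per day
    return sorted(p for p in zoom_dir.rglob("*.mp4") if p.stat().st_mtime < cutoff)
```

After checking the list from videos_older_than(folder, 30) and confirming the transcripts are verified, each path can be removed with path.unlink(). Cloud recordings need to be deleted separately in the Zoom web portal.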

Recommendation

For your next Zoom meeting that matters: record it (cloud or local), download the audio_only.m4a afterward, and drop it into video to Markdown. After about 5 minutes of post-meeting work, you have a structured transcript and (with the action-item prompt) a clean recap email ready to send. For the broader workflow including action items, see extract action items from meeting recordings; for batch and OSS approaches see batch transcribe a YouTube playlist (same Whisper stack); for the audio-only equivalent (phone calls, conference calls without video), use audio to Markdown.

Frequently asked questions

Can MDisBetter join my Zoom meeting as a bot to transcribe live?

No. MDisBetter is a post-meeting upload tool, not a meeting bot. The workflow is: use Zoom's built-in record-to-cloud or record-to-local feature, download the resulting M4A or MP4 after the meeting ends, upload to the converter. For real-time bots that auto-join your scheduled meetings, look at Otter, Fireflies, or Granola — that's their domain, not ours, and they're worth the subscription if your job is back-to-back meetings.

What if Zoom's recording is split across multiple files?

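Long meetings sometimes get split into chunks (Zoom defaults to 1 GB per file for cloud recording). Two options: (1) concatenate the audio files first using ffmpeg's concat demuxer: list the chunks in a text file, one per line in the form file 'file1.m4a', then run ffmpeg -f concat -safe 0 -i list.txt -c copy combined.m4a and upload the combined file. This is best for action-item extraction since the AI sees the whole context. (The "concat:file1.m4a|file2.m4a" protocol syntax is meant for stream formats such as MPEG-TS and does not join M4A/MP4 files correctly.) (2) Transcribe each chunk separately and stitch the .md files manually. Option 1 is cleaner.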
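Option 1 can be scripted with ffmpeg's concat demuxer, which reads the chunk names from a list file and is the usual way to join M4A files. This is a sketch assuming ffmpeg is installed and the chunk paths contain no single quotes; write_concat_list is a hypothetical helper that writes the list file and returns the command without running it.

```python
from pathlib import Path


def write_concat_list(chunks: list[Path], list_file: Path, output: Path) -> list[str]:
    """Write an ffmpeg concat list for chunks (in order) and return the command to run."""
    # Each line of the list has the form: file '/absolute/path/to/chunk.m4a'
    lines = [f"file '{chunk.resolve().as_posix()}'" for chunk in chunks]
    list_file.write_text("\n".join(lines) + "\n", encoding="utf-8")
    # -safe 0 lets the demuxer accept absolute paths; -c copy avoids re-encoding.
    return [
        "ffmpeg", "-f", "concat", "-safe", "0",
        "-i", str(list_file), "-c", "copy", str(output),
    ]
```

Pass the returned list to subprocess.run(command, check=True) to produce the combined file. Keep the chunks in recording order, since the demuxer joins them exactly as listed.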

How do I handle meetings where some participants didn't consent to recording?

Don't transcribe those meetings, or get explicit consent before processing. Consent rules vary by jurisdiction: some require every participant's consent to record while others require only one party's, so check the rules that apply to everyone on the call. Processing the recording (transcription) inherits the same consent requirements. If a participant later objects, the cleanest move is to delete both the recording and any derived transcripts. For meetings where consent is established, mention at the start of the call that transcription will happen. Most participants are fine with it once they know.