May 10, 2026 · 9 min read · MDisBetter

How to Convert Word to Markdown (5 Methods Compared, 2026)

You have a Word document (maybe a draft, a contract, a technical spec, or an old report) and you want it in Markdown, perhaps to feed it to Claude or ChatGPT, drop it into Obsidian, push it to a GitHub repo, or import it into Notion. A .docx file is a zip archive of XML, which is the wrong format for almost every modern workflow except editing in Word itself. Markdown is plain text, version-controllable, LLM-friendly, and renders almost everywhere. Here are the five real ways to make the jump in 2026, with honest trade-offs.

The five methods at a glance

Method	Setup	Best for	Quality	Cost
1. MDisBetter web tool	None	One-off conversions, anyone	~96%	Free
2. Pandoc CLI	Install Pandoc	Power users, batch, complex docs	~98%	Free (OSS)
3. Mammoth.js library	npm install	Developers building tooling	~95%	Free (OSS)
4. Word native export	None (Word installed)	Quick HTML pivot	~70%	Free if you have Word
5. Copy-paste	None	Almost nothing	~30%	Free

Method 1: The MDisBetter web tool (easiest)

Open the Word to Markdown converter, drop your .docx file on the page, click Convert, download the .md. Three clicks, zero setup, no account required. The output is GitHub-Flavored Markdown — H1 to H6 preserved, ordered and unordered lists with proper nesting, tables converted to pipe syntax, images extracted with alt text, bold/italic/code spans intact.

This is the right method for 90% of users. Single document, occasional use, no install. The web tool is one file at a time — for batch needs, drop down to Pandoc (Method 2).

Pros

Zero install, zero learning curve
Works on any OS, any browser
Free, no signup
Handles 95%+ of standard Word features cleanly

Cons

One file at a time (no batch)
Requires uploading the file (not local)
No customisation of conversion rules

Method 2: Pandoc CLI (most powerful)

Pandoc is the gold-standard universal document converter. Open source (GPL), maintained for over 15 years, ships with reliable .docx parsing. If you're a developer, sysadmin, or technical writer with more than a handful of files, this is the right tool.

Install on macOS:

brew install pandoc

On Ubuntu/Debian:

sudo apt-get install pandoc

On Windows, grab the installer from pandoc.org.

Convert a single file:

pandoc -f docx -t gfm input.docx -o output.md

The -f docx flag declares the source format, and -t gfm targets GitHub-Flavored Markdown (the de facto modern standard). For images, add --extract-media=./media to write embedded images into a folder:

pandoc -f docx -t gfm --extract-media=./media input.docx -o output.md

Batch convert every .docx in a folder:

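for f in *.docx; do
  # One media folder per document, so same-named images (e.g. image1.png)
  # extracted from different files cannot overwrite each other
  pandoc -f docx -t gfm --extract-media="./media/${f%.docx}" "$f" -o "${f%.docx}.md"
done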

Pros

Highest quality on complex documents (footnotes, citations, equations)
Scriptable, batchable, automatable
Free, no quota, no internet required
Reads and writes dozens of other formats too (HTML, LaTeX, EPUB, reStructuredText, and more)

Cons

Requires install + command line
Steep learning curve for advanced flags
No GUI

Method 3: Mammoth.js library (for developers)

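Mammoth.js is a BSD-licensed JavaScript library for converting .docx to HTML and Markdown. Use it when you're building a Node.js or browser tool that needs in-process Word conversion. Note that Mammoth's own documentation marks its Markdown output as deprecated and recommends generating HTML and converting that to Markdown with a separate library, so treat convertToMarkdown as a convenience rather than the long-term path.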

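// npm install mammoth
const mammoth = require('mammoth');

// convertToMarkdown returns a promise that resolves to { value, messages }
mammoth.convertToMarkdown({ path: 'input.docx' })
  .then(result => {
    console.log(result.value); // the Markdown
    console.log(result.messages); // any warnings
  })
  .catch(err => console.error('Conversion failed:', err)); // e.g. missing or corrupt file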

Mammoth has a slightly different philosophy from Pandoc: it focuses on semantic content (paragraphs, lists, tables, headings) and ignores Word's presentational fluff (font sizes, spacing, page breaks). The result is cleaner Markdown but loses some formatting that Pandoc preserves.

Pros

In-process JS — perfect for Node.js servers and browser apps
Clean semantic output
Custom style mapping (map specific Word styles to specific HTML/MD)

Cons

JavaScript only
Less powerful than Pandoc on edge cases
Only a basic CLI, with far fewer options than Pandoc's

Method 4: Word native export (limited)

Word can export to HTML or plain text natively. Neither is Markdown, but both are stepping stones.

Export to HTML: File → Save As → Web Page (.html). Then run the HTML through any HTML-to-Markdown converter. The HTML is famously bloated — Word generates inline styles, MSO-specific tags, and font declarations everywhere. You'll spend more time cleaning the HTML than just using a real tool.

Export to plain text: File → Save As → Plain Text (.txt). You lose every bit of formatting — headings become paragraphs, tables become tab-separated lines, lists become indented text. Useless for anything but the simplest documents.

Pros

No external tool needed
Works offline

Cons

HTML output is bloated and needs significant post-processing
Plain text loses all structure
Not actually Markdown

Method 5: Copy-paste (mostly broken)

Open Word, select all, copy, paste into a Markdown editor. What happens? Best case, you get plain text with no structure. Common case, the editor tries to interpret rich-text clipboard data and produces broken output: headings become bold paragraphs, lists lose their bullets, tables collapse into pipe-separated lines without alignment.

Word places several formats on the clipboard at once: an HTML fragment, RTF, and plain text. Different editors pick different sources, so results are unpredictable. Don't rely on this method for anything important.

Pros

Zero setup

Cons

Inconsistent across editors
Loses structure (headings, lists, tables)
Strips images entirely
Broken whitespace

How to choose

One document, want it now: MDisBetter web tool
10+ documents, comfortable with command line: Pandoc CLI
Building a Node.js app or service: Mammoth.js
You only have Word and need quick rough Markdown: Word HTML export + post-process
Anything important: not copy-paste

What about images?

The MDisBetter web tool extracts embedded images and includes them in the download as a zip when present. Pandoc with --extract-media=./media dumps images to a folder. Mammoth.js exposes images via a callback so you can save or upload them. Word native HTML export embeds images as base64 (huge file) or as separate files in a sidecar folder. Copy-paste loses images.

What about tables?

Tables are the hardest part of any Word→Markdown conversion. Simple grids convert cleanly to GFM pipe tables. Merged cells, multi-row headers, and nested tables don't have a clean Markdown representation — every tool either flattens them or drops them. For the deep dive, read our Word tables to Markdown guide.

What if my Word doc is huge?

The web tool handles documents up to 50 MB. For anything bigger (a 500-page book manuscript, a documentation archive), use Pandoc locally — no upload, no size limit, just disk space. For a multi-document migration workflow, see convert multiple Word documents to Markdown.

Common downstream use cases

Feed an LLM

Markdown is the most reliable format to send to ChatGPT, Claude, or Gemini. Tokens drop 50-70% versus pasting Word's underlying XML. Convert, paste, ask.
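The size of that saving depends heavily on the document, so it can be worth measuring on your own files. The sketch below is a rough check, not a tokenizer: it compares byte counts, which only approximate token counts. The function name percent_saved is made up for this example.

```shell
# Rough proxy for token savings: compares byte counts, not real tokens.
# percent_saved is a hypothetical helper name, not part of any tool.
percent_saved() {
  xml_bytes=$1
  md_bytes=$2
  # Integer arithmetic: share of the original size that was removed
  echo $(( (xml_bytes - md_bytes) * 100 / xml_bytes ))
}

# Usage (assumes unzip is installed; word/document.xml is the main body
# part inside a typical .docx archive):
#   xml=$(unzip -p input.docx word/document.xml | wc -c)
#   md=$(wc -c < output.md)
#   percent_saved "$xml" "$md"
```

For an exact figure, run both texts through the tokenizer of the model you actually use.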

Migrate to Obsidian

Drop the .md into your vault. Done. For a full migration walkthrough, see migrate your Word library to Obsidian.

Import to Notion

Notion's import accepts .md files cleanly, and often does better than its native Word import, which tends to lose formatting. See import Word documents to Notion.

Push to GitHub docs

Convert the Word doc, drop the .md in your docs/ folder, version-control it. See migrate Word documentation to GitHub.

Build a static site

Use the .md as input to MkDocs, Docusaurus, Hugo, or Jekyll. See build a MkDocs site from Word documents.

What about other source formats?

If your starting point isn't Word, MDisBetter has dedicated converters for each: PDF to Markdown for scanned and digital PDFs, URL to Markdown for web pages, Audio to Markdown for transcripts, Video to Markdown for video transcription. Same Markdown quality across all, same downstream uses.

Edge cases worth knowing

Track changes and comments

If your Word doc has unresolved track changes, the converters all default to accepting the changes (showing the final text, dropping the revision history). Comments are dropped by default by Pandoc and the web tool; Mammoth also ignores them unless you add a style mapping for comment references. To preserve comments as Markdown annotations, run Pandoc with --track-changes=all and the comments appear inline as marked spans.
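As a minimal sketch, the Pandoc invocation could be wrapped like this. The wrapper name convert_with_revisions is hypothetical. Because GFM has no native span syntax, expect the insertions, deletions, and comments to be written as inline HTML spans carrying author and date attributes.

```shell
# Convert a .docx while keeping tracked insertions, deletions, and comments.
# convert_with_revisions is a hypothetical wrapper name.
convert_with_revisions() {
  input=$1
  # --track-changes accepts: accept (the default), reject, or all
  pandoc -f docx -t gfm --track-changes=all "$input" -o "${input%.docx}.md"
}

# Usage:
#   convert_with_revisions contract.docx   # writes contract.md
```

Switching the flag to --track-changes=reject gives you the text as it stood before the edits, which can be handy for a before/after diff.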

Cross-references

Word's auto-numbered cross-references ("see Section 3.2") lose their dynamic linking in Markdown. Pandoc converts them to plain text with the resolved number, and other tools behave similarly. After conversion, use find-and-replace in your editor to turn "see Section 3.2" into proper Markdown links to the relevant section anchors.
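The find-and-replace step can also be scripted. Here is a minimal sketch with sed, assuming two-level section numbers (3.2, not 3.2.1) and assuming each target heading has an anchor of the form section-3-2. That anchor scheme is invented for this example; GitHub and most static site generators derive anchors from the heading text instead, so adjust the replacement to match yours.

```shell
# Turn "Section 3.2" into "[Section 3.2](#section-3-2)".
# link_section_refs and the #section-N-M anchor scheme are assumptions
# made for this sketch; only two-level numbers are handled.
link_section_refs() {
  sed -E 's/Section ([0-9]+)\.([0-9]+)/[Section \1.\2](#section-\1-\2)/g' "$1"
}

# Usage: prints the rewritten Markdown to stdout
#   link_section_refs output.md > output.linked.md
```

Run it once per file: a second pass would match the text inside the links it just created and wrap them again.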

Page numbers, headers, footers

None of these survive conversion — Markdown has no page concept. Page numbers in Word headers/footers, recurring page-level metadata, all of it is dropped. This is correct behaviour: Markdown is a flow-based format, not a paginated one.

Custom Word styles

If your Word doc uses custom paragraph styles ("Important Quote", "Sidebar Box"), the converters typically map them to plain paragraphs (losing the visual styling) or to bold/italic spans (losing the semantics). For docs that rely heavily on custom styles, consider a Pandoc filter or Mammoth's style mapping to map specific styles to specific Markdown constructs.
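Before a Pandoc filter can act on a custom style, the style name has to survive the read step. One way to do that, sketched here, is the docx reader's styles extension, which tags content with a custom-style attribute. The sketch targets Pandoc Markdown rather than GFM because its div and span syntax shows the attribute compactly; that choice and the wrapper name are assumptions.

```shell
# Keep Word style names in the output so they can be post-processed.
# convert_keep_styles is a hypothetical wrapper name.
convert_keep_styles() {
  input=$1
  # docx+styles enables Pandoc's "styles" extension for the docx reader
  pandoc -f docx+styles -t markdown "$input" -o "${input%.docx}.md"
}

# Usage:
#   convert_keep_styles handbook.docx   # writes handbook.md
```

In the result, look for blocks tagged with custom-style="Important Quote" (or whatever your styles are called); a filter or a search-and-replace can then turn them into blockquotes or callouts.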

Embedded objects

OLE-embedded Excel ranges, PowerPoint slides inside Word, and embedded PDFs are all problematic. Most tools either convert them to a static image or drop them entirely. For data that lives in embedded Excel, extract it separately to .csv and reference it from the Markdown.

Choosing the right output flavour

Markdown comes in flavours: CommonMark, GitHub-Flavored Markdown (GFM), MultiMarkdown, Pandoc Markdown. For modern destinations (GitHub, Obsidian, Notion, MkDocs, Hugo), GFM is the right default in 2026: it is a superset of CommonMark that adds tables, task lists, strikethrough, and autolinks. The MDisBetter web tool outputs GFM. Pandoc lets you choose: -t gfm, -t commonmark, or -t markdown (Pandoc's extended flavour).
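If you are unsure which flavour your destination handles best, one option is to convert the same document to all three and compare. The following is a small sketch; the function name and the output naming scheme (spec.gfm.md and so on) are made up for this example.

```shell
# Convert one .docx to three Markdown flavours for side-by-side comparison.
# convert_all_flavours and the <name>.<flavour>.md naming are assumptions.
convert_all_flavours() {
  input=$1
  for flavour in gfm commonmark markdown; do
    pandoc -f docx -t "$flavour" "$input" -o "${input%.docx}.$flavour.md"
  done
}

# Usage:
#   convert_all_flavours spec.docx
#   diff spec.gfm.md spec.commonmark.md
```

Tables are usually where the outputs diverge most, so a document with a few tables makes a good test case.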

Recommendation

For one-off conversions, the web tool is the path of least resistance. For ongoing workflows or batch needs, install Pandoc and bookmark the one-liner above. For tooling, Mammoth.js. Word's native export and copy-paste are last-resort options when nothing else is available — and they will cost you cleanup time on the other side. The combination of MDisBetter for casual one-offs and Pandoc for serious work covers virtually every Word-to-Markdown need without paying anyone, and gives you a fallback when one tool struggles on a specific document.

Frequently asked questions

Does Pandoc preserve Word footnotes and citations?

Yes. Of the tools compared here, Pandoc has the strongest footnote and citation handling. Footnotes convert to GFM footnote syntax ([^1]). For citations, Pandoc's docx reader has a citations extension (-f docx+citations) that parses citations inserted by the Zotero, Mendeley, or EndNote plugins, and Pandoc can write bibliography data as BibTeX or CSL JSON. Mammoth.js drops footnote markers but extracts the text; the web tool preserves footnotes as Markdown footnotes.

Can I convert .doc (old Word format) to Markdown?

Not directly with most tools — they expect .docx (the modern XML-based format). To handle .doc files, open them in Word or LibreOffice and re-save as .docx, then run the .docx through Pandoc or the web tool. LibreOffice can also do this from the command line: soffice --headless --convert-to docx old.doc.
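For a folder of legacy files, the two steps can be chained. This is a sketch that assumes both soffice and pandoc are on your PATH; the function name convert_legacy_docs is hypothetical.

```shell
# Convert every legacy .doc in a folder to .docx, then to Markdown.
# convert_legacy_docs is a hypothetical helper name.
convert_legacy_docs() {
  dir=$1
  for f in "$dir"/*.doc; do
    # If there are no .doc files the glob stays unexpanded, so skip it
    [ -e "$f" ] || continue
    # Step 1: LibreOffice writes <name>.docx next to the original
    soffice --headless --convert-to docx --outdir "$dir" "$f" || continue
    # Step 2: Pandoc turns the new .docx into GitHub-Flavored Markdown
    pandoc -f docx -t gfm "${f%.doc}.docx" -o "${f%.doc}.md"
  done
}

# Usage:
#   convert_legacy_docs ./archive
```

The intermediate .docx files are left in place so you can inspect them if a conversion looks wrong.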

Will the converted Markdown work with Hugo/Jekyll/Astro static site generators?

Yes — the output is standard GitHub-Flavored Markdown, which all major static site generators accept. You may want to add YAML frontmatter at the top (title, date, slug) for the SSG to pick it up correctly. A simple shell script can prepend frontmatter to every converted file in seconds.
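Such a script might look like the sketch below. It assumes the title can be taken from the filename and that today's date is acceptable; the function name add_frontmatter is made up, and a slug field is left out because its format depends on your site generator.

```shell
# Prepend minimal YAML frontmatter to a Markdown file.
# add_frontmatter is a hypothetical helper; title comes from the filename.
add_frontmatter() {
  file=$1
  # Leave files alone if they already start with a frontmatter delimiter
  if [ "$(head -n 1 "$file")" = "---" ]; then
    return 0
  fi
  title=$(basename "$file" .md)
  tmp=$(mktemp)
  {
    printf '%s\n' '---'
    printf 'title: "%s"\n' "$title"
    printf 'date: %s\n' "$(date +%Y-%m-%d)"
    printf '%s\n\n' '---'
    cat "$file"
  } > "$tmp"
  # Write back in place so the original file keeps its permissions
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Usage: apply to every converted file in the current folder
#   for f in *.md; do add_frontmatter "$f"; done
```

Filenames containing double quotes would produce invalid YAML with this sketch, so rename those files first or escape the title.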