11 min read · MDisBetter

Word Formatting Preservation: Accuracy Test Across 5 Converters

Most Word-to-Markdown benchmarks test many tools across a variety of documents. This one inverts the approach: one carefully constructed test document, scored across 5 converters, feature by feature. The document includes every Word feature that commonly fails in conversion: H1-H4 headings, bulleted and numbered lists with deep nesting, tables (simple and merged-cell), images with captions, code blocks with syntax highlighting, footnotes, in-text citations, equations, and cross-references. Per-feature scoring shows exactly where each converter wins and loses, which is more useful than aggregate scores when you are picking a tool for your specific feature needs.

The test document

A 24-page synthetic document constructed to exercise every common Word feature listed above: four heading levels, deeply nested lists, simple and merged-cell tables, captioned images, code blocks, footnotes, citations, equations, and cross-references.

Tools tested

  1. Pandoc 3.5
  2. MDisBetter (web tool, May 2026 build)
  3. Word2MD.net (paid tier with AI alt text)
  4. Mammoth.js 1.8
  5. Word native HTML export → html2md (the "do it yourself" baseline)

Scoring

Each feature is scored 0-5: 5 = preserved exactly; 3 = preserved with minor degradation; 1 = preserved with significant loss; 0 = dropped or broken.
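The arithmetic behind the headline figures is simple; a minimal sketch (the helper names are ours, not any tool's API):

```python
# Minimal sketch of the scoring arithmetic used in this benchmark.
RUBRIC = {5: "preserved exactly", 3: "minor degradation",
          1: "significant loss", 0: "dropped or broken"}

def pct(total, max_total=130):
    """Convert a per-tool column total (26 features x 5 points = 130) to a rounded percent."""
    return round(100 * total / max_total)

print(pct(110))  # Pandoc's 110/130 -> 85
```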

Per-feature results

| Feature | Pandoc | MDisBetter | Word2MD | Mammoth | Word HTML |
| --- | --- | --- | --- | --- | --- |
| H1 headings | 5 | 5 | 5 | 5 | 3 |
| H2 headings | 5 | 5 | 5 | 5 | 3 |
| H3 headings | 5 | 5 | 5 | 4 | 2 |
| H4 headings | 5 | 4 | 4 | 3 | 2 |
| Bold / italic / underline | 5 | 5 | 5 | 5 | 4 |
| Bulleted list (1 level) | 5 | 5 | 5 | 5 | 3 |
| Bulleted list (3-level nesting) | 5 | 4 | 4 | 4 | 2 |
| Numbered list (1 level) | 5 | 5 | 5 | 5 | 3 |
| Numbered list (3-level nesting) | 5 | 4 | 4 | 3 | 2 |
| Mixed list (numbered + bulleted) | 4 | 3 | 3 | 3 | 1 |
| Simple table | 5 | 5 | 5 | 5 | 3 |
| Table with merged horizontal cells | 3 | 3 | 4 | 2 | 2 |
| Table with multi-row headers + merged vertical | 2 | 2 | 4 | 1 | 2 |
| Inline images | 5 | 5 | 5 | 5 | 3 |
| Image captions | 4 | 3 | 4 | 2 | 2 |
| Side-by-side images | 2 | 2 | 3 | 2 | 2 |
| Image alt text (AI-generated) | 0 | 0 | 5 | 0 | 0 |
| Code blocks (with language) | 5 | 4 | 3 | 2 | 1 |
| Code blocks (without language) | 5 | 5 | 5 | 4 | 2 |
| Footnotes (as GFM footnotes) | 5 | 2 | 2 | 1 | 0 |
| In-text citations | 5 | 1 | 2 | 1 | 1 |
| Cross-references | 3 | 2 | 2 | 2 | 1 |
| Inline equations | 5 | 3 | 3 | 3 | 2 |
| Display equations | 5 | 3 | 3 | 2 | 2 |
| Bibliography section | 5 | 3 | 3 | 3 | 3 |
| Table of contents | 4 | 3 | 3 | 3 | 3 |
| Total /130 | 110 | 91 | 97 | 79 | 56 |

Pandoc — 110/130 (85%)

Wins or ties on 19 of 26 features. Dominates on footnotes, citations, equations, and bibliography — the academic/legal/technical features. The only feature where it scores 0 is AI image alt text, because Pandoc has no AI integration.

Where Pandoc scores below 5: complex tables (no GFM equivalent for merged vertical cells), side-by-side images (Markdown doesn't support image floating), AI alt text (not a Pandoc concern). On every other feature, Pandoc is best-in-class.
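For reference, the kind of Pandoc invocation this conversion implies can be sketched as follows. The wrapper function and file names are illustrative; the flags themselves are standard Pandoc options:

```python
# Sketch of a typical Pandoc invocation for docx -> GFM conversion.
def pandoc_cmd(src, dst, media_dir="media"):
    return [
        "pandoc", src,
        "-f", "docx",                    # read Word OOXML
        "-t", "gfm",                     # write GitHub-Flavored Markdown (keeps footnote syntax)
        f"--extract-media={media_dir}",  # write embedded images out to media_dir/
        "--wrap=none",                   # no hard wrapping: diff-friendly source
        "-o", dst,
    ]

# e.g. subprocess.run(pandoc_cmd("report.docx", "report.md"), check=True)
print(" ".join(pandoc_cmd("report.docx", "report.md")))
```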

Word2MD — 97/130 (75%)

Strong all-around with one specific advantage: AI image alt text scores 5 where every other tool scores 0. Also wins on complex tables thanks to the HTML <table> fallback that preserves rowspan/colspan. Loses to Pandoc on footnotes, citations, equations.

For image-heavy AI workflows, Word2MD's AI alt text is uniquely valuable — adds 5 points that no other tool provides.

MDisBetter — 91/130 (70%)

Strong on the basics: heading preservation, list nesting, simple tables, code blocks, image extraction. Loses ground on the academic features: footnote handling is inline (loses GFM footnote structure), citations are plain text (no Word bibliography integration), equations are preserved as images rather than LaTeX.

Where MDisBetter wins: the basics are clean, output is portable, no install or signup required, multi-format breadth across Word + PDF + URL + audio.

Mammoth.js — 79/130 (61%)

Solid for a free JavaScript library. Clean semantic output (paragraphs, lists, simple tables, headings). Weak on academic features and complex tables. Best used as a building block in custom tooling, not as an end-user tool.

Word HTML export → html2md — 56/130 (43%)

The "do it yourself" baseline. Word's native HTML export is famously bloated, with inline styles, MSO-specific tags, and font declarations everywhere. Even after running through html2md, formatting is lost: heading levels degrade, lists flatten, tables become bloated.

This path is only worth using if no other tool is available. Even MDisBetter's free tier is dramatically better.

Per-category subtotals

| Category | Pandoc | MDisBetter | Word2MD | Mammoth | Word HTML |
| --- | --- | --- | --- | --- | --- |
| Headings (4 features) | 20 | 19 | 19 | 17 | 10 |
| Lists (5 features) | 24 | 21 | 21 | 20 | 11 |
| Tables (3 features) | 10 | 10 | 13 | 8 | 7 |
| Images (4 features) | 11 | 10 | 17 | 9 | 7 |
| Code (2 features) | 10 | 9 | 8 | 6 | 3 |
| Academic (5 features) | 23 | 11 | 12 | 10 | 7 |
| Document structure (3 features) | 12 | 11 | 7 | 9 | 11 |

Picking by feature you care about

If headings + lists + simple tables matter most

Almost every tool handles these well except the Word HTML path. Pick on convenience: MDisBetter for zero install, Pandoc for CLI/batch.

If footnotes, citations, equations matter most

Pandoc, no contest. Other tools degrade significantly on academic features.

If image alt text matters most

Word2MD's AI alt text is the differentiator. Hyperleap AI also offers this (not tested in this 5-tool comparison).

If complex tables matter most

Word2MD's HTML <table> fallback preserves merged cells. Less portable but most accurate. For Pandoc and others, plan to manually fix complex tables.
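To see what the HTML fallback buys you, here is a minimal sketch (table contents invented by us) of a merged-cell table expressed as inline HTML, which GFM pipe tables cannot represent:

```python
# Sketch: the HTML fallback for merged cells that GFM pipe tables cannot express.
# "Region" spans both header rows (rowspan), "Sales" spans two columns (colspan).
def merged_header_table():
    return (
        '<table>\n'
        '  <tr><th rowspan="2">Region</th><th colspan="2">Sales</th></tr>\n'
        '  <tr><th>Q1</th><th>Q2</th></tr>\n'
        '  <tr><td>EMEA</td><td>120</td><td>140</td></tr>\n'
        '</table>'
    )

# Embed the string directly in the .md file; renderers that pass HTML
# through (GitHub, most static site generators) show the merged cells.
print(merged_header_table())
```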

If code blocks matter most

Pandoc preserves language hints reliably; MDisBetter is a close second. For docs with lots of language-tagged code, use Pandoc.

What survives the worst across the board

Three features that no tool handles well in current Markdown:

  1. Side-by-side images — Markdown has no image floating syntax. All tools flatten to sequential.
  2. Cross-references — Word's "see Section 3.2" feature loses its dynamic link in Markdown. Most tools convert to plain text.
  3. Multi-row headers + merged vertical cells — GFM doesn't support these. Even the HTML fallback (Word2MD) only partially helps because not every Markdown renderer respects rowspan.

For docs heavy in any of these features, plan for manual cleanup regardless of which tool you use.

What changed since 2024

The biggest shift is AI alt text becoming a real feature. In 2024 no tool offered it; today Word2MD and Hyperleap have it shipping. Expect every major paid tool to ship AI alt text by end of 2026, so treat it as a shrinking differentiator in any comparison.

Pandoc has continued steady improvement on edge cases (better handling of Word's legacy WordArt, better OOXML conformance) but the broad ranking has been stable for 5+ years.

How to test on your own documents

The honest answer: this benchmark uses one synthetic doc. Your docs are different. To replicate:

  1. Pick one of your real, representative Word docs — ideally one that includes the features you actually care about
  2. Convert it through 2-3 candidate tools (don't bother with the Word HTML path)
  3. Open the .md outputs side-by-side with the Word original
  4. Score each on the features that matter to your workflow (skip features you don't use)
  5. Pick the tool that wins on your weighted score
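Steps 4 and 5 can be sketched as a weighted score; all feature names, weights, and score profiles below are illustrative, not measured values:

```python
# Sketch: weight each feature by how much your workflow cares about it,
# then pick the tool with the best weighted score.
def weighted_score(scores, weights):
    """scores and weights are dicts keyed by feature name; unweighted features count 0."""
    return sum(scores[f] * w for f, w in weights.items())

weights = {"footnotes": 3, "tables": 2, "headings": 1}  # your priorities
tool_a = {"footnotes": 5, "tables": 3, "headings": 5}   # e.g. a Pandoc-like profile
tool_b = {"footnotes": 2, "tables": 4, "headings": 5}   # e.g. a web-tool profile

print(weighted_score(tool_a, weights))  # 5*3 + 3*2 + 5*1 = 26
print(weighted_score(tool_b, weights))  # 2*3 + 4*2 + 5*1 = 19
```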

Within 30 minutes you have data on your own corpus. Generic benchmarks (this one included) point in a direction; your own corpus gives you the answer for your use case.

What about other source formats?

Most workflows mix Word with PDF and URL. Same logic applies: pick the right converter per format. See best free PDF to Markdown converters for PDF and URL to Markdown ranked review for web. Cross-format pipelines often end up with a mix: Pandoc for Word, marker for PDF, Trafilatura for URL — or one platform like MDisBetter for all four formats.

Output cleanliness comparison

Beyond raw feature preservation, the cleanliness of the resulting Markdown source matters for ongoing editing. Score each tool on source readability:

| Tool | Line wrapping | Trailing whitespace | Empty paragraphs | Stray HTML | Cleanliness /20 |
| --- | --- | --- | --- | --- | --- |
| Pandoc | Configurable | None | Rare | Minimal | 17 |
| MDisBetter | Configurable | None | Rare | Minimal | 16 |
| Word2MD | Default wrap | None | Occasional | HTML for complex tables | 14 |
| Mammoth.js | Default wrap | Some | Common | None | 13 |
| Word HTML path | Erratic | Common | Common | Heavy MSO HTML | 5 |

Pandoc edges ahead on cleanliness because --wrap=none produces diff-friendly source. The web tools default to wrapping at standard widths, which is fine for reading but produces noisy diffs in Git.

Performance comparison

Conversion speed isn't usually a concern for one-off use, but matters for batch jobs. Approximate speeds for our 24-page test document:

| Tool | Time per doc | Throughput per minute |
| --- | --- | --- |
| Pandoc (CLI) | 0.4-0.8 sec | ~100 docs |
| Mammoth.js (in-process) | 0.2-0.5 sec | ~150 docs |
| MDisBetter (web upload) | 3-6 sec | ~12 docs |
| Word2MD (web upload) | 4-8 sec | ~10 docs |

Web tools include upload + processing + download time. Local tools are dramatically faster but require setup. For batch processing speed, Pandoc and Mammoth dominate.
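The throughput column is just arithmetic on the per-doc time; a sketch using the midpoints of the measured ranges:

```python
# Sketch: derive the throughput column from per-doc conversion time.
def docs_per_minute(seconds_per_doc):
    return round(60 / seconds_per_doc)

print(docs_per_minute(0.6))  # Pandoc range midpoint  -> 100
print(docs_per_minute(6.0))  # Word2MD range midpoint -> 10
```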

Recommendation

For complex docs (academic, legal, technical with footnotes/citations/equations): Pandoc. For image-heavy docs going into AI workflows: Word2MD. For convenience and multi-format breadth: MDisBetter. For developer integration: Mammoth.js. Skip Word's native HTML export. See also 8-tool benchmark, 2026 ranked review, and Word tables to Markdown guide.

Frequently asked questions

Why do all tools score the same low number on side-by-side images?
Markdown has no syntax for image positioning — there's no float left/right or column layout. Every tool has to flatten side-by-side images to sequential ones, which is technically correct (the image content is preserved) but visually different from the original. To preserve side-by-side layout, you'd need to use raw HTML in the Markdown, which is the standard escape hatch for layout features GFM doesn't support natively.
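If you do need the side-by-side layout back, the usual escape hatch can be sketched like this; the helper function, file names, and width choice are ours, not any converter's output:

```python
# Sketch: raw-HTML escape hatch for side-by-side images in Markdown.
# Two images at 49% width sit next to each other in most HTML-passing renderers.
def side_by_side(img_a, img_b, alt_a="", alt_b=""):
    return (
        '<p>\n'
        f'  <img src="{img_a}" alt="{alt_a}" width="49%">\n'
        f'  <img src="{img_b}" alt="{alt_b}" width="49%">\n'
        '</p>'
    )

print(side_by_side("before.png", "after.png", "Before", "After"))
```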
How is the 'AI image alt text' score relevant if I'm not using AI?
It's only relevant if your downstream workflow includes LLMs (RAG, document Q&A, ChatGPT-as-reviewer) or accessibility requirements (screen readers). For simple human reading or static doc sites, the original Word alt text (often empty or generic) is fine. The score line shows the differentiator for the use case where it matters; ignore it for the use case where it doesn't.
Can I see the actual converted output for the test document?
We don't publish the converted .md files because the test document includes synthetic-but-realistic content (fake citations, fake company names) that could be misleading if extracted. The methodology — feature list, scoring rubric, tool versions — is fully reproducible. Build a similar test doc with your real-world feature mix and run the same comparison; that's the more useful data anyway since it reflects your actual content.