How image extraction works
A .docx file is a ZIP archive. Images live in word/media/ as image1.png, image2.jpeg, and so on. We extract them, derive cleaner names where possible (from the image's alt text or the surrounding heading), and emit Markdown image references like ![alt text](images/filename.png). The result is a Markdown file plus an images/ folder, ready to drop into a static-site generator, an Obsidian vault, or a GitHub repo.
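To make the extraction step concrete, here is a minimal sketch using only Python's standard zipfile module. It is an illustration, not the converter's actual code: the function name extract_media and the output folder argument are hypothetical, and the sketch keeps Word's original file names instead of deriving cleaner ones.

```python
import zipfile
from pathlib import Path, PurePosixPath

MEDIA_PREFIX = "word/media/"  # where Word stores embedded images inside the archive


def extract_media(docx, out_dir):
    """Copy every file under word/media/ into out_dir without re-encoding.

    `docx` may be a path or a binary file-like object.
    Returns the sorted list of file names written.
    (Hypothetical helper for illustration only.)
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(docx) as zf:
        for name in zf.namelist():
            # Skip everything outside word/media/ and any directory entries.
            if not name.startswith(MEDIA_PREFIX) or name.endswith("/"):
                continue
            # Use only the final path component so nothing is written outside out_dir.
            target = out / PurePosixPath(name).name
            target.write_bytes(zf.read(name))  # raw bytes, so PNG stays PNG
            written.append(target.name)
    return sorted(written)
```

Calling extract_media("report.docx", "images") would fill an images/ folder next to the Markdown output; "report.docx" is a placeholder file name.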
What's preserved about each image
Alt text (Word's "Alternative Text" property → Markdown alt): preserved verbatim, critical for accessibility and AI ingestion.
Caption (Word's caption paragraph below the image): kept as a paragraph immediately after the image reference.
Position relative to text: the image appears in the Markdown at the point in the flow where it was anchored.
Image format: kept as-is (PNG stays PNG, JPEG stays JPEG), with no re-encoding and no quality loss.
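As a rough illustration of how alt text and position can carry over, the sketch below reads word/document.xml and word/_rels/document.xml.rels with Python's standard xml.etree.ElementTree. It assumes the usual Office Open XML layout, where alt text sits in the descr attribute of wp:docPr and the image is referenced through the r:embed attribute of a:blip. The function name image_references and the image_dir parameter are hypothetical, captions are not handled, and alt text containing "]" is not escaped.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

NS = {
    "wp": "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}
R_EMBED = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}embed"
REL_TAG = "{http://schemas.openxmlformats.org/package/2006/relationships}Relationship"
# Inline and floating (anchored) pictures use different wrapper elements.
DRAWING_TAGS = {f"{{{NS['wp']}}}inline", f"{{{NS['wp']}}}anchor"}


def image_references(document_xml, rels_xml, image_dir="images"):
    """Return Markdown image references in document order.

    Hypothetical helper: keeps Word's original file names and copies
    the alt text verbatim into the Markdown alt slot.
    """
    # Map relationship ids (e.g. "rId5") to targets (e.g. "media/image1.png").
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in ET.fromstring(rels_xml).iter(REL_TAG)
    }
    refs = []
    # iter() walks the tree in document order, which preserves image position.
    for el in ET.fromstring(document_xml).iter():
        if el.tag not in DRAWING_TAGS:
            continue
        blip = el.find(".//a:blip", NS)
        if blip is None:
            continue  # a drawing without a picture, e.g. a shape
        target = targets.get(blip.get(R_EMBED))
        if target is None:
            continue  # linked rather than embedded, or a dangling id
        doc_pr = el.find("wp:docPr", NS)
        alt = doc_pr.get("descr", "") if doc_pr is not None else ""
        refs.append(f"![{alt}]({image_dir}/{PurePosixPath(target).name})")
    return refs
```

A fuller converter would emit each reference at the matching point in the text flow instead of collecting them in a list; the list form just keeps the sketch short.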