Migrate Your Word Library to Obsidian (Complete Guide)
You've decided to move your Word document library into Obsidian. Smart move — Markdown is portable, version-controllable, future-proof, and Obsidian's graph view, backlinks, and plugin ecosystem are unbeatable for personal knowledge management. But the conversion itself is the easy part. The real work is structuring the resulting Markdown so Obsidian treats it like a native vault: proper frontmatter for metadata, sensible folder hierarchy, tags that map to your existing categories, and links restored where Word had them. Here's the complete walkthrough.
Step 1: Audit your Word library
Before converting anything, understand what you have. Walk your Word folder and inventory:
- How many .docx (and .doc) files total?
- What's the rough average size? (Affects conversion strategy.)
- Are documents grouped by folder, or all in one place?
- Are there embedded images? Tables? Cross-references between docs?
- Any consistent naming convention (project codes, date prefixes)?
This determines whether you do a one-shot bulk conversion or a progressive migration document-by-document.
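The counting part of this inventory can be scripted. Below is a minimal sketch in Python, assuming the whole library sits under one root folder. The function name `audit_library` and the report keys are illustrative choices, not part of any tool mentioned in this guide. It only looks at file names and sizes, so embedded images and cross-references still need a manual look.

```python
from collections import Counter
from pathlib import Path


def audit_library(root):
    """Count Word files under root and summarise size and folder spread."""
    root = Path(root)
    files = [p for p in root.rglob("*")
             if p.is_file() and p.suffix.lower() in {".docx", ".doc"}
             and not p.name.startswith("~$")]  # skip Word's temporary lock files
    sizes = [p.stat().st_size for p in files]
    # Folder paths are relative to root; "." means the root itself
    per_folder = Counter(str(p.parent.relative_to(root)) for p in files)
    return {
        "total": len(files),
        "docx": sum(p.suffix.lower() == ".docx" for p in files),
        "doc": sum(p.suffix.lower() == ".doc" for p in files),
        "avg_kb": round(sum(sizes) / len(sizes) / 1024, 1) if sizes else 0.0,
        "per_folder": dict(per_folder),
    }
```

Calling `audit_library("/path/to/word-library")` returns a dictionary you can print or compare against the 50-document threshold used in Step 2. If `per_folder` has a single entry, everything sits in one place and tagging (Step 5) will need more manual work.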
Step 2: Choose your conversion path
Two reasonable paths depending on volume:
Path A: Web tool, document by document (small libraries)
For up to ~50 documents, the MDisBetter Word to Markdown converter is the fastest setup-free path. Drop one .docx, click Convert, download the .md, drop it in your vault, repeat. About 30 seconds per file. This is also the right path if you want to triage as you go — review each doc, decide whether it belongs in the vault, optionally rewrite the title or split it.
Path B: Pandoc batch (larger libraries)
For 50+ documents, install Pandoc and bulk-convert. From your Word folder:
cd /path/to/word-library
mkdir -p ~/ObsidianVault/imported
for f in *.docx; do
  base="${f%.docx}"
  # One media folder per document: Word names embedded images image1.png,
  # image2.png, ... in every file, so a shared folder would overwrite them
  pandoc -f docx -t gfm \
    --extract-media="$HOME/ObsidianVault/imported/media/$base" \
    "$f" -o "$HOME/ObsidianVault/imported/$base.md"
  echo "Imported: $base"
done
This puts all the converted .md files into ~/ObsidianVault/imported/, with each document's images in its own subfolder under media/. For more on bulk conversion strategy, see convert multiple Word documents to Markdown.
Step 3: Set up your Obsidian vault structure
Don't dump 500 markdown files into the vault root. Decide on a structure first. A common pattern:
MyVault/
├── 00 Inbox/ (where new imports land)
├── 10 Projects/ (active work)
├── 20 Areas/ (ongoing responsibilities)
├── 30 Resources/ (reference material — most of your old Word docs probably live here)
├── 40 Archive/ (completed/inactive)
├── attachments/ (images, PDFs, etc.)
└── templates/
This is loosely the PARA method (Projects, Areas, Resources, Archive). Adapt to your taste. The key is having a clear destination for the imported docs rather than letting them land randomly.
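If you would rather create the folders from a script than by hand, a small sketch follows. It assumes the folder names shown in the layout above; `create_vault_skeleton` and `VAULT_FOLDERS` are names made up for this example.

```python
from pathlib import Path

# Folder names copied from the layout above; adapt them to your own scheme
VAULT_FOLDERS = ["00 Inbox", "10 Projects", "20 Areas", "30 Resources",
                 "40 Archive", "attachments", "templates"]


def create_vault_skeleton(vault_root):
    """Create the top-level folders if they don't exist; return their paths."""
    vault_root = Path(vault_root)
    created = []
    for name in VAULT_FOLDERS:
        folder = vault_root / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(folder)
    return created
```

Because `exist_ok=True` is set, running it against an existing vault leaves current folders and their contents untouched.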
Step 4: Add YAML frontmatter for metadata
Obsidian uses YAML frontmatter at the top of each .md for tags, dates, custom properties, and aliases. Word didn't have an equivalent — file metadata sat in the file system or in Word's Properties dialog. To make your imported docs feel native, add frontmatter:
---
title: "Q3 2024 Strategic Plan"
date: 2024-09-15
tags: [strategy, planning, q3-2024]
aliases: ["Q3 Plan", "2024 Q3 Strategy"]
source: word
imported: 2026-05-10
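---
You can do this manually for important docs, or batch-script it. A simple script that pulls the first line (the H1, if the document starts with one) as the title and stamps the .md file's modification date, which for a fresh import is the conversion date: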
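for f in ~/ObsidianVault/imported/*.md; do
  # Skip files that already start with frontmatter, so re-running is safe
  [ "$(head -n 1 "$f")" = "---" ] && continue
  # First line minus any leading "# "; escape double quotes for YAML
  title=$(head -n 1 "$f" | sed 's/^# //' | sed 's/"/\\"/g')
  # Modification date: try GNU stat (Linux) first, then BSD stat (macOS)
  date=$(stat -c '%y' "$f" 2>/dev/null | cut -d' ' -f1)
  [ -n "$date" ] || date=$(stat -f '%Sm' -t '%Y-%m-%d' "$f")
  cat > "$f.tmp" <<EOF
---
title: "$title"
date: $date
tags: [imported, word]
source: word
---

EOF
  cat "$f" >> "$f.tmp"
  mv "$f.tmp" "$f"
done
This adds a minimal frontmatter block to every imported file that doesn't already have one. Refine the tags as you go.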
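The date stamped this way comes from the converted .md file. The values shown in Word's Properties dialog live inside the .docx itself, which is a zip archive containing a `docProps/core.xml` part. If you want the original title and creation date in your frontmatter, here is a sketch of how to read them. The function name `read_docx_properties` is hypothetical, and the code assumes a standard .docx; files produced by other tools may lack some or all of these fields.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by docProps/core.xml inside a .docx package
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_docx_properties(docx_path):
    """Return title/created/modified from a .docx, with None for missing fields."""
    with zipfile.ZipFile(docx_path) as z:
        try:
            root = ET.fromstring(z.read("docProps/core.xml"))
        except KeyError:  # the archive has no core properties part
            return {"title": None, "created": None, "modified": None}

    def text(tag):
        node = root.find(tag, NS)
        return node.text.strip() if node is not None and node.text else None

    created, modified = text("dcterms:created"), text("dcterms:modified")
    return {
        "title": text("dc:title"),
        # Timestamps look like 2024-09-15T10:30:00Z; keep only the date part
        "created": created[:10] if created else None,
        "modified": modified[:10] if modified else None,
    }
```

The returned `created` value can go straight into the `date:` field. Many Word documents have an empty title property, so falling back to the H1 or the file name is still worth keeping.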
Step 5: Convert folders to tags (or keep folders)
Word libraries are usually folder-organised. Obsidian supports both folders and tags, and they serve overlapping purposes. Two strategies:
- Mirror folders: Recreate your Word folder structure inside Obsidian. Simplest. Familiar mental model.
- Flatten + tag: Put everything in 30 Resources/ and use frontmatter tags to encode the original folder. Better for serendipitous discovery via tag pages and graph view.
If you flatten, tag automatically based on the original folder path:
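# Run from the original Word library root
find . -name '*.docx' | while IFS= read -r f; do
  # Folder path becomes the tag: "./Clients/Acme Corp" -> "clients-acme-corp"
  # (spaces are replaced too, because Obsidian tags cannot contain them)
  folder=$(dirname "$f" | sed -e 's|^\./||' -e 's|[/ ]|-|g' | tr 'A-Z' 'a-z')
  # Files in the library root have no folder; "unsorted" is a placeholder tag
  [ "$folder" = "." ] && folder="unsorted"
  base=$(basename "$f" .docx)
  pandoc -f docx -t gfm "$f" -o "/tmp/$base.md"
  # Prepend frontmatter with folder as tag
  printf -- '---\ntitle: "%s"\ntags: [imported, %s]\n---\n\n' "$base" "$folder" \
    | cat - "/tmp/$base.md" > "$HOME/ObsidianVault/30 Resources/$base.md"
done
Step 6: Restore internal links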
Word documents that referenced each other ("see the Q2 Plan document") become orphaned text after conversion. Two ways to fix:
Manual approach
For a small library, walk through each imported doc, find references to other docs, and replace them with Obsidian wikilinks: [[Q2 2024 Strategic Plan]]. Obsidian autocompletes these as you type — fast for someone who knows the corpus.
Scripted approach
For a larger library, find references to known doc titles and rewrite them. A Python sketch:
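import re
from pathlib import Path

vault = Path.home() / 'ObsidianVault' / '30 Resources'
md_files = list(vault.glob('*.md'))
# Map lowercase title -> real title; longest first so 'Q2 Plan Draft' beats 'Q2 Plan'
titles = {f.stem.lower(): f.stem for f in md_files}
ordered = sorted(titles.values(), key=len, reverse=True)

for f in md_files:
    others = [re.escape(t) for t in ordered if t != f.stem]  # never self-link
    if not others:
        continue
    # Replace 'Q2 Plan' with '[[Q2 Plan]]' (case-insensitive, word boundary).
    # One pass per file: existing [[...]] links match first and are left untouched
    pattern = re.compile(r'\[\[.*?\]\]|\b(?:' + '|'.join(others) + r')\b', re.IGNORECASE)
    text = f.read_text(encoding='utf-8')
    text = pattern.sub(
        lambda m: m.group(0) if m.group(0).startswith('[[')
        else f'[[{titles.get(m.group(0).lower(), m.group(0))}]]', text)
    f.write_text(text, encoding='utf-8')
Be careful: title matching like this can still create false-positive links (e.g., if a doc is titled "Plan" and the word "plan" appears 50 times in another doc), and it also rewrites matches inside frontmatter. Test on a small subset first.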
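One way to test before rewriting anything is a dry run that only counts how often each title would match. The sketch below assumes the same flat folder of .md files. `count_candidate_links` and its `min_length` cutoff are illustrative names and defaults, not an established convention.

```python
import re
from collections import Counter
from pathlib import Path


def count_candidate_links(vault, min_length=4):
    """Count whole-word, case-insensitive title matches without editing files."""
    md_files = sorted(Path(vault).glob("*.md"))
    counts = Counter()
    for f in md_files:
        text = f.read_text(encoding="utf-8")
        for other in md_files:
            title = other.stem
            if other == f or len(title) < min_length:
                continue  # skip self-references and very short titles
            hits = re.findall(rf"\b{re.escape(title)}\b", text, flags=re.IGNORECASE)
            if hits:
                counts[title] += len(hits)
    return counts
```

Running `count_candidate_links(Path.home() / "ObsidianVault" / "30 Resources").most_common(20)` lists the titles that would be linked most often. A generic title with a very high count is a likely source of false positives and a candidate for manual linking instead.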
Step 7: Handle images
If you used Pandoc with --extract-media, images are in a separate folder. Move them into your vault's attachments/ folder and update the references:
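mv ~/ObsidianVault/imported/media/* ~/ObsidianVault/attachments/
# Rewrite paths in all imported .md files.
# Pandoc writes the --extract-media path into each image link as given; the
# Step 2 loop passed an absolute path, so match that whole prefix (open one
# converted file to confirm what the links look like).
# "sed -i ''" is the macOS/BSD form; on Linux (GNU sed) use plain "sed -i".
find ~/ObsidianVault/imported -name '*.md' -exec sed -i '' \
  "s|$HOME/ObsidianVault/imported/media/|../attachments/|g" {} +
Obsidian also supports embedded images via wikilinks: ![[image.png]]. If you prefer this syntax over Markdown image syntax, do a second find-replace pass.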
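That second pass could look like the sketch below. It assumes images were written in standard Markdown syntax, `![alt](path)`. Pandoc may emit HTML `<img>` tags for images with explicit sizes, and those are not handled here. The function name `to_wikilink_embeds` is made up for this example.

```python
import re
from pathlib import PurePosixPath

# Matches Markdown images such as ![alt text](../attachments/media/image1.png)
MD_IMAGE = re.compile(r'!\[[^\]]*\]\(([^)\s]+)(?:\s+"[^"]*")?\)')


def to_wikilink_embeds(markdown):
    """Rewrite local ![alt](path/name.png) images as ![[name.png]]."""
    def replace(match):
        target = match.group(1)
        if re.match(r"[a-z][a-z0-9+.-]*://", target, flags=re.IGNORECASE):
            return match.group(0)  # leave remote images (http://, https://) alone
        # Keep only the file name; the alt text is dropped
        return f"![[{PurePosixPath(target).name}]]"
    return MD_IMAGE.sub(replace, markdown)
```

Since the embed keeps only the file name, this works cleanly when image names are unique across the vault. Images extracted from Word are often called image1.png in every document, so rename them first or the embed may resolve to the wrong file.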
Step 8: Set up Obsidian for the imports
In Obsidian Settings:
- Files & Links → Default location for new attachments: set to the attachments folder
- Files & Links → Use [[Wikilinks]]: enable for native Obsidian feel
- Core Plugins → Templates: enable, point at your templates/ folder
- Install community plugins: Dataview (query frontmatter), Tag Wrangler (rename tags), Templater (smarter templates)
Step 9: Verify
Open Obsidian, navigate to a few random imported docs. Check:
- Headings render correctly (H1 down to H4)
- Lists are properly nested
- Tables display
- Images load
- Frontmatter shows in the Properties pane
- Tags appear in the right sidebar
- Graph view shows the imports as nodes
For docs that look broken, re-convert that specific .docx through the web tool — sometimes the web tool handles a doc that Pandoc choked on, and vice versa. Or read the formatting preservation test for known edge cases.
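A few of these checks can be automated before you open Obsidian, though rendering itself still needs a human eye. The sketch below flags notes with no frontmatter, no headings, or image links that point at missing files. `check_note` is a hypothetical helper; it assumes Markdown-style image links with relative paths that are not URL-encoded.

```python
import re
from pathlib import Path

# Captures the target of Markdown image links: ![alt](target)
IMAGE_LINK = re.compile(r'!\[[^\]]*\]\(([^)\s]+)')


def check_note(md_path):
    """Return a list of problems found in one imported note."""
    md_path = Path(md_path)
    text = md_path.read_text(encoding="utf-8")
    lines = text.splitlines()
    problems = []
    # Frontmatter must open on the first line and be closed by a later '---'
    has_frontmatter = (len(lines) > 1 and lines[0].strip() == "---"
                       and any(line.strip() == "---" for line in lines[1:]))
    if not has_frontmatter:
        problems.append("missing or unterminated frontmatter")
    if not re.search(r"^#{1,6} ", text, flags=re.MULTILINE):
        problems.append("no headings")
    for target in IMAGE_LINK.findall(text):
        if "://" in target:
            continue  # remote image, not checked
        if not (md_path.parent / target).exists():
            problems.append(f"missing image: {target}")
    return problems
```

Looping over the import folder and printing only notes with a non-empty result gives you a short list of documents to re-convert or fix by hand.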
Step 10: Iterate
The first import is never perfect. Plan to spend an hour reviewing the first 20-30 docs that landed, fixing edge cases, and refining your tagging. As you do, patterns emerge: a specific Word style that always converts wrong, a class of doc that needs different handling, a tag taxonomy that needs renaming. Adjust your script and re-run if needed; the conversion is idempotent, so the same .docx always produces the same Markdown. Re-running overwrites the imported .md files, though, so do it before you invest in hand edits.
Bonus: keep new Word docs out of your library
Once your vault is set up, the goal is to stop creating new Word docs. Two transition tactics:
- Use a Markdown editor (Obsidian itself, Typora, iA Writer) for new long-form work
- For collaborative documents, write in Markdown and only export to Word when sharing with non-Markdown users — Pandoc handles MD → DOCX too
What about other formats in your knowledge base?
Most knowledge bases are mixed. PDFs from research papers and reports, web articles you've saved, audio recordings of meetings. The same destination strategy works: PDF to Markdown for papers, URL to Markdown for web saves, Audio to Markdown for transcripts. All produce Obsidian-ready .md.
Plugins worth enabling for imported docs
Obsidian's plugin ecosystem has several plugins that pay off specifically for imported Word docs:
- Dataview — query frontmatter to build dynamic indexes. Build a "recently imported" page with a short query (it relies on the imported and source frontmatter fields shown in Step 4):
  TABLE imported, source
  FROM "30 Resources"
  WHERE source = "word"
  SORT imported DESC
- Tag Wrangler — rename tags across the vault with a single command. Essential when you discover your initial tagging needs renaming.
- Templater — auto-apply frontmatter templates based on folder location. Drop a file in 30 Resources/ and it auto-gets the right template.
- Advanced Tables — interactive editing for the GFM tables that came out of conversion (auto-format columns, insert rows/columns)
- Linter — clean up Markdown formatting issues across imported files (consistent heading levels, trailing whitespace, etc.)
- Image Converter — bulk-process imported images (resize, convert formats, optimise for web)
Common post-migration cleanup tasks
After importing, expect to spend a few hours on cleanup. Common tasks:
- Find and merge near-duplicate docs (different versions of the same content from your Word library)
- Split very large imports into smaller atomic notes (Obsidian works best with smaller, more focused notes)
- Add backlinks where Word docs referenced each other
- Move misplaced docs to better folders
- Add tags to make tag-based discovery work
- Delete docs that are obviously obsolete
This cleanup is the difference between a vault that feels like a real second brain and one that feels like a junk drawer of converted files.
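The near-duplicate hunt is the easiest of these tasks to script. The sketch below compares every pair of notes in one folder with Python's standard `difflib`. The function name and the 0.9 threshold are assumptions to tune, and the pairwise comparison is quadratic, so run it folder by folder rather than across a very large vault.

```python
from difflib import SequenceMatcher
from itertools import combinations
from pathlib import Path


def find_near_duplicates(folder, threshold=0.9):
    """Return (name_a, name_b, ratio) for note pairs at or above threshold."""
    notes = {p.name: p.read_text(encoding="utf-8")
             for p in sorted(Path(folder).glob("*.md"))}
    pairs = []
    for (name_a, text_a), (name_b, text_b) in combinations(notes.items(), 2):
        # ratio() is 1.0 for identical texts and approaches 0.0 for unrelated ones
        ratio = SequenceMatcher(None, text_a, text_b, autojunk=False).ratio()
        if ratio >= threshold:
            pairs.append((name_a, name_b, round(ratio, 3)))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)
```

Each returned pair is a candidate for merging, not a verdict: two versions of the same report will score high, but so can two notes built from the same template.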
Sync and backup
Once your vault has hundreds of imported notes, set up sync and backup:
- Obsidian Sync — official paid sync, encrypted, simple
- iCloud / Dropbox / OneDrive — point your vault folder at a synced folder; works for most setups
- Git — version-controlled vault, free, perfect for tech-friendly users (use the Obsidian Git plugin to auto-commit)
Whatever you choose, set it up before you finish the migration. Losing 500 hand-tagged notes to a hardware failure is preventable misery.
Recommendation
For libraries under 50 docs, do it in one sitting via the web tool, hand-tag as you go. For 50+, install Pandoc, bulk-convert, then spend 1-2 hours on frontmatter, tags, and link restoration. Either way, structure the vault first and import second — the worst migration is the one where you dump 500 .md files into the vault root and try to organise after the fact. See also the 8-tool benchmark for choosing between converters and word tables guide for table-heavy docs.