May 10, 2026 · 8 min read · MDisBetter

PDF to Markdown for Obsidian — Vault Setup Guide

Obsidian was built for Markdown. PDFs in your vault are dead-end attachments — no search, no graph, no backlinks. Convert your PDF library to Markdown and the same content becomes first-class vault material: searchable, linkable, visible on the graph, and ready for Zettelkasten workflows. Here's the complete setup, from first conversion to a working knowledge base.

Why Obsidian + Markdown is the right pairing

Obsidian is a Markdown editor first and a database second. Every .md file in your vault becomes a node: indexed by full-text search, available for [[wikilinks]], visible in the graph view, taggable, queryable via Dataview. PDFs in your vault get none of this — they're stored, not understood.

Three concrete capabilities you unlock by converting:

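Search across content: press Ctrl/Cmd+Shift+F to find any phrase across your entire vault, including converted PDF content. (Ctrl/Cmd+O opens the Quick Switcher, which matches note names rather than content.) The search index treats Markdown as first-class data.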
Wikilinks: [[Smith2026 - Transformers]] in any note links to your converted Smith 2026 paper by its filename; the link then appears in that paper's backlinks panel and in the graph view.
Graph visualization: the relationships between your converted documents and your own notes become visible — a real Zettelkasten emerges from what was just a folder of PDFs.

Converting your PDF library

Single document

For a single PDF: drop it into our PDF to Markdown for Obsidian converter, download the .md, save it into your vault. Two minutes start to finish.

Many PDFs at once

For batch ingestion (e.g., a literature review with 200 papers), MDisBetter is a web tool today and doesn't yet ship a public API or CLI, so the right path for true batch conversion is local open-source software that runs on your machine. Marker or Docling, run in a Python loop, can handle hundreds of PDFs offline; the full step-by-step is in batch convert 100+ PDFs to Markdown. The output is a folder of .md files ready to drop into your vault.
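As a rough illustration of that batch loop, here is a minimal sketch. The `convert` argument is a hypothetical callable (PDF path in, Markdown text out) that you would back with Marker or Docling; neither library's actual API is shown here, so check its documentation for the real call.

```python
from pathlib import Path
from typing import Callable


def convert_folder(
    pdf_dir: Path,
    out_dir: Path,
    convert: Callable[[Path], str],
) -> tuple[list[Path], list[tuple[Path, str]]]:
    """Convert every PDF in pdf_dir to a .md file in out_dir.

    `convert` is a placeholder for your converter of choice
    (assumed signature: PDF path in, Markdown text out).
    Returns (written_files, failures).
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written, failures = [], []
    for pdf in sorted(pdf_dir.glob("*.pdf")):
        target = out_dir / (pdf.stem + ".md")
        if target.exists():
            continue  # already converted on an earlier run
        try:
            # convert() runs before the file is opened, so a failed
            # conversion does not leave an empty note behind.
            target.write_text(convert(pdf), encoding="utf-8")
            written.append(target)
        except Exception as exc:  # one bad PDF should not stop the batch
            failures.append((pdf, str(exc)))
    return written, failures
```

Skipping files that already have a `.md` counterpart makes the loop safe to rerun after a crash, and collecting failures instead of raising lets a 200-paper run finish even when a few scans are unreadable.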

Continuous ingestion

If you keep adding new PDFs (research papers from arXiv, downloaded reports), pair an OSS converter with a folder watcher. A short Python script using watchdog + Marker (or Docling) watches an "Inbox" folder and emits the converted Markdown into "Sources" the moment a new PDF lands. Pair with Dropbox or iCloud sync and you can drop PDFs from anywhere — it's a one-time setup of about thirty lines, and it runs locally so nothing leaves your machine.
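The watcher idea can be sketched as follows. To keep the example dependency-free it polls the folder with the standard library instead of using watchdog events, which is a deliberate substitution rather than the setup described above. `convert` is again a hypothetical callable standing in for Marker or Docling, and the folder names are assumptions. A PDF is only converted once its size is unchanged between two polls, so a file that a sync client is still writing is not picked up half-finished.

```python
import time
from pathlib import Path
from typing import Callable


def poll_inbox(
    inbox: Path,
    sources: Path,
    convert: Callable[[Path], str],
    sizes: dict[str, int],
    done: set[str],
) -> list[Path]:
    """Convert PDFs whose size was stable since the previous poll."""
    sources.mkdir(parents=True, exist_ok=True)
    created = []
    for pdf in sorted(inbox.glob("*.pdf")):
        if pdf.name in done:
            continue
        size = pdf.stat().st_size
        if sizes.get(pdf.name) != size:
            # First sighting, or still being written: check again next poll.
            sizes[pdf.name] = size
            continue
        note = sources / (pdf.stem + ".md")
        note.write_text(convert(pdf), encoding="utf-8")
        done.add(pdf.name)
        created.append(note)
    return created


def watch(
    inbox: Path,
    sources: Path,
    convert: Callable[[Path], str],
    interval: float = 5.0,
) -> None:
    """Poll the inbox until interrupted (Ctrl+C)."""
    sizes: dict[str, int] = {}
    done: set[str] = set()
    while True:
        poll_inbox(inbox, sources, convert, sizes, done)
        time.sleep(interval)
```

You would start it with something like `watch(Path("Vault/Inbox"), Path("Vault/Sources"), my_converter)`, where `my_converter` is your own wrapper around the converter. An event-based watcher reacts faster, but a polling loop is easier to reason about when files arrive through a sync client.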

Vault organization patterns

Two main schools of thought:

Topic-based folders

Vault/
  Topics/
    Machine Learning/
      Smith2026 - Transformers.md
    Statistics/
      Wasserman2024 - All of Statistics.md
  Permanent Notes/
  Daily Notes/

Pros: easy to browse by subject. Cons: many papers belong to multiple topics — you have to pick one.

Source-only flat structure

Vault/
  Sources/
    Smith2026 - Transformers.md
    Wasserman2024 - All of Statistics.md
  Permanent Notes/
  Daily Notes/

Pros: no "which folder?" problem. Cons: relies entirely on tags and links for organization.

The flat-structure approach plays to Obsidian's strengths (graph view, backlinks) and is the one many experienced Obsidian users converge on. Use folders only for high-level separation (Sources vs Notes), not for topic categorization.

YAML front matter for metadata

Add a YAML block at the top of each converted note for searchable metadata:

---
title: Attention Is All You Need
authors: [Vaswani et al.]
year: 2017
type: paper
tags: [nlp, transformers, foundational]
source: 'arxiv.org/abs/1706.03762'
aliases: [Transformer paper, AIAYN]
---

Obsidian reads front matter as note properties: it shows them in the file properties panel and makes them available to search, graph filters, and plugins such as Dataview. Aliases let other notes link to this paper by any of the alternative names you list.

For batch-converted libraries, you can prepend a default YAML block programmatically: pull authors and year from the filename or PDF metadata, and set sensible defaults for tags. A short Python script is enough.
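One way to do that prepend is sketched below. It assumes the `Author2026 - Title.md` naming convention used in the folder examples above, reads nothing from PDF metadata, and falls back to a title-only block when the name does not match. The function names and the default `type: paper` are illustrative choices, not part of any tool.

```python
import json
import re
from pathlib import Path

# Assumed naming convention: "Smith2026 - Transformers"
NAME_RE = re.compile(r"^(?P<author>[^\d\s]+)(?P<year>\d{4}) - (?P<title>.+)$")


def build_front_matter(stem: str, tags: list[str]) -> str:
    """Build a YAML block from a file stem; fall back to title-only."""
    match = NAME_RE.match(stem)
    lines = ["---"]
    if match:
        # json.dumps yields double-quoted strings, which are valid YAML
        # and survive colons or brackets inside a title.
        lines.append(f"title: {json.dumps(match['title'])}")
        lines.append(f"authors: [{json.dumps(match['author'])}]")
        lines.append(f"year: {match['year']}")
    else:
        lines.append(f"title: {json.dumps(stem)}")
    lines.append("type: paper")
    lines.append(f"tags: {json.dumps(tags)}")
    lines.append("---")
    return "\n".join(lines) + "\n\n"


def add_front_matter(note: Path, tags: list[str]) -> bool:
    """Prepend front matter unless the note already starts with a block."""
    text = note.read_text(encoding="utf-8")
    if text.startswith("---\n"):
        return False
    note.write_text(build_front_matter(note.stem, tags) + text, encoding="utf-8")
    return True
```

Running `add_front_matter` over every file in `Sources/` is safe to repeat, because notes that already begin with a front matter block are left untouched.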

Wikilinks and backlinks

The killer Obsidian feature for converted PDFs is wikilinks. From a permanent note, link to a converted source: [[Smith2026 - Transformers]]. The link works by filename. The source's backlinks panel now shows your permanent note as a referrer; the graph view shows an edge between them.

Best practice: use aliases in front matter so you can link by short references. With aliases: [Transformer paper, AIAYN] in the source, all three of these work and resolve to the same note: [[Smith2026 - Transformers]], [[Transformer paper]], [[AIAYN]].
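To make the resolution rule concrete, the sketch below builds a lookup of every linkable name (file name or alias) and reports wikilinks that match neither. It is an approximation for auditing a vault, not Obsidian's actual resolver: it assumes case-insensitive matching, only understands the inline `aliases: [a, b]` form shown above, and ignores links that include a folder path.

```python
import re
from pathlib import Path

ALIASES_RE = re.compile(r"^aliases:\s*\[(.*?)\]\s*$", re.MULTILINE)
# Capture the target of [[Target]], [[Target|label]] or [[Target#heading]];
# the lookbehind skips ![[embeds]].
WIKILINK_RE = re.compile(r"(?<!!)\[\[([^\]|#]+)")


def link_targets(vault: Path) -> dict[str, Path]:
    """Map every linkable name (file stem or alias) to its note."""
    targets: dict[str, Path] = {}
    for note in vault.rglob("*.md"):
        targets[note.stem.lower()] = note  # file names take priority
        text = note.read_text(encoding="utf-8")
        if text.startswith("---\n"):
            head = text.split("\n---", 1)[0]
            found = ALIASES_RE.search(head)
            if found:
                for alias in found.group(1).split(","):
                    alias = alias.strip().strip("'\"")
                    if alias:
                        targets.setdefault(alias.lower(), note)
    return targets


def unresolved_links(vault: Path) -> dict[Path, list[str]]:
    """List wikilinks in each note that match no file name or alias."""
    targets = link_targets(vault)
    missing: dict[Path, list[str]] = {}
    for note in vault.rglob("*.md"):
        names = WIKILINK_RE.findall(note.read_text(encoding="utf-8"))
        broken = [n.strip() for n in names if n.strip().lower() not in targets]
        if broken:
            missing[note] = broken
    return missing
```

A check like this is mostly useful right after a batch import, when renamed files or mistyped aliases would otherwise show up as empty placeholder nodes in the graph.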

Tagging for cross-cutting themes

YAML front matter tags are great for paper-level themes. In-note #tags work for ideas that appear in only part of a note. A converted paper on graph neural networks might have tags: [gnn, deep-learning] at the top and #message-passing sprinkled in the section that discusses message passing.

The right level of tagging is a matter of taste: start broad (a handful of tags per paper) and add granularity as patterns emerge. Don't over-engineer the tag taxonomy upfront.
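To see which patterns are actually emerging, it can help to count tags across the vault now and then. The sketch below is one way to do that, under stated assumptions: it only reads front matter tags written in the inline `tags: [a, b]` form, and its inline `#tag` pattern is a simplification that may miscount text inside code blocks.

```python
import re
from collections import Counter
from pathlib import Path

# "#tag" not preceded by a word character, "#" or "/" (skips headings and URLs).
INLINE_TAG_RE = re.compile(r"(?<![\w#/])#([A-Za-z][\w/-]*)")
FM_TAGS_RE = re.compile(r"^tags:\s*\[(.*?)\]\s*$", re.MULTILINE)


def tag_counts(vault: Path) -> Counter:
    """Count front matter tags and inline #tags across all notes."""
    counts: Counter = Counter()
    for note in vault.rglob("*.md"):
        text = note.read_text(encoding="utf-8")
        body = text
        if text.startswith("---\n") and "\n---" in text[4:]:
            # Split the YAML block from the body so each is parsed once.
            head, body = text[4:].split("\n---", 1)
            found = FM_TAGS_RE.search(head)
            if found:
                counts.update(
                    t.strip() for t in found.group(1).split(",") if t.strip()
                )
        counts.update(INLINE_TAG_RE.findall(body))
    return counts
```

Calling `tag_counts(Path("Vault")).most_common(20)` gives a quick view of which tags carry weight and which were used once and forgotten.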

Zettelkasten workflow

The classic Zettelkasten pattern with converted papers:

Source notes (your converted papers) live in Sources/ — read-only, cite-able, structured
Literature notes in Lit/ — your written summary of each source in your own words, linked to the source
Permanent notes in Permanent/ — atomic ideas, each linking to the literature notes that influenced it

The graph view, after you've built up a few months of permanent notes, looks like a real concept map of your field — exactly what Luhmann's analog Zettelkasten was designed to produce. The converted PDFs are the substrate; your permanent notes are the contribution.
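The step from source note to literature note is easy to scaffold. The following sketch creates an empty literature note that links back to its source; the `Lit - ` filename prefix, the `type: literature-note` property, and the section headings are illustrative assumptions, not a fixed convention.

```python
from pathlib import Path


def literature_note_stub(source: Path) -> str:
    """Return the text of an empty literature note linked to its source."""
    return (
        "---\n"
        "type: literature-note\n"
        "---\n\n"
        f"Source: [[{source.stem}]]\n\n"
        "## Summary in my own words\n\n"
        "## Key claims\n\n"
        "## Ideas worth a permanent note\n"
    )


def create_stub(source: Path, lit_dir: Path) -> Path:
    """Create the stub in lit_dir unless a literature note already exists."""
    lit_dir.mkdir(parents=True, exist_ok=True)
    stub = lit_dir / f"Lit - {source.stem}.md"
    if not stub.exists():  # never overwrite notes you have already written
        stub.write_text(literature_note_stub(source), encoding="utf-8")
    return stub
```

Because the stub links to the source by filename, the source's backlinks panel shows the literature note as soon as it exists, even before you have written a word of summary.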

Useful Obsidian plugins for converted PDF workflows

Citations: BibTeX integration for academic libraries, autocomplete cite keys
Dataview: query YAML front matter (e.g., "all papers from 2024 tagged 'transformers'")
Smart Connections: semantic search across the vault, useful when you don't remember exact wording
Note Refactor: split a long converted paper into linked sub-notes by section
Templater: standardize your literature-note template so every paper gets the same scaffolding

Combined with the converted Markdown, these turn Obsidian into a credible academic-knowledge platform — without the cost or rigidity of dedicated tools like Roam, Tinderbox, or Citavi.

Frequently asked questions

Should I delete the original PDFs after converting?

No — keep them. Markdown is for searching and linking; the original PDF is the canonical source you can return to for exact quotes, figures, or layout. Store PDFs in a separate folder (or even outside the vault) and reference them from front matter.

How do I handle figures from PDFs?

Our converter extracts images from PDFs alongside the Markdown. Save them in a vault subfolder like Sources/images/ and reference them with relative Markdown image links. Obsidian renders them inline.

Can I convert and import directly to Obsidian without a manual step?

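Yes, with a local folder watcher. MDisBetter doesn't yet ship a CLI, but the watchdog + Marker (or Docling) setup described under "Continuous ingestion" can point at your vault's Inbox folder: drop a PDF, get a Markdown note. The note appears in Obsidian as soon as it is written; the original PDF stays in Inbox or moves to your archive folder.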