· 7 min read · MDisBetter

Claude Can't Read My PDF — 3 Fixes That Actually Work

Claude is unusually good at long documents — except when it isn't. You upload a PDF, ask a specific question, and Claude either misses entire sections, cites the wrong page, or politely declines to answer something that's clearly in the document. The fix is rarely "better prompting". It's almost always about how the PDF reaches Claude.

Why Claude struggles with certain PDFs

Claude has a 200,000-token context window: generous, but not infinite, and not equally usable across that range. Anthropic's research has shown that recall on long contexts degrades faster when the input contains layout noise, repeated boilerplate, or heavily broken paragraph structure. PDFs deliver all three by default.

What makes Claude particularly sensitive: its training corpus is heavy on Markdown-formatted technical documentation. Claude treats ## as a real semantic anchor. When your PDF gets extracted as a wall of text without those anchors, Claude has to infer structure from indentation and font cues that the extraction pipeline often gets wrong. The result is Claude reasoning over a noisy, incorrectly structured version of your document.

Fix 1 — Convert to Markdown (works 95% of the time)

The single highest-leverage fix. Convert the PDF to Markdown before uploading to Claude.

  1. Drop your PDF into the PDF to Markdown for Claude converter
  2. Download the resulting .md file
  3. Start a new Claude.ai conversation (or open Claude Projects)
  4. Attach the .md file or paste it inline
  5. Re-ask your original question

What changes: layout noise gone, headings explicit, tables in GFM format Claude reads natively, equations in LaTeX (if your PDF had any), citations properly attached to their sentences. Same content, dramatically better signal.

Token-wise, you'll typically see a 60–80% reduction. On a 100-page report that's the difference between filling Claude's context window and using a fraction of it. Claude can then spend that capacity on reasoning instead of parsing.
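To check the saving on your own document, a back-of-the-envelope estimate is usually enough. The sketch below assumes roughly 4 characters per token, which is a common rule of thumb for English text and not Claude's actual tokenizer; the function names and the 20,000-token reserve for the conversation are illustrative choices, not anything Claude requires.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate. The 4-chars-per-token ratio is an assumption."""
    return int(len(text) / chars_per_token)


def reduction_percent(raw_text: str, markdown_text: str) -> float:
    """Percentage of estimated tokens saved by the Markdown version."""
    before = estimate_tokens(raw_text)
    after = estimate_tokens(markdown_text)
    if before == 0:
        return 0.0
    return round(100.0 * (before - after) / before, 1)


def fits_in_context(text: str, context_tokens: int = 200_000,
                    reserve_tokens: int = 20_000) -> bool:
    """True if the text leaves `reserve_tokens` free for questions and answers."""
    return estimate_tokens(text) <= context_tokens - reserve_tokens
```

Feed `reduction_percent` the raw text extracted from the PDF and the converted Markdown. If `fits_in_context` still returns `False` for the Markdown, Fix 3 is the one you need.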

Fix 2 — OCR first for scanned documents

If your PDF is a scan (image-only, no text layer), Claude's internal extraction is essentially OCR — and it's not specialized for that task. Symptoms: Claude hallucinates content that isn't in the document, or claims sections are missing when they're clearly visible.

Our converter runs proper OCR automatically when it detects no text layer. The output is regular Markdown, indistinguishable to Claude from a digital-PDF source. For very low-quality scans (faxed pages, phone photos), expect 90–98% character accuracy; for clean 300+ DPI scans, near-perfect.

If you have an exceptionally important scanned document (legal contract, medical record), spot-check the converted Markdown against the source before relying on Claude's answers — OCR is excellent now but not infallible. We cover the full workflow in the scanned PDF to Markdown guide.
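One way to make that spot-check less subjective: retype a paragraph or two from the scan by hand and compare it with the same passage in the converted Markdown. The sketch below uses Python's standard `difflib`; the score is a similarity ratio, which is only a proxy for true character accuracy, and the `needs_review` helper with its 0.98 threshold is a hypothetical cut-off you should tune yourself.

```python
import difflib


def similarity(reference: str, ocr_output: str) -> float:
    """Similarity ratio in [0, 1] between a hand-typed excerpt and OCR text."""
    # Collapse whitespace so line-wrapping differences do not count as errors.
    ref = " ".join(reference.split())
    out = " ".join(ocr_output.split())
    return difflib.SequenceMatcher(None, ref, out, autojunk=False).ratio()


def needs_review(reference: str, ocr_output: str, threshold: float = 0.98) -> bool:
    """Flag the passage when similarity drops below the (assumed) threshold."""
    return similarity(reference, ocr_output) < threshold
```

A flagged passage does not prove the OCR is wrong, only that the two versions differ enough to be worth reading side by side.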

Fix 3 — Split large PDFs into sections

If your converted Markdown still exceeds Claude's context window (typical for 500+ page documents), splitting beats truncation. Two strategies work well:

Strategy A: Split by section, ask per-section

Split the Markdown by ## headings. Each section gets its own conversation; you ask the same question across sections and aggregate the answers. Works well for questions like "summarize each chapter" or "find every mention of X".
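A minimal sketch of that split, assuming the converted file uses ATX-style `## ` headings for its top-level sections. The `split_by_h2` name and the "Preamble" label for any text before the first heading are illustrative. Lines inside fenced code blocks are skipped so that a `## ` appearing in a code sample does not start a new section.

```python
def split_by_h2(markdown: str) -> list[tuple[str, str]]:
    """Return (title, body) pairs, one per '## ' heading in the document."""
    sections: list[tuple[str, str]] = []
    title, body = None, []
    in_fence = False

    def flush() -> None:
        text = "\n".join(body).strip()
        # Keep every headed section; keep the preamble only if it has content.
        if title is not None or text:
            sections.append((title or "Preamble", text))

    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        if not in_fence and line.startswith("## "):
            flush()
            title, body = line[3:].strip(), []
        else:
            body.append(line)
    flush()
    return sections
```

Each `(title, body)` pair can then be pasted into its own conversation along with the same question. Deeper headings such as `###` stay inside their parent section.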

Strategy B: Section + running summary

Process sections sequentially in one conversation. After each section, ask Claude to produce a short summary; carry only the summary forward when you load the next section. The summaries collectively act as compressed long-term memory.
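The loop behind Strategy B can be sketched as follows. `ask` is a hypothetical stand-in for however you reach Claude (pasting by hand, or a function wrapping an API client); nothing here is an official interface, and the prompt wording and 200-word limit are assumptions to adapt. Unlike the single-conversation version described above, this sketch sends each section as a fresh prompt that carries only the running summary.

```python
from typing import Callable, Iterable


def answer_with_running_summary(
    sections: Iterable[tuple[str, str]],
    question: str,
    ask: Callable[[str], str],
) -> str:
    """Walk the sections in order, carrying forward only a short summary."""
    summary = ""
    for title, body in sections:
        prompt = (
            f"Summary of earlier sections:\n{summary or '(none yet)'}\n\n"
            f"Next section: {title}\n{body}\n\n"
            "Update the summary in under 200 words, keeping anything "
            f"relevant to this question: {question}"
        )
        # The new summary replaces the old one; earlier section text is dropped.
        summary = ask(prompt)
    return ask(f"Using only this summary, answer: {question}\n\n{summary}")
```

Because each step sees only the summary, details that the summary drops are gone for good. Naming the question in every prompt is one way to bias the summary toward what you care about.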

Claude Projects — the best workflow for repeated questions

If you'll consult the same document repeatedly, Claude Projects is built exactly for this. Convert your PDFs to Markdown once, drop them into the Project's knowledge base, and every conversation in that Project starts with clean structured context — without re-paying the parsing tax.

Setup: open Claude.ai, create or open a Project, click "Add to knowledge", upload the .md file (or several). The Project owner pays tokens for the knowledge content once per conversation; subsequent turns within that conversation are free against the knowledge base.

This is particularly useful for: a research literature corpus you'll query for months, a product specification you'll reference per-feature, a contract portfolio you review across many small questions.

What if it still doesn't work?

If you've converted to Markdown, OCR'd if needed, fit the document in context, and Claude is still missing the answer — the issue is usually one of two things. First, the answer might genuinely not be in the document; pasting the source into Claude with "if the answer isn't here, say so" surfaces this honestly. Second, your prompt might be ambiguous — Claude is unusually literal, and a slightly more specific question often unblocks an answer that was just being filtered out by a too-broad query.
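Both checks can be folded into one prompt template. This is a minimal sketch: the `<document>` wrapper tags and the `build_grounded_prompt` name are illustrative choices, not a required format.

```python
def build_grounded_prompt(document_markdown: str, question: str) -> str:
    """Wrap the document and a specific question with an explicit 'say so' instruction."""
    return (
        "<document>\n"
        f"{document_markdown.strip()}\n"
        "</document>\n\n"
        f"Question: {question.strip()}\n"
        "Answer using only the document above. "
        "If the answer isn't here, say so."
    )
```

Pair it with a narrow question ("What is the termination notice period in section 9?") and not a broad one ("What does the contract say about termination?").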

For a more general view of why model outputs degrade on PDF input, see why ChatGPT gives bad answers on PDFs — the underlying mechanics apply to Claude almost identically.

Frequently asked questions

Why does Claude handle Markdown better than PDF?
Anthropic's training corpus is heavy on Markdown documentation. Claude has internalized that ## means "new section", a fenced code block means "do not paraphrase", and pipe-tables are tables. PDF content forces Claude to infer all of that, which costs accuracy.

Can I drop several Markdown files into one Claude conversation?
Yes. Claude.ai accepts up to ~30 attachments per conversation, and Claude Projects has higher limits via knowledge base ingestion. Cross-document reasoning works well across 5–10 attachments in a single prompt.

What's the size cap for Claude attachments?
Claude.ai accepts files up to 30 MB and roughly 500 pages of PDF. With Markdown conversion you typically fit 2–4× more document in the same context window because the format is so much more compact.