
How to Feed a 200-Page PDF to ChatGPT (Step-by-Step)

You have a 200-page legal brief, technical manual, or research compendium and you want ChatGPT to answer questions about it. Drop the PDF into the chat and one of two things happens: ChatGPT silently truncates after the first 50 pages, or it accepts the whole thing and answers from a fragment of the actual content. Neither is what you want. Here's the workflow that actually works.

Why 200 pages doesn't fit (the context window math)

ChatGPT's context window depends on the model: 128k tokens for GPT-4o, 200k+ for GPT-5 and newer reasoning models, ~32k for older or mobile-optimized variants. A token is roughly 0.75 words of English prose, but layout-noisy PDF extraction is far less token-efficient — ChatGPT sees about 2,000-2,500 tokens per page of dense PDF content (vs ~500-700 tokens per page of clean Markdown).

The math: 200 pages × 2,200 tokens/page (PDF) = 440,000 tokens. Way over even GPT-5's window. Same 200 pages × 600 tokens/page (Markdown) = 120,000 tokens. Still tight on GPT-4o, but feasible. That's why step one of the workflow is always conversion.
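
Don't take the per-page rule of thumb on faith; measure your own document. Here's a quick sketch with tiktoken, OpenAI's open-source tokenizer (this assumes a recent tiktoken release that ships the o200k_base encoding GPT-4o uses; the file name is a placeholder):

import tiktoken

# o200k_base is the encoding used by GPT-4o and newer OpenAI models
enc = tiktoken.get_encoding('o200k_base')

with open('document.md') as f:   # your converted Markdown (or raw extracted text)
    text = f.read()

tokens = len(enc.encode(text))
pages = 200                      # page count of the original PDF
print(f'{tokens:,} tokens total, ~{tokens // pages} tokens/page')

Run it on the raw extraction and again on the converted Markdown and the per-page gap shows up immediately.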

Step 1 — Convert the PDF to Markdown

Drop your 200-page PDF into our PDF to Markdown converter. The conversion itself takes 30-90 seconds for a document that long; output is a single .md file (or a ZIP of .md files if you ask for per-section splitting at conversion time).

Why this matters even when the document still won't fit: the resulting Markdown has explicit ## headings that map to the document's actual sections. That means step 2 (splitting) becomes trivial — you split on heading boundaries that respect the document's structure, not arbitrary character counts.

Step 2 — Split into logical sections

The right split is by ## heading. Most 200-page documents have 10-30 H2 sections (chapters, parts, major topics). Each section becomes its own chunk; each chunk fits in any modern model's context window with room to spare.

If your document has only H1 sections that are themselves too large, split further by H3, or by paragraph count if the structure is shallow. Keep chunks under 30,000 tokens so you have headroom for your prompts and ChatGPT's response.

You can do this manually (open the Markdown, select-and-copy each section), or programmatically with a few lines of Python:

import re

with open('document.md') as f:
    md = f.read()

# Split on every newline that is immediately followed by an H2 heading
sections = re.split(r'\n(?=## )', md)

# Write each section to its own numbered file: section_00.md, section_01.md, ...
for i, section in enumerate(sections):
    with open(f'section_{i:02d}.md', 'w') as f:
        f.write(section)

For automated splitting with token-aware boundaries, see our text chunker tool.
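
If you'd rather stay in Python, a quick token count over the files the split script just wrote tells you whether any section blows past the 30,000-token budget from step 2. A sketch, again assuming tiktoken and the section_XX.md naming used above:

import glob
import tiktoken

enc = tiktoken.get_encoding('o200k_base')

# Flag any chunk that would crowd out the prompt and response
for path in sorted(glob.glob('section_*.md')):
    with open(path) as f:
        n_tokens = len(enc.encode(f.read()))
    status = 'OK' if n_tokens < 30_000 else 'split further (by ### or paragraphs)'
    print(f'{path}: {n_tokens:,} tokens, {status}')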

Step 3 — Feed sequentially with summaries

For one-shot questions across the whole document ("summarize the main argument", "list every mention of X"), use this pattern:

  1. Start a new ChatGPT conversation
  2. Send section 1 with: "This is section 1 of N from a longer document. Read it and give me a 3-sentence summary."
  3. ChatGPT responds with the summary
  4. Send section 2 with: "This is section 2 of N. Read it. Here are the previous summaries: [paste summary 1]. Give me a 3-sentence summary of section 2."
  5. Continue for all sections
  6. Final prompt: "Based on these N summaries, [your real question]"

This works because each section sits in a single context window where ChatGPT can reason over it fully, and the running summaries act as compressed long-term memory.
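
You can run that loop by hand in the ChatGPT UI, or script it against the OpenAI API. A sketch, assuming the section files from step 2 and the official openai Python package; the model name and prompt wording are illustrative, not prescriptive:

import glob
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
paths = sorted(glob.glob('section_*.md'))
summaries = []

# Pass 1: summarize each section, carrying earlier summaries as compressed memory
for i, path in enumerate(paths, start=1):
    with open(path) as f:
        section = f.read()
    context = '\n'.join(summaries)
    prompt = (
        f'This is section {i} of {len(paths)} from a longer document.\n'
        f'Previous section summaries:\n{context}\n\n'
        f'Read the section below and give me a 3-sentence summary.\n\n{section}'
    )
    resp = client.chat.completions.create(
        model='gpt-4o',
        messages=[{'role': 'user', 'content': prompt}],
    )
    summaries.append(f'Section {i}: {resp.choices[0].message.content}')

# Pass 2: ask the real question over the stack of summaries
final = client.chat.completions.create(
    model='gpt-4o',
    messages=[{'role': 'user', 'content':
        'Based on these summaries, summarize the main argument:\n' + '\n'.join(summaries)}],
)
print(final.choices[0].message.content)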

Alternative — use RAG for massive documents

If you'll query the document repeatedly, the sequential-summary workflow is wasteful. Build a RAG (retrieval-augmented generation) pipeline once and query it many times.

Minimal version with LangChain:

from langchain_text_splitters import MarkdownHeaderTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

with open('document.md') as f:
    md = f.read()

# Split on H1/H2/H3 boundaries; each chunk keeps its headings as metadata
splitter = MarkdownHeaderTextSplitter(headers_to_split_on=[
    ('#', 'h1'), ('##', 'h2'), ('###', 'h3'),
])
chunks = splitter.split_text(md)

# Embed every chunk and index it in a local Chroma vector store
store = Chroma.from_documents(chunks, OpenAIEmbeddings())

# Then for any question:
question = 'What does the document say about termination clauses?'  # example query
relevant = store.similarity_search(question, k=5)
# Feed only those 5 chunks to ChatGPT — fits any context window
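
The last mile, handing those retrieved chunks to ChatGPT, is a few more lines in the same script (a sketch; the model name and prompt wording are yours to choose):

from openai import OpenAI

client = OpenAI()

# Concatenate just the retrieved chunks into the prompt context
context = '\n\n'.join(doc.page_content for doc in relevant)
answer = client.chat.completions.create(
    model='gpt-4o',
    messages=[{'role': 'user', 'content':
        f'Answer using only this context:\n\n{context}\n\nQuestion: {question}'}],
)
print(answer.choices[0].message.content)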

The full setup pattern is documented in our PDF to Markdown for RAG guide. RAG turns a 200-page document into a queryable knowledge base; ChatGPT only sees the relevant 5-10 chunks per question, costs drop to cents per query, and answers stay sharp.

Which approach to pick

Sequential-summary if you have 1-3 questions to ask. Section-by-section direct query if your questions naturally map to specific sections. RAG if you'll ask many different questions over time, or if you want to share the document with a team that will query it independently.

For all three approaches, step 1 is identical and non-negotiable: convert the PDF to Markdown first. Skipping that step roughly quadruples your token cost (about 2,200 vs 600 tokens per page) and degrades answer quality on every subsequent step.

Frequently asked questions

Why not just truncate the PDF and feed the first part?
ChatGPT's answer will be confidently wrong about anything that lives in the truncated portion. Splitting by heading and processing every section produces complete coverage; truncating produces silent blind spots.

Does this approach work with Claude or Gemini?
Yes — Claude has the easiest version (200k native context fits most converted documents in one shot) and Gemini 2.5 Pro fits even more (1M tokens). The same conversion + chunking pattern works on all three.

Can I automate this entire workflow?
The conversion step needs an OSS library since we don't offer a programmatic API today — Marker, Docling, or PyMuPDF (via its pymupdf4llm helper) all do PDF-to-Markdown in Python. LangChain or LlamaIndex can handle splitting and querying. End-to-end automation runs in about 50 lines of Python.
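
For the conversion piece specifically, a minimal sketch with pymupdf4llm, the PyMuPDF companion package for Markdown output (swap in Marker or Docling the same way; file names are placeholders):

import pymupdf4llm

# Convert the whole PDF to a single Markdown string, headings included
md_text = pymupdf4llm.to_markdown('document.pdf')

with open('document.md', 'w') as f:
    f.write(md_text)

From there, the splitting and querying code above runs unchanged.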