9 min read · MDisBetter

How to Feed a Website to ChatGPT (Complete Guide)

Feeding a website to ChatGPT sounds like it should be a one-step operation — paste a URL, get an answer. In practice it splits into four very different problems depending on whether you want one page, a section, an entire docs site, or something in between. The browse feature handles maybe 30% of cases well. The other 70% need a different approach. Here's the full playbook.

ChatGPT browse vs manual

ChatGPT can fetch URLs on its own. When it works, this is the easiest path. When it doesn't, the failures are silent — you get an answer that looks confident but is wrong. Knowing which approach to use saves time.

Browse works well when: the page is plain HTML, publicly accessible, not behind Cloudflare, not paywalled, not heavily JavaScript-rendered, and the question is simple enough that ChatGPT only needs the page summary.

Browse fails when: the page sits behind Cloudflare or another bot check, the content is paywalled or login-gated, the page is rendered client-side by JavaScript, the article is long enough that the fetch gets silently truncated, or the question needs exact quotes and specifics rather than a summary.

For these cases — which add up to most real-world usage — the manual workflow wins.

Single page: the canonical workflow

For one URL you want ChatGPT to deeply understand:

  1. Open /convert/url-to-markdown.
  2. Paste the URL, hit convert.
  3. Copy the Markdown (short pages) or download the .md file (long pages).
  4. Start a fresh ChatGPT conversation. Attach the file or paste the Markdown.
  5. Ask your question.

The conversion strips navigation, ads, footers, share buttons, modals, and other layout noise — leaving the article body with structure (headings, lists, tables, quotes, links) intact. ChatGPT now has a clean, structured source it can reason over precisely.
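To make the stripping step concrete, here is a deliberately tiny sketch of the idea using Python's stdlib html.parser. It is not the actual converter — real extractors are far more robust — and the element names and sample HTML are illustrative assumptions; it only shows how dropping layout elements while keeping headings, paragraphs, and list items yields clean Markdown:

```python
from html.parser import HTMLParser

NOISE = {"nav", "footer", "aside", "script", "style", "form", "button"}
HEADINGS = {"h1": "#", "h2": "##", "h3": "###"}

class ArticleToMarkdown(HTMLParser):
    """Very rough HTML -> Markdown reducer: drops layout noise,
    keeps headings, paragraphs, and list items."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth inside a noise element
        self.lines = []     # finished Markdown lines
        self.prefix = None  # Markdown prefix for the block being collected
        self.buf = []       # text fragments of the current block

    def handle_starttag(self, tag, attrs):
        if tag in NOISE:
            self.depth += 1
        elif self.depth == 0 and tag in HEADINGS:
            self.prefix = HEADINGS[tag]
        elif self.depth == 0 and tag in ("p", "li"):
            self.prefix = "-" if tag == "li" else ""

    def handle_endtag(self, tag):
        if tag in NOISE:
            self.depth = max(0, self.depth - 1)
        elif self.prefix is not None and (tag in HEADINGS or tag in ("p", "li")):
            text = " ".join("".join(self.buf).split())  # collapse whitespace
            if text:
                self.lines.append(f"{self.prefix} {text}".strip())
            self.prefix, self.buf = None, []

    def handle_data(self, data):
        if self.depth == 0 and self.prefix is not None:
            self.buf.append(data)

def html_to_markdown(html: str) -> str:
    parser = ArticleToMarkdown()
    parser.feed(html)
    return "\n\n".join(parser.lines)

html = ("<nav><a href='/'>Home</a></nav>"
        "<article><h1>Title</h1><p>Body  text.</p></article>"
        "<footer>(c) 2024</footer>")
print(html_to_markdown(html))  # nav and footer vanish; heading and body survive
```

The nav link and the footer never reach the output, which is exactly why the converted Markdown is so much easier for the model to reason over than the raw page source.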

For why this beats both copy-paste and browse, see why copy-pasting from websites ruins your AI answers.

Multi-page: a section or an entire site

For an entire documentation site, a multi-part article series, or a knowledge base, you have two strategies depending on size.

Under 50 pages

Convert each URL individually, then concatenate the Markdown files into a single document. The combined file is one upload, one prompt, one answer. ChatGPT can hold a few hundred thousand tokens of Markdown comfortably — that's typically 100-200 pages of documentation.

To concatenate quickly:

cat page1.md page2.md page3.md > combined.md

Or on Windows PowerShell:

Get-Content page1.md, page2.md, page3.md | Set-Content combined.md

Add a top-level header at the start of each section if you want ChatGPT to keep track of which page each piece came from.
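If you'd rather script the concatenation and the per-page headers in one pass, a minimal Python sketch looks like this (the file names are throwaway demo values — point it at your converted pages instead):

```python
import tempfile
from pathlib import Path

def combine(md_files, out_path):
    """Concatenate Markdown files, prefixing each with a top-level
    heading that names its source page."""
    parts = []
    for f in md_files:
        f = Path(f)
        # Derive the heading from the file name; swap in the page
        # title or URL if you track those separately.
        parts.append(f"# Source: {f.stem}\n\n{f.read_text(encoding='utf-8').strip()}\n")
    Path(out_path).write_text("\n".join(parts), encoding="utf-8")

# Demo with throwaway files in a temp directory.
tmp = Path(tempfile.mkdtemp())
for i in (1, 2, 3):
    (tmp / f"page{i}.md").write_text(f"Content of page {i}.", encoding="utf-8")
combine([tmp / f"page{i}.md" for i in (1, 2, 3)], tmp / "combined.md")
print((tmp / "combined.md").read_text(encoding="utf-8"))
```

Each section now opens with a `# Source:` heading, so when ChatGPT answers it can tell you which page a claim came from.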

Over 50 pages

At this scale, a single combined upload starts to bump against context limits. Two options: split the site into topical bundles and upload only the bundle each conversation needs, or move to a retrieval setup that indexes every page and sends ChatGPT only the relevant ones per question (see "Chunking large sites" below).

Convert to Markdown method (why this is the right format)

The reason Markdown is the universal answer here is that LLMs read it natively. They were trained on enormous amounts of Markdown — README files, GitHub issues, Stack Overflow posts, technical blogs. Heading levels carry meaning. List indentation conveys hierarchy. Code blocks are unambiguous. Compared with HTML (verbose, full of layout junk), plain text (no structure), or PDF (a printing format the model has to reverse-engineer), Markdown is the easiest format for the model to reason over and the cheapest in tokens.

The same principle applies if your source happens to be a PDF instead of a web page — convert to Markdown first. See our PDF to Markdown converter and the deep-dive on best format for LLM input.

Chunking large sites

For a doc site over a few hundred pages — Stripe docs, AWS docs, a major framework — feeding the whole thing to ChatGPT is impractical even with chunking. The smart pattern:

  1. Convert the whole site to Markdown (one file per page).
  2. Index the chunks with embeddings (any vector store: pgvector, Pinecone, Weaviate, or even a local FAISS).
  3. For each user question, retrieve the top 5-10 relevant chunks.
  4. Send only those chunks to ChatGPT, along with the question.

This is the RAG pattern. It scales to arbitrarily large corpora because you never send more than a handful of pages per query. We cover end-to-end implementation in RAG pipeline guide — the principles transfer cleanly from PDF sources to URL sources.
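The retrieve-then-prompt shape of those four steps can be sketched in a few lines. To stay self-contained, this toy version substitutes a bag-of-words cosine similarity for real embeddings — in practice you'd use an embedding model and one of the vector stores named above — and the sample chunks and question are invented for the demo:

```python
import math
from collections import Counter

def vectorize(text):
    """Toy stand-in for an embedding: a bag-of-words term count."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, k=2):
    """Rank Markdown chunks by similarity to the question; return the top k."""
    q = vectorize(question)
    return sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)[:k]

chunks = [
    "## Billing\nInvoices are issued monthly and support proration.",
    "## Webhooks\nConfigure an endpoint to receive event notifications.",
    "## Rate limits\nRequests are capped at 100 per minute per key.",
]
question = "How do webhooks and event notifications work?"

# Send only the retrieved chunks, never the whole corpus.
context = "\n\n".join(retrieve(question, chunks))
prompt = f"Answer using only this documentation:\n\n{context}\n\nQuestion: {question}"
print(prompt)
```

The point is the shape, not the scoring: however large the site gets, the prompt you actually send stays a handful of chunks plus the question.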

Common gotchas

Login-gated content. Public converters can't authenticate. You need to be logged in yourself, then either copy the rendered article into a Markdown editor or use a browser extension that exports the current page.

Geo-restricted content. Some converters route through specific regions. If the page works in your browser but not in the converter, the fetch IP may be blocked.

Single-page apps. Some SPAs render content only after user interaction (clicking tabs, expanding sections). A static fetch sees only the initial state. For these, navigate to the specific state you want, then export from the browser.

Stale content. Your converted Markdown is a snapshot. If the source page changes, your file is outdated. For frequently updated content, re-convert before each major use.

Doing it for Claude and Gemini

Same workflow, different upload box. Markdown is universal across LLMs. Claude in particular handles very long Markdown contexts well — useful for the multi-page case where ChatGPT might struggle. See how to feed documentation to Claude for Claude-specific tips.

The 30-second test

Next time you want to ask ChatGPT about a web page: try browse, try copy-paste, and try the URL-to-Markdown workflow. Same question, three sources. The Markdown answer is dramatically more accurate, more specific, and more grounded in the actual page content. After running this test once or twice you stop reaching for the other two.

Prompts that work better with a Markdown source

The convert-first workflow doesn't just improve answer quality on whatever question you ask. It unlocks question types that fail without it. Some examples: "Quote the exact sentence where the page defines X", "List every option in this page's configuration table with its default", "Summarize each section under its heading in one line". Each of these depends on the model seeing the complete, structured page text rather than a browse summary or a lossy paste.

When you need depth, not breadth

One more pattern worth knowing: feeding a single article to ChatGPT and asking five increasingly deep questions about it (rather than five different articles each with one question) produces better insight per minute. The convert-first approach makes this practical because the source is in context once, then reused across the whole conversation. Iterative deep-dives benefit disproportionately from clean source material.

What about Custom GPTs and Projects?

If you find yourself asking about the same site repeatedly — a vendor's documentation, a competitor's blog, a specific research source — promoting it from per-conversation upload to a persistent knowledge base saves enormous friction. ChatGPT's Custom GPTs and Claude's Projects both support knowledge bases. Convert your URLs once, upload the Markdown to the knowledge base, then chat without re-uploading anything. We cover the developer-facing version of this in how to feed documentation to Claude.

The honest summary

Browse mode is convenient when it works. Copy-paste is fast but lossy. Convert-first is the workflow that delivers consistently good answers on any web content, at any scale, on any LLM. The cost is thirty seconds per page once you have the habit. The payoff is that every grounded conversation gets materially better, and the failure modes that used to derail you (silent truncation, hallucinated quotes, vague summaries) just stop happening.

A note on context window economics

It's worth understanding why feeding clean Markdown extends your effective context window. When raw HTML enters a prompt, every <div>, <span>, inline style, and embedded script consumes tokens. A typical mid-size article page in raw HTML is 8,000 to 15,000 tokens. The same article as Markdown is 1,500 to 3,500 tokens. Across a long conversation where you're feeding multiple articles, the difference compounds: you can either fit five articles of Markdown or one article of HTML in the same context budget. For comparative analysis tasks especially — synthesizing across many sources — Markdown is the difference between practical and impossible.
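You can sanity-check the compounding yourself with a rough heuristic — roughly four characters per token for English text (use a real tokenizer such as tiktoken for exact counts). The snippet below is a back-of-the-envelope sketch; the HTML fragment is an invented example:

```python
def approx_tokens(text: str) -> int:
    """Rough rule of thumb: ~4 characters per token for English text.
    Use a real tokenizer (e.g. tiktoken) for exact counts."""
    return max(1, len(text) // 4)

# The same sentence, once wrapped in typical layout markup, once as Markdown.
html = '<div class="post-body"><p style="margin:0">Markdown is compact.</p></div>'
markdown = "Markdown is compact."

print(approx_tokens(html), approx_tokens(markdown))
```

Even on a one-sentence fragment the markup multiplies the token count several times over; on a full page, with scripts, inline styles, and nested wrappers, the ratio is what pushes 8,000-token HTML down to 2,000-token Markdown.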

Frequently asked questions

Can ChatGPT remember a website I uploaded across conversations?
On ChatGPT Plus with Custom GPTs or Projects, yes — upload the converted Markdown to the project's knowledge base and it persists. In a normal chat, the file is scoped to that conversation only.
What's the maximum site size I can feed?
Practical ceiling for a single upload is in the hundreds of pages of Markdown. Beyond that, switch to a RAG pattern (chunk, index, retrieve per question) instead of trying to fit everything in one prompt.
Does converting cost ChatGPT credits or use my plan quota?
Conversion happens before ChatGPT sees anything — it's a separate step using a converter tool. Your ChatGPT plan only counts the tokens of the final upload. Because Markdown is more compact than raw HTML, your effective context goes further on the same plan.