URL to Markdown for Developers — Save Docs Locally

The docs you're building against live on someone else's server. They get reorganised, deprecated, paywalled, or quietly edited at 2 AM and your code suddenly references behaviour that no longer exists. Convert each page to Markdown, commit it next to your source, and the docs become a stable, greppable, AI-indexable artifact you actually own.

Why this is hard without the right tool

  • Documentation changes without notice — a method signature shifts and yesterday's code now looks wrong
  • Need offline access to docs on a plane, in a SCIF, or over flaky hotel Wi-Fi
  • Cursor / Copilot / Claude Code can't pull live web docs into your workspace as context
  • Vendor docs sites are SPA-heavy and break when you "Save Page As HTML"
  • JavaScript-rendered docs return blank when you curl them for a quick grep

Recommended workflow

  1. Open /convert/url-to-markdown and paste the documentation URL
  2. We render the page in a headless browser, strip nav and side rails, and emit Markdown
  3. Click Convert, download the .md file, and drop it into docs/vendor/ in your repo
  4. Reference the local copy from code comments via relative paths
  5. Re-run the conversion on each vendor release; git diff shows exactly what changed. For automation across many pages, use an OSS pipeline (Trafilatura + a sitemap parser) — we don't expose a programmatic API yet

Code examples

Pin a vendor docs page next to your code

```markdown
<!-- docs/vendor/stripe-webhooks.md -->
---
source_url: https://stripe.com/docs/webhooks
fetched: 2026-05-10
---

# Stripe Webhooks

Webhooks are how Stripe notifies your application of events...
```
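
Reference the pinned copy from application code

A sketch of workflow step 4, assuming a Flask webhook handler; the route and handler logic are illustrative, and the only part that matters is the comment pointing at the pinned local doc by relative path.

```python
# app/webhooks.py
# Behaviour documented in ../docs/vendor/stripe-webhooks.md
# (pinned copy of https://stripe.com/docs/webhooks, fetched 2026-05-10)
from flask import Flask, abort, request

app = Flask(__name__)

@app.post("/stripe/webhook")
def stripe_webhook():
    # Verification details live in the pinned local doc, not the live site,
    # so a vendor reorganisation can't silently orphan this reference.
    if "Stripe-Signature" not in request.headers:
        abort(400)
    event = request.get_json(silent=True) or {}
    return {"received": True, "type": event.get("type")}
```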

Frequently asked questions

How does this handle JavaScript-rendered docs sites?
The web tool runs pages through a headless browser before extraction, so SPAs (Docusaurus, Nextra, GitBook, MkDocs Material with JS theme) render before we read them. A naive `curl | pandoc` pipeline would get a near-empty shell; the converter waits for the actual content to mount.
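If you want the same render-then-convert behaviour in your own scripts, here is a minimal sketch using Playwright and html2text (both installed separately; the wait condition and options are reasonable defaults, not settings the converter exposes):

```python
# render_one_page.py - render a JS-heavy docs page before converting it
# pip install playwright html2text && playwright install chromium
import sys

import html2text
from playwright.sync_api import sync_playwright

def page_to_markdown(url: str) -> str:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")  # let the SPA mount its content
        html = page.content()
        browser.close()
    converter = html2text.HTML2Text()
    converter.ignore_images = True
    converter.body_width = 0  # don't hard-wrap long lines
    return converter.handle(html)

if __name__ == "__main__":
    print(page_to_markdown(sys.argv[1]))
```
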
Can I crawl an entire docs subdomain at once?
Not via MDisBetter directly; the web tool converts one URL at a time. For full-site crawls, roll your own with a 30-line Python script: parse the `sitemap.xml`, fetch each URL, run it through Trafilatura (or Playwright + html2text for JS-heavy docs sites), write to `.md` files mirroring the URL structure. A 200-page docs subdomain processes in a few minutes that way.
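A rough version of that script, assuming the site publishes a standard sitemap.xml and the pages are server-rendered enough for plain Trafilatura (swap in the Playwright sketch above for JS-heavy sites); the sitemap URL and output directory are placeholders:

```python
# crawl_docs.py - convert every page listed in a docs sitemap to local Markdown
# pip install requests trafilatura
import pathlib
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

import requests
import trafilatura

SITEMAP = "https://docs.example.com/sitemap.xml"   # placeholder
OUT_DIR = pathlib.Path("docs/vendor")
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(sitemap_url: str) -> list[str]:
    root = ET.fromstring(requests.get(sitemap_url, timeout=30).content)
    return [loc.text for loc in root.findall(".//sm:loc", NS)]

for url in sitemap_urls(SITEMAP):
    html = requests.get(url, timeout=30).text
    # output_format="markdown" needs a recent Trafilatura release; older
    # versions only offer txt/xml output with include_formatting=True
    md = trafilatura.extract(html, url=url, output_format="markdown")
    if not md:
        continue  # extraction failed (empty shell, redirect page, etc.)
    rel = urlparse(url).path.strip("/") or "index"   # /api/webhooks -> api/webhooks.md
    target = OUT_DIR / f"{rel}.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(md, encoding="utf-8")
    print("wrote", target)
```
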
How do I keep the local copy in sync with the live site?
Re-convert (manually for one page, or via your own scripted crawl for a whole site) and commit. `git diff` on the resulting Markdown becomes a precise changelog of what the vendor edited, far more useful than monitoring their RSS feed, since most vendors don't announce minor doc edits. Wire up your own GitHub Action that runs the crawl script on a schedule and opens a PR with the diff.
Will Cursor and Claude Code index the saved Markdown?
Yes: both index the workspace by default. Drop the converted docs into `docs/` or anywhere in your repo and the AI assistant will surface relevant snippets when you ask about the documented APIs. Far more reliable than pasting URLs into the chat.
What about pages behind a login (private dashboards, internal wikis)?
The MDisBetter web tool fetches the URL anonymously, so authenticated pages won't work directly. The right path for private docs is a self-hosted script using `requests` with your session cookie or bearer token, then a Markdown converter (html2text, markdownify) on the returned HTML. Don't do this on sites where you don't have permission; the OSS path doesn't bypass auth, it just uses the credentials you already have.
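A minimal sketch of that self-hosted path, assuming a bearer token you already hold in an environment variable; the wiki URL, token variable, and output path are placeholders:

```python
# fetch_private_doc.py - convert an internal page you are already authorised to read
# pip install requests markdownify
import os
import pathlib

import requests
from markdownify import markdownify

URL = "https://wiki.internal.example.com/runbooks/deploy"   # placeholder
# Reuse credentials you already hold; this does not bypass any access control.
headers = {"Authorization": f"Bearer {os.environ['WIKI_TOKEN']}"}

resp = requests.get(URL, headers=headers, timeout=30)
resp.raise_for_status()

md = markdownify(resp.text, heading_style="ATX")
out = pathlib.Path("docs/vendor/runbooks-deploy.md")
out.parent.mkdir(parents=True, exist_ok=True)
out.write_text(md, encoding="utf-8")
```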

Try the tool free →