How do I convert a PDF to Markdown for free?

Upload your PDF to mdisbetter.com, click Convert, and get clean structured Markdown in seconds. No signup, no installation — it works directly in your browser.

Why is Markdown better than PDF for AI?

Feeding a document to an AI model such as ChatGPT or Claude as Markdown instead of PDF can cut token usage by up to 95%. A PDF carries layout metadata, fonts, and binary data that waste tokens; Markdown keeps only the content structure the model actually needs.

What file types can MDisBetter convert to Markdown?

MDisBetter converts PDF, Word (.docx), plain text, YouTube videos (transcript extraction), audio files (MP3, WAV, M4A, OGG, FLAC, WEBM), and any web page URL to clean Markdown.

Is MDisBetter free to use?

Yes, MDisBetter is completely free. You get 10 conversions per day with no signup required. All tools work directly in your browser.

How do I extract a YouTube transcript as Markdown?

Paste the YouTube video URL into the YouTube to Markdown tool on mdisbetter.com and click Convert. The tool extracts the transcript and structures it as clean, formatted Markdown with headings and timestamps.

Can this extract articles from sites with soft paywalls?

Yes — when the article body is in the HTML (just visually hidden by an overlay), we extract it. We do not break hard paywalls that gate content server-side; if the article isn't in the HTML you receive, no scraper can recover it. The distinction matters legally and technically.
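To make the soft/hard distinction concrete, here is a minimal sketch using only the Python standard library. It is not MDisBetter's implementation; the function name `extract_paragraphs` and the sample markup are assumptions made for illustration. It collects the text of every `<p>` element in the HTML it is given and ignores CSS visibility, so it can only return paragraphs the server actually sent.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collects the text of every <p> element, ignoring CSS visibility."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None  # None while outside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None


def extract_paragraphs(html):
    """Return the paragraph texts present in the received HTML."""
    parser = ParagraphCollector()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


# Hypothetical soft paywall: the body is in the HTML, hidden by styling.
SOFT = (
    '<article><p>Lead paragraph.</p>'
    '<div class="gate" style="display:none"><p>Hidden but present.</p></div>'
    '</article><div class="overlay">Subscribe to continue</div>'
)
# Hypothetical hard paywall: the server never sent the rest of the article.
HARD = (
    '<article><p>Lead paragraph.</p></article>'
    '<div class="overlay">Subscribe to continue</div>'
)

print(extract_paragraphs(SOFT))
print(extract_paragraphs(HARD))
```

With the soft-paywall sample both paragraphs come back; with the hard-paywall sample only the lead does, because the rest was never in the response.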

Are bylines and publication dates preserved?

Yes — extracted from schema.org/NewsArticle structured data when available, falling back to visible byline parsing. Both go into YAML front matter, and the published date is normalised to ISO 8601 for downstream sorting.
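The metadata step can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual code: it assumes the JSON-LD has already been pulled out of the page's `<script type="application/ld+json">` tag as a single object, and the front matter keys (`title`, `author`, `date`) and the `fallback_byline` parameter are hypothetical names.

```python
import json
from datetime import datetime, timezone


def normalise_date(raw):
    """Return an ISO 8601 timestamp in UTC, or None if parsing fails."""
    try:
        # Older Python versions reject a trailing "Z", so spell out the offset.
        parsed = datetime.fromisoformat(raw.strip().replace("Z", "+00:00"))
    except ValueError:
        return None
    if parsed.tzinfo is None:
        # Assumption of this sketch: a timestamp without an offset is UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).isoformat()


def author_names(author):
    """schema.org allows a string, an object, or a list of either."""
    items = author if isinstance(author, list) else [author]
    names = []
    for item in items:
        name = item.get("name") if isinstance(item, dict) else item
        if name:
            names.append(str(name))
    return names


def front_matter(json_ld, fallback_byline=None):
    """Build YAML front matter from one NewsArticle JSON-LD object."""
    data = json.loads(json_ld)
    names = author_names(data.get("author"))
    if not names and fallback_byline:
        names = [fallback_byline]  # e.g. parsed from the visible byline
    published = normalise_date(data.get("datePublished") or "")
    lines = ["---"]
    if data.get("headline"):
        # A JSON string literal is also a valid double-quoted YAML scalar.
        lines.append("title: " + json.dumps(data["headline"], ensure_ascii=False))
    if names:
        lines.append("author: " + json.dumps(", ".join(names), ensure_ascii=False))
    if published:
        lines.append("date: " + published)
    lines.append("---")
    return "\n".join(lines)


SAMPLE = json.dumps({
    "@type": "NewsArticle",
    "headline": "Port strike ends",
    "author": {"@type": "Person", "name": "A. Reporter"},
    "datePublished": "2024-03-05T09:30:00+02:00",
})
print(front_matter(SAMPLE))
```

Converting every date to UTC is one reasonable way to make timestamps from different publishers sort consistently; keeping the original offset would be equally valid ISO 8601.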

How are mid-article video and ad blocks handled?

Video players, ad iframes, and "Story continues below" injection blocks are detected by their wrapper classes and stripped. The surrounding paragraphs are stitched back together so the prose flows continuously, the way it reads in a print edition.
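As a rough illustration of wrapper-class stripping, the sketch below skips everything inside an element whose class matches a blocklist and keeps the remaining paragraphs in order. The class names in `BLOCKED_CLASSES` and the function name `strip_injected_blocks` are hypothetical, and the depth counting assumes well-formed markup with explicit closing tags.

```python
from html.parser import HTMLParser

# Hypothetical wrapper classes; each real site uses its own names.
BLOCKED_CLASSES = {"ad-slot", "video-player", "story-continues"}
# Void elements have no closing tag, so they must not affect depth counting.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ProseExtractor(HTMLParser):
    """Collects <p> text while skipping blocked wrapper subtrees."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._skip_depth = 0  # > 0 while inside a blocked wrapper
        self._buffer = None   # None while outside a <p>

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._skip_depth:
            self._skip_depth += 1
            return
        classes = set((dict(attrs).get("class") or "").split())
        if classes & BLOCKED_CLASSES:
            self._skip_depth = 1
        elif tag == "p":
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._skip_depth:
            self._skip_depth -= 1
            return
        if tag == "p" and self._buffer is not None:
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None

    def handle_data(self, data):
        if not self._skip_depth and self._buffer is not None:
            self._buffer.append(data)


def strip_injected_blocks(html):
    """Return the article prose as Markdown paragraphs, wrappers removed."""
    parser = ProseExtractor()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.paragraphs)


PAGE = (
    '<article><p>First half.</p>'
    '<div class="ad-slot"><p>Story continues below</p>'
    '<iframe src="x"></iframe></div>'
    '<p>Second half.</p></article>'
)
print(strip_injected_blocks(PAGE))
```

Because the paragraphs on either side of the removed wrapper end up adjacent in the output, the prose reads continuously without any extra stitching step.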

Does this work for live-blog and breaking-news pages?

Live blogs are detected by their characteristic structure (timestamped entries, reverse-chronological ordering). We emit each entry as a Markdown section with its timestamp as the heading, and reverse the order to read oldest-first if requested via parameter.
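The emit-and-reorder step might look something like this. It is a sketch only: the `oldest_first` parameter name, the `## ` heading level, and the input shape (timestamp and text pairs in page order) are assumptions, and it sorts by parsed timestamp rather than simply reversing the list, which is a choice of this example.

```python
from datetime import datetime


def live_blog_to_markdown(entries, oldest_first=False):
    """Render (ISO 8601 timestamp, text) pairs as Markdown sections.

    Entries are kept in page order unless oldest_first is set, in which
    case they are sorted ascending by their parsed timestamp.
    """
    ordered = list(entries)
    if oldest_first:
        ordered.sort(key=lambda entry: datetime.fromisoformat(entry[0]))
    sections = [f"## {stamp}\n\n{text}" for stamp, text in ordered]
    return "\n\n".join(sections)


# Hypothetical entries, newest first, as a live blog usually lists them.
ENTRIES = [
    ("2024-03-05T12:00:00", "Vote passes."),
    ("2024-03-05T11:00:00", "Debate opens."),
]
print(live_blog_to_markdown(ENTRIES, oldest_first=True))
```

Sorting on the timestamp still gives the right order when a page pins one entry out of sequence, at the cost of requiring every timestamp to parse.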

Can I extract from non-English news sites (Le Monde, Spiegel, NHK)?

Yes. Extraction is language-agnostic: the Readability-style algorithm works on any language, and the structured metadata formats (Open Graph, JSON-LD) are language-neutral standards. CJK and right-to-left scripts (Arabic, Hebrew) are preserved correctly in the Markdown output.

News Article to Markdown — Strip Ads, Soft Paywalls

News-specific extraction challenges

News articles share the blog-post extraction baseline but add four complications of their own. First, multi-paragraph ads injected mid-article ("Story continues below") look like content to a naïve extractor. Second, image galleries fragment the prose into one-paragraph slides. Third, "live blog" formats reverse the chronology. Fourth, soft paywalls leave the paragraphs in the HTML but hide them behind a visual overlay; our extractor reads them because they are public HTML, so nothing is being bypassed.

Byline, dateline, and source attribution

News conventions matter for citation. We extract the byline (author or wire service), the dateline (location and date), and the publication name from page metadata, and emit them in YAML front matter. Quotes within the article keep their attribution structure. The result is Markdown you can drop into a citation manager or feed to an LLM that needs to attribute sources correctly.

Tool	Cost	Unit
Text to MD, EPUB to MD, MD to PDF, MD Cleaner, Merger, Chunker, Token Counter, Context Builder	Free	—
Word to MD	0.5 credit	per page
Excel to MD	0.5 credit	per conversion
Single URL Scrape	0.5 credit	per call
Site Crawl	1 credit	per page
Translate	1 credit	per 10 000 chars (min 1, free re-translation on cache hit)
Prompt Optimizer	1 credit	per call
System Prompt Generator	1 credit	per call
Audio to MD	2 credits	per minute
Video to MD	2 credits	per minute
YouTube to MD	2 credits	per minute
Image OCR	4 credits	per image (0 on cache hit)
PDF to MD	4 credits	per page
PPTX to MD	4 credits	per slide
