Import Web Pages into Notion as Markdown (Guide)
Notion has a Web Clipper. It is also one of the most-complained-about features in the product. It mangles code blocks, flattens nested lists, drops captions, breaks tables, and reformats articles into a shape that is somehow worse than both the source page and what you'd manually type in. There is a much cleaner path: convert the URL to Markdown first, then use Notion's native Markdown import. The result is full Notion blocks (proper headings, tables, callouts, code blocks with language tags) that are 100% editable.
Why the Web Clipper disappoints
Three concrete failure modes that drive users to look for alternatives:
- Code blocks lose their language tag. A Python snippet on a tech blog comes in as a generic preformatted block: no syntax highlighting, no language metadata preserved.
- Tables become bullet lists. Anything table-shaped — comparison tables, pricing tables, API parameter tables — collapses into nested bullets that bear no resemblance to the original layout.
- Nested lists flatten. Three-level nested lists (common in tutorial articles) come in as a single flat level. The structure that made the article readable is gone.
Plus the clipped page lives in a quasi-readonly state — you can edit it, but the structure resists modification because it's full of clipper-injected wrapper blocks.
The Markdown-first approach
Notion's native Markdown import (under Import → Markdown & CSV) consumes .md files and produces real Notion blocks. Heading 1 becomes a Heading 1 block. A code fence becomes a Code block with the language preserved. A GFM table becomes a Database (or, with one option flipped, a Table block). A nested bulleted list keeps its nesting.
Combine this with our URL converter and you get a 4-step workflow that produces dramatically cleaner Notion pages than the Web Clipper ever did.
The 4-step workflow
Step 1: Convert the URL
Open the URL converter, paste the source URL, click convert, download the .md file. 10 seconds.
Try this on a tutorial-style page like https://docs.python.org/3/tutorial/datastructures.html. The output Markdown preserves Python's three-level heading hierarchy, all code samples with their ```python language hints, and all the inline notes.
Step 2: Open Notion's import dialog
In Notion: top-left ... menu → Settings & Members → Import → Markdown & CSV. Or, from any page: type /import and select Markdown.
Step 3: Drop the file
Drop the .md file into the import dialog. Notion processes it in 1-3 seconds.
Step 4: Verify and re-parent
The imported page lands in your workspace root. Drag it under whatever parent page or database you want. Open it; you'll see proper Notion blocks: H1/H2/H3, code blocks with language tags, GFM tables as Notion Tables, callouts where the source had them, etc.
What survives the round-trip
| Element | URL → Markdown → Notion | Web Clipper |
|---|---|---|
| Headings (H1-H6) | Preserved as proper H1-H3 (Notion caps at 3) | Often flattened to H1 |
| Code blocks | Code block + language preserved | Generic code block, no language |
| Tables | Notion table | Bullet list |
| Inline code | Inline code | Inline code |
| Bold/italic | Preserved | Preserved |
| Nested lists | Up to 4 levels deep | Often flattened |
| Images | Image block (URL ref) | Image block (re-uploaded) |
| Internal links | External URL preserved | External URL preserved |
| Block quotes | Quote block | Sometimes preserved |
Programmatic import via Notion API + Trafilatura
If you're importing tens or hundreds of URLs, you'll want to script it. MDisBetter doesn't currently expose a programmatic URL-to-Markdown API, so the right path for automation is to extract Markdown locally with Trafilatura (MIT-licensed OSS), then push to Notion via Notion's official API. You'll need a Notion integration token and a destination page or database ID.
```python
# pip install trafilatura notion-client
import os

import trafilatura
from notion_client import Client

# Markdown -> Notion blocks helper: use a converter of your choice,
# e.g. martian or notion-md-converter
from notion_md_converter import md_to_blocks

NOTION_TOKEN = os.environ['NOTION_TOKEN']
PARENT_ID = 'your-notion-page-or-database-id'

notion = Client(auth=NOTION_TOKEN)

def url_to_notion(url):
    # Step 1: get Markdown
    downloaded = trafilatura.fetch_url(url)
    md = trafilatura.extract(
        downloaded,
        output_format='markdown',
        include_links=True,
        include_tables=True,
    )
    if not md:
        print(f'EXTRACT_FAIL {url}')
        return

    meta = trafilatura.extract_metadata(downloaded)
    title = (meta.title if meta else None) or url

    # Step 2: convert Markdown to Notion blocks
    blocks = md_to_blocks(md)

    # Step 3: create the page
    notion.pages.create(
        parent={'page_id': PARENT_ID},
        properties={'title': [{'text': {'content': title}}]},
        children=blocks,
    )
    print(f'Imported {title}')

url_to_notion('https://docs.python.org/3/tutorial/datastructures.html')
```

Notion's API has a 100-block-per-create limit. For very long pages, split blocks into batches and use blocks.children.append for the rest.
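The batching can be sketched with a small helper — `chunk` is our own name, not part of notion-client:

```python
def chunk(blocks, size=100):
    """Split a list of Notion blocks into API-sized batches."""
    return [blocks[i:i + size] for i in range(0, len(blocks), size)]

# Sketch of the two-phase import: create the page with the first batch,
# then append the remaining batches to it.
# batches = chunk(blocks)
# page = notion.pages.create(parent=..., properties=..., children=batches[0])
# for batch in batches[1:]:
#     notion.blocks.children.append(block_id=page['id'], children=batch)
```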
Bulk import patterns
For a one-shot bulk import (say, your reading list of 200 URLs):
```python
from pathlib import Path

urls = Path('reading-list.txt').read_text().splitlines()
for i, url in enumerate(urls, 1):
    url = url.strip()
    if not url:
        continue
    try:
        url_to_notion(url)
        print(f'[{i}/{len(urls)}] OK')
    except Exception as e:
        print(f'[{i}/{len(urls)}] FAIL {url}: {e}')
```

For continuous ingestion (e.g., automatically importing every new bookmark from a source like Pocket, Raindrop, or Pinboard), poll the source's API every hour, diff against what you've already imported, and run the import for the new ones. Five minutes of glue code, zero manual work after that.
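The "diff against what you've already imported" step is the only state the poller needs. A minimal sketch, assuming a plain text file as the bookkeeping store (`imported-urls.txt` is our own invention):

```python
from pathlib import Path

SEEN_FILE = Path('imported-urls.txt')  # our own bookkeeping file

def new_urls(candidates):
    """Return the candidate URLs that have not been imported yet."""
    seen = set(SEEN_FILE.read_text().splitlines()) if SEEN_FILE.exists() else set()
    return [u for u in candidates if u and u not in seen]

def mark_imported(url):
    """Record a successfully imported URL so the next poll skips it."""
    with SEEN_FILE.open('a') as f:
        f.write(url + '\n')
```

Each polling run then reduces to `for url in new_urls(fetch_bookmarks()): url_to_notion(url); mark_imported(url)`.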
Notion database vs page
Two destinations to consider:
Page (parent_id is a page)
Each clip becomes a child page. Good for a simple inbox or reading log.
Database (parent_id is a database)
Each clip becomes a database row with the page content as the body. Add columns for Source URL, Tags, Read Status, Date Clipped. Filter and sort like any Notion database.
```python
from datetime import date

DATABASE_ID = 'your-notion-database-id'

notion.pages.create(
    parent={'database_id': DATABASE_ID},
    properties={
        'Name': {'title': [{'text': {'content': title}}]},
        'Source URL': {'url': url},
        'Tags': {'multi_select': [{'name': 'clipped'}]},
        'Date': {'date': {'start': date.today().isoformat()}},
        'Status': {'select': {'name': 'Unread'}},
    },
    children=blocks,
)
```

This is the read-it-later workflow people pay $10/month for, built in about 30 lines on top of free Notion plus the OSS converter.
Common gotchas
Notion caps headings at H3
Markdown-to-Notion converters typically downgrade H4-H6 to H3 + bold. Acceptable for most articles; review if you need exact heading levels (rare).
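If you'd rather control that downgrade yourself, a pre-processing pass over the Markdown before import takes a few lines. A sketch — rewriting H4-H6 as bold paragraphs is one common convention, not the only one:

```python
import re

def demote_deep_headings(md):
    """Rewrite H4-H6 headings as bold paragraphs so nothing is silently lost."""
    return re.sub(r'^#{4,6}\s+(.+)$', r'**\1**', md, flags=re.MULTILINE)
```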
Code block language hints
Notion supports a fixed list of languages. Languages outside that list (e.g., "hcl" or some niche DSL) fall back to plain text. Most common languages (Python, JavaScript, TypeScript, Go, Rust, Bash, SQL, JSON, YAML) work as expected.
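If you're building blocks programmatically, it helps to normalize fence hints up front so unsupported languages fall back deliberately instead of erroring at create time. A sketch — the language set and aliases below are partial assumptions, not Notion's official enum:

```python
# Partial set of language identifiers Notion's code block accepts
# (assumption: check the API reference for the full enum).
NOTION_LANGS = {
    'python', 'javascript', 'typescript', 'go', 'rust', 'bash',
    'sql', 'json', 'yaml', 'c', 'c++', 'java', 'plain text',
}
ALIASES = {'js': 'javascript', 'ts': 'typescript', 'sh': 'bash',
           'shell': 'bash', 'yml': 'yaml', 'cpp': 'c++', 'golang': 'go'}

def notion_language(hint):
    """Map a Markdown fence hint to a language Notion accepts, else plain text."""
    lang = ALIASES.get((hint or '').lower(), (hint or '').lower())
    return lang if lang in NOTION_LANGS else 'plain text'
```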
Image hosting
By default the converter outputs image URLs pointing at the source, and the imported image blocks keep referencing those external URLs. If the source URL goes 404 later, the image breaks in Notion too; for long-term archiving, download the images and re-upload them into the Notion page.
Long pages
Notion gets sluggish on pages over ~5000 blocks. For very long source articles, consider splitting at H1 boundaries before importing — one Notion page per top-level section.
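Splitting at H1 boundaries is a small pre-processing step on the Markdown before import. A sketch that tracks code fences so a `#` comment inside a fence isn't mistaken for a heading:

```python
def split_at_h1(md):
    """Split a Markdown document into sections at top-level '# ' headings."""
    sections, current, in_fence = [], [], False
    for line in md.splitlines():
        if line.startswith('```'):
            in_fence = not in_fence  # toggle fence state
        if not in_fence and line.startswith('# ') and current:
            sections.append('\n'.join(current))
            current = []
        current.append(line)
    if current:
        sections.append('\n'.join(current))
    return sections
```

Import each returned section as its own .md file, one Notion page per top-level section.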
Working with PDFs?
For PDF documents going into Notion (research papers, contracts, reports), use the PDF to Markdown converter and follow its Notion import guide. Same import flow, different source format. Many users mix the two: web articles via this guide, PDF reports via the PDF guide, both ending up as proper Notion pages in the same workspace.
Real-world workflows we've seen users build
Read-it-later replacement
One common setup: a Notion database called "Inbox" with columns for URL, Title, Date Added, Status (Unread/Reading/Read), Tags, and Notes. A bookmarklet in the user's browser opens the MDisBetter web tool with the URL pre-filled; the user clicks Convert + Download and drops the resulting .md into Notion's Markdown import. For full automation, swap the bookmarklet for a small webhook (a free Vercel function) that runs the Trafilatura + Notion API import script. Total setup: one afternoon. End result: a free, faster, cleaner Pocket replacement that lives inside Notion alongside the rest of the user's knowledge work.
Competitive intelligence
Marketing teams clip every public article their competitor publishes (their blog feed, press releases, help center articles). Each clip lands in a Notion database tagged by competitor and topic, ready to query for trends. Clean Markdown ensures the imported pages are searchable inside Notion (which the original Web Clipper's output often isn't, because the inline HTML breaks Notion's full-text indexer on those blocks).
Course-prep notes
Teachers and trainers clip articles, blog posts, documentation pages they want to reference in upcoming sessions. Each clip lands in a database keyed to the session it'll support. Frontmatter tags route them to the right session. When the session approaches, filtering the database by session-tag pulls all the prep notes together.
Templates and toggles
Notion's import treats Markdown headings strictly: H1 stays H1, H2 stays H2, H3 stays H3. If you want imported content to live inside an existing template (e.g., a header block + properties + the imported content as a child), import first as a standalone page, then drag the body blocks into the template page. Notion's drag-multi-select works on entire imported documents.
Markdown's > blockquote becomes a Notion Quote block. Triple-backtick code blocks with language hints become Notion Code blocks with the matching language selected. Markdown tables become Notion Tables. Markdown task lists (- [ ] item) become Notion to-do blocks with checkboxes. Most users find this fidelity genuinely surprising the first time they see it — Notion's native Markdown import is dramatically better than the Web Clipper at preserving structure.
Sync vs. snapshot
Imported pages are snapshots of the source URL at import time. They do not auto-update when the source changes. If you need ongoing sync (e.g., always have the latest version of a frequently-updated docs page in Notion), schedule the import script to re-run on a cadence, with logic to delete the old version before importing the new one. Five lines of code. For most users, snapshot is what they want — they're capturing the article as they read it, not subscribing to it.
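The delete-then-reimport logic can be sketched as follows; the `index` dict (URL → page id) is our own bookkeeping, and note that the Notion API archives pages rather than hard-deleting them:

```python
def resync(notion, index, url, parent_id, blocks):
    """Archive the previous snapshot of url (if any) and import a fresh one."""
    old_id = index.get(url)
    if old_id:
        # The Notion API has no hard delete; archiving removes the page from view.
        notion.pages.update(page_id=old_id, archived=True)
    page = notion.pages.create(
        parent={'page_id': parent_id},
        properties={'title': [{'text': {'content': url}}]},
        children=blocks,
    )
    index[url] = page['id']
    return page['id']
```

Run it on a schedule (cron, GitHub Actions) with the extract-and-convert steps from the script above feeding `blocks`.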
Recommendation
For ad-hoc clipping: convert URL → Markdown via the MDisBetter web tool → drag into Notion. For systematic knowledge management: build the Python script around Trafilatura plus the Notion API. Either way, you'll get cleaner pages than the Web Clipper produces and you'll have full control over destination, properties, and tagging. The Web Clipper is convenient; this is correct.