· 8 min read · MDisBetter

URL to Markdown for Obsidian: Better Web Clipping

Browser-extension web clippers are convenient but inconsistent. They sometimes capture the wrong region of the page, and they often produce Markdown weighed down by inline HTML wrappers, base64 image data URIs, and "Powered by" footers. There is a cleaner way that takes about 10 seconds, requires no plugin, and produces vault-ready Markdown. Wrap a small script around it and you also get wikilinks, tags, and frontmatter. Here is the workflow.

The problem with most web clippers

Three recurring annoyances:

  1. Inline HTML survives. Many clippers preserve the original DOM structure including divs, spans, and inline styles. Open a clipped page two months later and the source is half Markdown, half HTML. Hard to edit, hard to grep, ugly in source mode.
  2. Image bloat. By default some clippers embed images as base64 data URIs. A single article with five photos balloons to 4-8 MB. Multiply by 50 clips per month and your vault doubles in size for no functional gain.
  3. No structural enrichment. The clipper doesn't add tags, doesn't generate a frontmatter block from page metadata, doesn't link to your daily note. Every clip lands as an isolated file with no graph integration.

You can configure your way around some of this, but the configuration UI is buried and the defaults are wrong for most users.
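
To see how much of this affects clips you already have, a rough audit can count HTML tags and base64 data URIs in a note. This is a minimal sketch, not part of any clipper: `audit_clip` and both regexes are assumptions made for illustration, and the tag pattern is a heuristic that will also count HTML quoted inside code examples.

```python
import re

# Heuristic patterns (assumptions for this sketch, not a full HTML parser)
HTML_TAG = re.compile(
    r'</?(?:div|span|p|a|img|table|tr|td|font|section|article)\b[^>]*>',
    re.IGNORECASE,
)
DATA_URI = re.compile(r'data:image/[a-zA-Z0-9.+-]+;base64,[A-Za-z0-9+/=]+')

def audit_clip(md):
    """Return rough counts of inline HTML tags and base64 image data in a Markdown string."""
    data_uris = DATA_URI.findall(md)
    return {
        'html_tags': len(HTML_TAG.findall(md)),
        'data_uris': len(data_uris),
        'data_uri_chars': sum(len(u) for u in data_uris),
        'has_frontmatter': md.startswith('---\n'),
    }
```

Run it over `Path(note).read_text(encoding='utf-8')` for each file in a clips folder; notes with a high `html_tags` or `data_uri_chars` count are the ones worth re-clipping.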

The mdisbetter approach in one sentence

Convert the URL to clean Markdown with our URL converter, then save the file directly to your vault. The output is GitHub-Flavored Markdown, with no inline HTML and no base64, and Obsidian indexes the file the next time the app has focus.

Three workflows, ranked by frequency of use

Workflow 1: Quick clip (browser → vault)

For ad-hoc clipping (a few articles a day):

  1. Copy the URL of the article you're reading
  2. Open the URL converter in a pinned tab
  3. Paste, convert, click Download (saves .md with the page title as filename)
  4. Move the file to your vault (or set the download folder to your vault directly)

10 seconds end-to-end. The output is clean: H1 from page title, H2 for sections, GFM tables, fenced code blocks with language hints. No HTML.

Workflow 2: Frontmatter-enriched clip (OSS scripted)

This workflow suits a knowledge base where every clip needs structured metadata. MDisBetter doesn't currently expose a programmatic URL-to-Markdown API, so the right path for scripted enrichment is to extract Markdown locally with Trafilatura (open source; Apache 2.0 since v1.8.0, GPLv3+ before that), then write the file with the frontmatter you want:

# pip install trafilatura
import re
from datetime import date
from pathlib import Path
import trafilatura

VAULT = Path.home() / 'Obsidian' / 'MyVault' / 'Clips'

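def clip(url, tags=None):
    """Fetch url, extract Markdown, and save it to the vault with YAML frontmatter."""
    downloaded = trafilatura.fetch_url(url)  # None if the download fails
    if downloaded is None:
        print(f'FETCH_FAIL {url}')
        return
    md = trafilatura.extract(
        downloaded,
        output_format='markdown',
        include_links=True,
        include_tables=True,
    )
    if not md:
        print(f'EXTRACT_FAIL {url}')
        return
    meta = trafilatura.extract_metadata(downloaded)
    title = (meta.title if meta else None) or 'Untitled'
    # Drop characters that are unsafe in filenames; fall back if nothing is left
    safe = re.sub(r'[^\w\s-]', '', title)[:80].strip() or 'Untitled'
    VAULT.mkdir(parents=True, exist_ok=True)
    out = VAULT / f'{safe}.md'

    # Escape backslashes and double quotes so the title stays valid YAML
    yaml_title = title.replace('\\', '\\\\').replace('"', '\\"')
    all_tags = ['clipped', *(tags or [])]
    fm = (
        '---\n'
        f'title: "{yaml_title}"\n'
        f'source: {url}\n'
        f'clipped: {date.today().isoformat()}\n'
        f'tags: [{", ".join(all_tags)}]\n'
        '---\n\n'
    )
    out.write_text(fm + md, encoding='utf-8')
    print(f'Saved {out.name}')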

clip('https://en.wikipedia.org/wiki/Markdown', tags=['reference', 'markdown'])

Now every clip lands in your vault with proper YAML frontmatter that Obsidian's Properties view, Dataview, and graph all consume. Tags route the clip to the right corner of your knowledge base automatically.

Workflow 3: Auto-link to daily note

The killer Obsidian feature. Every time you clip an article, append a wikilink to today's daily note. The article enters your daily timeline naturally:

def append_daily_link(out, url):
    """Append a wikilink to the saved clip (`out`, a Path) in today's daily note.

    Call this at the end of clip(), right after out.write_text(...).
    """
    out_name = out.stem  # without .md extension

    daily = VAULT / 'Daily' / f'{date.today().isoformat()}.md'
    daily.parent.mkdir(parents=True, exist_ok=True)
    if not daily.exists():
        daily.write_text(f'# {date.today():%A, %B %d, %Y}\n\n## Clipped\n', encoding='utf-8')

    with open(daily, 'a', encoding='utf-8') as f:
        f.write(f'- [[{out_name}]] from {url}\n')

Open Obsidian. Today's daily note now lists every clip from today as wikilinks. Click any one to jump into the full article. Open graph view to see today's reading clustered around your daily node.

Bonus: Bookmarklet that opens the web converter

Drag this to your bookmarks bar — clicking it on any article opens the MDisBetter URL converter pre-filled with the current page's URL:

javascript:(function(){
  const u = encodeURIComponent(location.href);
  window.open('https://mdisbetter.com/convert/url-to-markdown?url=' + u, '_blank');
})();

One click takes you to the converter with the URL ready. Click Convert, click Download, and drop the .md into your vault. That is two more clicks plus a drag, about as fast as the official Web Clipper but with cleaner output.

For a script-driven flow that goes straight to disk without opening a tab, use the Workflow 2 Python script above (Trafilatura).

Wikilink generation

Manually adding wikilinks ([[concept]]) is what makes Obsidian Obsidian — the graph emerges from those links. You can post-process clipped Markdown to auto-link known concepts:

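def add_wikilinks(md, vault_path):
    """Wrap mentions of existing vault notes in [[double brackets]]."""
    # Skip very short names: they match too many ordinary words
    notes = {f.stem for f in vault_path.glob('**/*.md') if len(f.stem) >= 4}
    if not notes:
        return md
    # One alternation, longest names first, applied in a single pass so a
    # shorter name is never wrapped again inside a link that was just created
    names = '|'.join(re.escape(n) for n in sorted(notes, key=len, reverse=True))
    pattern = re.compile(
        rf'(\[\[.*?\]\]|\[[^\]]*\]\([^)]*\))|(?<!\w)({names})(?!\w)',
        re.IGNORECASE,
    )
    # Group 1 is an existing wikilink or Markdown link: leave it untouched
    return pattern.sub(lambda m: m.group(1) or f'[[{m.group(2)}]]', md)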

Run this on every clip and you get automatic graph integration: any mention of an existing note becomes a link, building your network as you clip.

Image handling

By default, the URL converter outputs images as Markdown ![alt](https://...) tags pointing at the original URL. This keeps your vault tiny but creates an external dependency: if the source page deletes the image, your clip breaks.

Two alternatives, depending on your priorities:

Option A: Download images locally (post-processing)

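import hashlib, re, requests
from pathlib import Path

def localize_images(md, slug, attachments_dir):
    attachments_dir.mkdir(parents=True, exist_ok=True)
    def repl(m):
        url = m.group(2)
        if not url.startswith(('http://', 'https://')):
            return m.group(0)  # already local or a data URI
        ext = Path(url.split('?')[0]).suffix or '.jpg'
        # hashlib, not hash(): built-in str hashes change between Python runs,
        # so the same URL would get a new filename every time
        digest = hashlib.sha1(url.encode('utf-8')).hexdigest()[:12]
        local = attachments_dir / f'{slug}_{digest}{ext}'
        if not local.exists():
            try:
                resp = requests.get(url, timeout=15)
                resp.raise_for_status()  # don't save a 404 page as an image
                local.write_bytes(resp.content)
            except (requests.RequestException, OSError):
                return m.group(0)  # keep the remote link on failure
        # Relative link: assumes attachments_dir sits next to the note
        return f'![{m.group(1)}]({attachments_dir.name}/{local.name})'
    return re.sub(r'!\[([^\]]*)\]\(([^)\s]+)[^)]*\)', repl, md)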

Option B: Inline as base64 (post-processing)

For content you absolutely need to survive the source going down, encode each fetched image to base64 and rewrite the Markdown to ![alt](data:image/jpeg;base64,...). Self-contained, but the file size grows a lot — use sparingly.
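
Option B can be sketched along the same lines as Option A. This is a minimal illustration, not the converter's behavior: `inline_images` and `to_data_uri` are hypothetical names, and `fetch` is any callable you supply that returns the image bytes for a URL (for example `lambda u: requests.get(u, timeout=15).content`). The MIME type is guessed from the file extension and falls back to image/jpeg, which is an assumption rather than a check of the actual bytes.

```python
import base64
import mimetypes
import re

def to_data_uri(data, url):
    """Encode raw image bytes as a data: URI, guessing the MIME type from the URL."""
    mime = mimetypes.guess_type(url.split('?')[0])[0] or 'image/jpeg'
    return f'data:{mime};base64,{base64.b64encode(data).decode("ascii")}'

def inline_images(md, fetch):
    """Rewrite remote ![alt](http...) images as base64 data URIs.

    fetch(url) must return bytes; images that fail to download are left as-is.
    """
    def repl(m):
        alt, url = m.group(1), m.group(2)
        try:
            data = fetch(url)
        except Exception:
            return m.group(0)  # keep the remote reference on failure
        return f'![{alt}]({to_data_uri(data, url)})'
    # Only http(s) image URLs are rewritten; local paths are untouched
    return re.sub(r'!\[([^\]]*)\]\((https?://[^)\s]+)\)', repl, md)
```

Base64 adds roughly a third to each image's size, so this is best reserved for the handful of clips that must stay readable offline.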

Comparison: typical browser-extension clipper vs this approach

| Feature | Browser-extension clipper | mdisbetter |
| --- | --- | --- |
| Setup | Install plugin + extension | None for the web tool, ~5 min for OSS scripts |
| Output cleanliness | Often mixed Markdown + HTML | Pure GFM Markdown |
| Image handling | Sometimes base64 (heavy) by default | URL refs (light) by default |
| Frontmatter | Manual | Scriptable (Trafilatura) |
| Daily note integration | Manual | Scriptable |
| Wikilink injection | Manual | Scriptable |
| Custom selector | Limited | Yes via BeautifulSoup |

Working with PDFs in Obsidian?

For PDF-to-vault workflows (research papers, reports, legal documents), see PDF to Markdown for Obsidian and the corresponding vault setup guide. The same wikilink and daily-note patterns apply — the only difference is the source format.

Folder structure inside the vault

How you organize clips inside the vault matters more than how you get them there. Three patterns that scale:

Inbox + curated

All clips land in Clips/Inbox/. Once a week, you triage: move keepers to Clips/Reference/ with proper frontmatter and tags, archive the rest to Clips/Archive/, delete obvious duplicates. Mimics email triage. Works well for high-volume clippers.

By topic

Clips land in topic folders directly: Clips/AI/, Clips/Programming/, Clips/Business/. Faster to find later, requires deciding the topic at clip time. Works well when your topics are stable.

By date

Clips land in date folders: Clips/2026/05/. Mirrors how you took the action — "the article I clipped last May" maps directly to a folder. Combined with the daily-note workflow this becomes very natural.

Pick one. Don't mix two. Mixing creates ambiguity that you'll later resent.
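
If you script your clipping, the chosen pattern can live in one small function so every clip is routed the same way. This is a sketch under assumptions: `clip_dir` and the pattern names 'inbox', 'topic', and 'date' are made up for illustration, and the folder names mirror the examples above.

```python
from datetime import date
from pathlib import Path

def clip_dir(clips_root, pattern, topic=None, today=None):
    """Return the folder a new clip should land in for one of the three patterns."""
    root = Path(clips_root)
    if pattern == 'inbox':
        return root / 'Inbox'
    if pattern == 'topic':
        if not topic:
            raise ValueError('the topic pattern needs a topic at clip time')
        return root / topic
    if pattern == 'date':
        d = today or date.today()
        return root / f'{d:%Y}' / f'{d:%m}'
    raise ValueError(f'unknown pattern: {pattern!r}')
```

Fixing the pattern in a single constant at the top of your script is one way to enforce the "pick one" rule.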

Excalidraw and Canvas integration

Obsidian's Canvas plugin lets you arrange notes spatially on an infinite whiteboard. Clipped articles become first-class objects in canvases — drop a clip on a canvas, link it to your own notes, draw arrows showing how the ideas connect. Especially powerful for synthesis work (literature reviews, market research, idea development) where the spatial layout helps you see connections.

The clipper integration: the Markdown files we produce work in Canvas without any modification. Drop the file into Canvas, resize, position. The card is labeled with the note's filename, and the rest of the clip is browsable inside the card.

Search and graph benefits

Obsidian's search and graph features depend on the Markdown being clean. With clipper output that mixes inline HTML, full-text search hits inside <span> wrappers and <div> attributes that are invisible in reading mode but very visible to the search index — leading to false positives and confusing search rankings. With pure Markdown clips, search behaves predictably: every hit is a hit on text the user can see. The graph is similarly cleaner: wikilinks are the only edge type, no spurious edges from clipper-injected metadata. Two months in, the difference compounds — your vault behaves like one coherent knowledge base instead of two parallel ones (your notes vs. your clipper output).

Backup and portability

One advantage of the Markdown-first approach over plugin-driven clippers: your clips are pure text in your filesystem. Backup is whatever your filesystem backup is — Time Machine, Backblaze, rsync to a NAS. Migration to another tool (Logseq, plain VS Code, Bear, iA Writer, anything that reads Markdown) is a folder copy.

Most browser-extension clippers' outputs are technically also Markdown, but the inline-HTML wrappers and base64 image bloat make them harder to migrate cleanly. Plain GFM Markdown round-trips through any tool that handles Markdown at all.

What the bookmarklet won't do

Two things to know about the bookmarklet:

  1. It opens the converter in a new tab pre-filled with the URL, so you still click Convert and Download. To go fully hands-off, set up a Folder Action or Hazel rule that moves new .md downloads into your vault automatically. On Mac, Hazel does this in 30 seconds; on Windows, a scheduled script or a folder-watching utility can do the same job.
  2. The web converter fetches anonymously, with no access to your browser cookies. For login-required content, use the OSS Workflow 2 script with requests (or httpx) plus your session cookie or bearer token, then run Trafilatura on the response.
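
If you have no folder-automation tool, the "move new downloads into the vault" step can also be a short script run on a schedule. This is a minimal sketch: `move_new_clips` is a hypothetical helper, and it moves every .md file it finds in the downloads folder, so it assumes nothing else there is Markdown you want to keep in place.

```python
import shutil
from pathlib import Path

def move_new_clips(downloads, inbox):
    """Move every .md file from downloads into inbox and return the new paths."""
    downloads, inbox = Path(downloads), Path(inbox)
    inbox.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in sorted(downloads.glob('*.md')):
        dest = inbox / src.name
        n = 1
        while dest.exists():  # never overwrite an existing note
            dest = inbox / f'{src.stem} ({n}){src.suffix}'
            n += 1
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

A call such as `move_new_clips(Path.home() / 'Downloads', Path.home() / 'Obsidian' / 'MyVault' / 'Clips')` could be scheduled with cron, launchd, or Windows Task Scheduler.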

Recommendation

For one-off clipping: bookmarklet plus the web UI. For systematic knowledge-management workflows: the Trafilatura-based Python script with frontmatter and daily-note appending. Either way, you get cleaner output than most browser-extension clippers produce, with deeper vault integration and less friction. See also URL to Markdown for Notion for the same approach in a Notion workflow.

Frequently asked questions

Does this work with Obsidian Sync (encrypted multi-device sync)?
Yes — Obsidian Sync operates on the file level, so any Markdown you save into the vault propagates to your other devices automatically. The conversion happens once on the desktop where you saved the file; the synced devices just receive the resulting Markdown.
Can I trigger the conversion from inside Obsidian itself?
Yes, via Obsidian's Templater or Shell Commands community plugins, which let you bind a hotkey to run a script. Bind it to call the Trafilatura clipper script with the URL on your clipboard. End result: hotkey in Obsidian, clip lands in your vault, daily note updated, usually within a few seconds, since the script still has to download the page.
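
For a hotkey to call the clipper, the script needs a command-line entry point. This is a minimal sketch, assuming the clipper function takes `(url, tags=...)` like `clip()` in Workflow 2; `main` and the `--tags` option are hypothetical names, and how the plugin passes the clipboard contents as the URL argument depends on the plugin, so check its documentation.

```python
import argparse

def main(argv, clip_fn):
    """Parse a URL and optional tags from argv, then hand them to clip_fn."""
    parser = argparse.ArgumentParser(description='Clip a URL into the vault')
    parser.add_argument('url')
    parser.add_argument('--tags', default='', help='comma-separated tags')
    args = parser.parse_args(argv)
    tags = [t.strip() for t in args.tags.split(',') if t.strip()]
    return clip_fn(args.url, tags=tags)

# In the real script, assuming clip() from Workflow 2 is defined in the same file:
# if __name__ == '__main__':
#     import sys
#     main(sys.argv[1:], clip)
```

The hotkey command then becomes something like `python clipper.py <url> --tags reference,markdown`, where `clipper.py` is whatever you named the script.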
What happens to broken links in the source page?
Broken links in the source HTML are preserved as broken links in the Markdown. The converter doesn't validate or rewrite them. If a clip has a broken link two years later, it was already broken at clip time. For long-term archival, post-process to download images locally so the clip stays self-contained even if the source disappears.
Does MDisBetter offer a programmatic API for Obsidian-style clipping?
Not today. The web tool at /convert/url-to-markdown is the supported surface for one-off clipping. For scripted, recurring, or auth-walled clipping (the Workflow 2/3 cases), Trafilatura plus a few lines of Python is the right path: open source (Apache 2.0 since v1.8.0), free, and fully under your control.