What landing-page extraction looks for
Marketing sites use semantic landmarks loosely (<section> everywhere, often nested), so we lean on heading hierarchy and content density instead. The hero headline becomes # H1; sub-heads become ## H2. Feature grids (typically a list of icon, heading, and paragraph) become a Markdown list with one bullet per feature. Testimonials become blockquotes with attribution. Pricing tables become GFM tables. FAQ accordions are expanded so every answer is visible in the output.
Use cases
Three stand out. Competitive analysis: dump 20 competitor homepages to Markdown, feed them to an LLM, and ask "which positioning angles dominate, what's differentiated, and what's missing?" Copy audits: grep across your own site's converted Markdown for inconsistent product names, broken value propositions, and contradictory claims. RAG over marketing: index the converted pages so customer-facing AI assistants can answer "what does the Pro plan include?" from the same source of truth as the website.
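The copy-audit case can be sketched in a few lines. This is a rough example under stated assumptions: the converted pages are already loaded into a dict of path to Markdown text, `find_name_variants` is a hypothetical helper, and "Acme Cloud" is an invented product name. It only catches spacing, hyphenation, and capitalization variants of one name; contradictory claims need a human or an LLM pass.

```python
import re
from collections import Counter


def find_name_variants(pages: dict[str, str], canonical: str) -> dict[str, Counter]:
    """Report spellings of a product name that differ from the canonical form.

    Matches the canonical name case-insensitively, allowing an optional
    space or hyphen between its words, and counts every non-canonical hit.
    """
    words = canonical.split()
    pattern = re.compile(
        r"\b" + r"[\s-]?".join(re.escape(w) for w in words) + r"\b",
        re.IGNORECASE,
    )
    report: dict[str, Counter] = {}
    for path, text in pages.items():
        variants = Counter(
            m.group(0) for m in pattern.finditer(text) if m.group(0) != canonical
        )
        if variants:
            report[path] = variants
    return report


if __name__ == "__main__":
    # Hypothetical converted pages; in practice these would be read from disk.
    pages = {
        "home.md": "# Acme Cloud\n\nTry AcmeCloud today. acme-cloud scales with you.",
        "pricing.md": "## Acme Cloud Pro\n\nEverything in Free, plus SSO.",
    }
    for path, variants in find_name_variants(pages, "Acme Cloud").items():
        for spelling, count in variants.items():
            print(f"{path}: {spelling!r} x{count}")
```

Running the same check over Markdown rather than raw HTML is the point: markup, scripts, and attribute values are already gone, so every hit is visible copy.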