HTML to text — strip tags and keep only the visible text

Strip a chunk of HTML down to its plain visible text. The tool removes script / style / noscript / comments; converts `<p>`, `<h*>`, `<li>`, `<br>`, etc. to line breaks; and optionally pairs link text with its href. Toggles control whitespace collapsing, HTML entity decoding, and list markers. Useful for cleaning scraped pages, NLP preprocessing, plain-text emails, or pasting articles into note apps. Everything runs in your browser.
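For readers who want the same behaviour in a script, here is a minimal sketch of the stripping steps described above. It uses Python's standard `html.parser` rather than the browser's DOMParser, so it is an approximation and not the tool's actual code. The names `TextExtractor`, `html_to_text`, `SKIP_TAGS`, `BLOCK_TAGS`, and the `append_urls` flag are hypothetical, and the set of tags treated as line breaks is an assumption.

```python
from html.parser import HTMLParser

# Assumption: these tag sets approximate the tool's behaviour.
SKIP_TAGS = {"script", "style", "noscript"}
BLOCK_TAGS = {"p", "div", "br", "li", "tr",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collects visible text; comments are dropped because
    handle_comment is left as the default no-op."""

    def __init__(self, append_urls=False):
        super().__init__(convert_charrefs=True)
        self.append_urls = append_urls
        self.parts = []
        self.skip_depth = 0   # > 0 while inside script/style/noscript
        self.href_stack = []  # hrefs of currently open <a> elements

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        if tag in BLOCK_TAGS:
            self.parts.append("\n")
        elif tag == "a":
            self.href_stack.append(dict(attrs).get("href"))

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth:
            return
        if tag in BLOCK_TAGS:
            self.parts.append("\n")
        elif tag == "a" and self.href_stack:
            href = self.href_stack.pop()
            if self.append_urls and href:
                self.parts.append(f" ({href})")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html, append_urls=False):
    parser = TextExtractor(append_urls)
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    lines = (line.strip() for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)
```

Calling `html_to_text(page, append_urls=True)` produces references in the `Article title (https://example.com/post)` style. Blank lines are dropped here for brevity, which may differ from the tool's own output.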

Tags: developer, extract, text, Markdown

How to use

Paste HTML into the input area. The plain-text version is updated live in the output panel. Toggle 'Append URLs after link text' to get `Article title (https://example.com/post)` style references. Turn 'Keep list markers' off to leave `<li>` contents on their own lines. Leave it on for a Markdown-ish look with `- ` or `1. `. Turn 'Collapse whitespace' off to keep the original line breaks and indentation. Leave it on when you only want the prose. Copy the result or download it as `.txt`.
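The two formatting toggles can be pictured with a small Python sketch. This is an illustration under assumptions, not the tool's implementation: `collapse_whitespace` and `format_list` are hypothetical helper names, and dropping blank lines when collapsing is a guess at the behaviour.

```python
import re


def collapse_whitespace(text):
    """Squeeze runs of spaces, tabs, and non-breaking spaces, trim each
    line, and drop blank lines (assumed behaviour of the toggle)."""
    lines = (re.sub(r"[ \t\u00a0]+", " ", line).strip()
             for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def format_list(items, ordered=False, keep_markers=True):
    """Render list items one per line, with '- ' or '1. ' style markers
    when keep_markers is on and bare text when it is off."""
    out = []
    for number, item in enumerate(items, start=1):
        if keep_markers:
            prefix = f"{number}. " if ordered else "- "
        else:
            prefix = ""
        out.append(prefix + item)
    return "\n".join(out)
```

For example, `format_list(["One", "Two"], ordered=True)` gives numbered lines, while `keep_markers=False` leaves each item on its own line with no prefix.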

FAQ

Are script and style contents extracted?

No. `<script>`, `<style>`, `<noscript>`, and HTML comments are removed before extraction, so code, CSS, and hidden comments never end up in the output.

Are HTML entities decoded?

Yes. `&amp;` → `&`, `&lt;` → `<`, `&nbsp;` → a space, and numeric references like `&#8212;` are all decoded by the browser's DOMParser.

What about broken HTML?

The browser's lenient parser auto-corrects most missing closing tags. Complex `<table>` nesting doesn't survive whitespace collapsing well. When tables matter, route the HTML through a dedicated tool first.

I want to keep the Markdown structure.

This tool fully strips tags. To preserve structure as Markdown, use the markdown-html-convert tool instead.

Is anything uploaded?

No. Everything runs in your browser via DOMParser + JavaScript.
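The entity decoding mentioned in the FAQ can be reproduced outside the browser with Python's standard `html.unescape`, shown here as a rough equivalent rather than what DOMParser does internally. `decode_entities` and its `nbsp_to_space` parameter are hypothetical. Mapping the non-breaking space to a plain space is a separate step in this sketch, because `html.unescape` itself returns U+00A0.

```python
import html


def decode_entities(text, nbsp_to_space=True):
    """Decode named and numeric character references, optionally
    turning non-breaking spaces into ordinary spaces."""
    decoded = html.unescape(text)
    if nbsp_to_space:
        decoded = decoded.replace("\u00a0", " ")
    return decoded
```

Both decimal (`&#8212;`) and hexadecimal (`&#x2014;`) numeric references are handled by `html.unescape`.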

Related tools

HTML Formatter — pretty / minify HTML in one click

Format HTML with two modes: pretty (indent 2/4/tabs, line breaks) and minify (collapse whitespace between tags into a single line). Built on htmlparser2 with a hand-rolled indenter that respects HTML5 void elements (`<br>` `<hr>` `<img>` `<input>` `<meta>` `<link>`...) and preserves content inside `<script>` / `<style>` / `<pre>` / `<textarea>`. Keeps DOCTYPE, comments, attribute order, and XML namespaces (`xmlns:*`). For XML use xml-format. For HTML → text use html-sanitize / markdown-html-convert. Runs entirely in your browser.

Tags: developer, conversion

HTML sanitizer — strip XSS vectors with DOMPurify

Strip XSS vectors (script tags, on* handlers, javascript:/data: URLs, iframes, etc.) from untrusted HTML using DOMPurify. Three profiles — Strict / Standard / Permissive — control which tags and attributes survive. Useful before passing user-generated HTML to a CMS / forum, locking down Markdown→HTML output, or cleaning legacy HTML during a blog migration. Runs entirely in your browser — HTML never leaves your device.

Tags: developer, security, format

Markdown ⇄ HTML converter — round-trip

Convert Markdown to HTML and HTML to Markdown in both directions. Ideal for migrating between platforms (a blog into WordPress, an HTML page into a README, etc.). Runs entirely in your browser using marked and turndown — your drafts never leave the page.

Tags: developer, Markdown, conversion

Markdown link extractor — pull URLs from Markdown / HTML / text

Extract every URL or Markdown / HTML link from a chunk of Markdown, HTML, or plain text. Detects `[label](url)`, `![alt](url)`, `<a href>`, `<img src>`, and bare URLs, then classifies each result (text / image / autolink). Includes deduping, type filters, host grouping, and CSV / TSV / JSON export. Great for auditing links in an article, listing image sources, mapping internal-link structure for SEO, or harvesting references for a social post. Everything runs in your browser.

Tags: developer, Markdown, extract, URL