PDF text extract — export pages to .txt

Extract plain text from PDF files entirely in the browser via pdfjs-dist `getTextContent`. Each PDF becomes its own .txt file; batch downloads ship as a ZIP. Page-break markers are optional.

How to use

Drop one or more PDFs (batch supported) and click Extract. pdfjs-dist iterates the text items on each page and collects them into one .txt per file. Copy or download files individually, or grab everything as a ZIP. Toggle Insert page breaks to add `---- Page N ----` separators between pages.
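
The extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: the function names (`joinItems`, `assemblePages`, `extractText`) and the `*Like` interfaces are hypothetical, and the sketch assumes a marker is written before every page, including the first. The interfaces mirror only the members of a pdfjs-dist document, page, and text item that the sketch relies on (`numPages`, `getPage`, `getTextContent`, `str`, `hasEOL`).

```typescript
// Minimal structural types covering only the pdfjs-dist members used here.
interface TextItemLike { str?: string; hasEOL?: boolean; }
interface PageLike { getTextContent(): Promise<{ items: TextItemLike[] }>; }
interface DocLike { numPages: number; getPage(n: number): Promise<PageLike>; }

// Concatenate the text items of one page in the order they are returned.
// Entries without a string `str` (marked-content markers) are skipped.
function joinItems(items: TextItemLike[]): string {
  let out = "";
  for (const item of items) {
    if (typeof item.str !== "string") continue;
    out += item.str;
    if (item.hasEOL) out += "\n"; // item ends a line of text
  }
  return out;
}

// Combine per-page text into one string, optionally with page markers.
// Assumption: the marker precedes each page, including page 1.
function assemblePages(pageTexts: string[], insertPageBreaks: boolean): string {
  const parts: string[] = [];
  pageTexts.forEach((text, i) => {
    if (insertPageBreaks) parts.push(`---- Page ${i + 1} ----`);
    parts.push(text);
  });
  return parts.join("\n");
}

// Walk every page of a loaded document (pages are 1-indexed in pdf.js).
async function extractText(doc: DocLike, insertPageBreaks: boolean): Promise<string> {
  const pageTexts: string[] = [];
  for (let n = 1; n <= doc.numPages; n++) {
    const page = await doc.getPage(n);
    const content = await page.getTextContent();
    pageTexts.push(joinItems(content.items));
  }
  return assemblePages(pageTexts, insertPageBreaks);
}
```

In a browser, `doc` would typically be the object resolved by pdfjs-dist's `getDocument(...).promise`. Because items are joined in the order returned, multi-column pages come out flattened, which matches the FAQ answer on columns and tables.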

FAQ

Are PDFs uploaded? No. pdfjs-dist runs in your browser; PDFs never leave your device.
Can it extract text from scanned PDFs? No. PDFs without a text layer (image-only scans) need OCR, which this tool does not perform.
Are columns and tables preserved? No. Only raw text items are read in document order, so multi-column layouts and table structure are flattened. Treat the output as plain text.
Does it work on password-protected PDFs? No. Unlock them first with the PDF unlock tool.

Related tools

PDF text search — full-text search across multiple PDFs

Search several PDFs at once and inspect every match with its page number and surrounding context. Toggle case sensitivity, word boundaries (\b), regular expressions, and a multi-line mode. Adjust the query or context width (10–200 chars) and the matches refresh live. Each file shows a hit count and the full result set can be downloaded as CSV. Uploaded PDFs never leave the browser.
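
One way the search toggles could map onto a JavaScript regular expression is sketched below. This is an assumption-laden illustration rather than the tool's implementation: `SearchOptions`, `buildPattern`, and `searchText` are hypothetical names, and the sketch assumes that "multi-line mode" corresponds to the `m` flag and that the context width is clamped to the 10 to 200 character range mentioned above.

```typescript
interface SearchOptions {
  caseSensitive: boolean;
  wholeWord: boolean;   // wrap the query in \b ... \b
  useRegex: boolean;    // treat the query as a regular expression
  multiline: boolean;   // assumed to map to the `m` flag
}

interface Hit { index: number; match: string; context: string; }

// Build a global RegExp from the query and toggles.
function buildPattern(query: string, o: SearchOptions): RegExp {
  // For literal queries, escape regex metacharacters first.
  const body = o.useRegex ? query : query.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const source = o.wholeWord ? `\\b(?:${body})\\b` : body;
  const flags = "g" + (o.caseSensitive ? "" : "i") + (o.multiline ? "m" : "");
  return new RegExp(source, flags);
}

// Find every match in `text` and attach surrounding context.
function searchText(text: string, query: string, o: SearchOptions, contextWidth: number): Hit[] {
  const width = Math.min(200, Math.max(10, contextWidth)); // clamp to 10..200
  const pattern = buildPattern(query, o);
  const hits: Hit[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    if (m[0].length === 0) { pattern.lastIndex++; continue; } // avoid looping on empty matches
    const start = m.index;
    const end = start + m[0].length;
    hits.push({ index: start, match: m[0], context: text.slice(Math.max(0, start - width), end + width) });
  }
  return hits;
}
```

Running `searchText` once per page of extracted text would give each hit its page number, and the resulting rows could then be serialized to CSV.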

PDF to JPG — convert each page to an image

Upload a PDF and convert each page to JPEG (.jpg). Pick scale and quality, save pages individually, or download everything as a ZIP. Transparency is flattened to white, which keeps files small and easy to share on social networks or blogs. Runs entirely in your browser — your PDF stays local.
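
Flattening transparency to white amounts to compositing each pixel over a white background, since JPEG has no alpha channel. The sketch below shows that arithmetic on a raw RGBA buffer such as the `data` array of a canvas `ImageData`. The function name `flattenToWhite` is hypothetical and this is not necessarily how the tool does it; filling the canvas with white before rendering the page would achieve the same result.

```typescript
// Composite non-premultiplied RGBA pixels over white and return opaque RGBA.
// Uses integer math with rounding: out = (c * a + 255 * (255 - a)) / 255.
function flattenToWhite(rgba: Uint8ClampedArray): Uint8ClampedArray {
  const out = new Uint8ClampedArray(rgba.length);
  for (let i = 0; i + 3 < rgba.length; i += 4) {
    const a = rgba[i + 3]; // alpha, 0..255
    for (let c = 0; c < 3; c++) {
      out[i + c] = Math.floor((rgba[i + c] * a + 255 * (255 - a) + 127) / 255);
    }
    out[i + 3] = 255; // result is fully opaque
  }
  return out;
}
```

A fully transparent pixel becomes pure white, and a fully opaque pixel is left unchanged.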

PDF Image Extract — export embedded images as PNG

Extract every embedded image from a PDF as a PNG file via pdfjs-dist. Each page's operator list is scanned for `paintImageXObject` / `paintInlineImageXObject` / `paintImageXObjectRepeat`, and `page.objs` is read to recover the ImageBitmap or raw RGB(A) / grayscale buffer, which is then drawn to a canvas and saved as PNG. Identical images that appear on multiple pages can optionally be deduplicated. Multiple PDFs ship as a single ZIP. Files are named `<source>-page<N>-img<M>.png`. Password-protected PDFs are flagged with a link to the PDF unlock tool. Everything happens inside your browser.
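
The operator-list scan might look roughly like the sketch below. It is illustrative only: `collectImageRefs` and `imageFileName` are hypothetical names, the operator codes are passed in rather than hard-coded (in pdfjs-dist they come from the exported `OPS` table), and the sketch assumes that the first argument of `paintImageXObject` and `paintImageXObjectRepeat` is the object id to look up in `page.objs`, while inline images carry their data directly. Argument layouts may differ between pdf.js versions. Stripping the `.pdf` extension from the source name is also an assumption.

```typescript
// Shape of the value resolved by page.getOperatorList() that is used here.
interface OperatorListLike { fnArray: number[]; argsArray: (unknown[] | null)[]; }

// The three operator codes of interest, e.g. taken from pdfjs-dist's OPS table.
interface ImageOps {
  paintImageXObject: number;
  paintInlineImageXObject: number;
  paintImageXObjectRepeat: number;
}

interface ImageRefs { objectIds: string[]; inlineCount: number; }

// Collect unique image object ids on one page and count inline images.
function collectImageRefs(list: OperatorListLike, ops: ImageOps): ImageRefs {
  const seen = new Set<string>();
  let inlineCount = 0;
  for (let i = 0; i < list.fnArray.length; i++) {
    const fn = list.fnArray[i];
    const first = (list.argsArray[i] ?? [])[0];
    if (fn === ops.paintInlineImageXObject) {
      inlineCount++; // inline images have no object id to look up
    } else if (
      (fn === ops.paintImageXObject || fn === ops.paintImageXObjectRepeat) &&
      typeof first === "string"
    ) {
      seen.add(first); // the same id painted twice is collected once
    }
  }
  return { objectIds: Array.from(seen), inlineCount };
}

// Build an output name following the <source>-page<N>-img<M>.png pattern.
function imageFileName(source: string, page: number, index: number): string {
  const base = source.replace(/\.pdf$/i, ""); // assumption: extension is dropped
  return `${base}-page${page}-img${index}.png`;
}
```

Each collected id would then be resolved through `page.objs` to obtain the pixel data. Deduplicating across pages needs a comparison of the decoded pixels (for example a hash), since object ids are not guaranteed to match between pages.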

PDF metadata strip — Title / Author / XMP at once

Remove the PDF Info dictionary (Title / Author / Subject / Keywords / Creator / Producer / CreationDate / ModDate) and the XMP metadata stream entirely in the browser via pdf-lib. The page content is untouched. Supports batch processing and a single ZIP download.