Pricing

Pay per usage

PDF Text Extractor

Developed by

Jiří Moravčík

PDF Text Extractor extracts text from PDF files. It can also split the extracted text into chunks to prepare it for use with large language models.
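To illustrate what chunking means here, the following is a minimal sketch of fixed-size character chunking with overlap. It is not the Actor's actual implementation; the function name `chunk_text` and the parameters `chunk_size` and `overlap` are hypothetical and chosen only for illustration.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at
    a chunk boundary still appears whole in at least one chunk.
    Hypothetical helper, not part of the Actor.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

Character-based splitting is the simplest variant; splitting on sentence or token boundaries is a common refinement when the chunks feed a model with a token limit.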

5.0 (1)

Total users: 726

Runs succeeded: >99%

Issues response: 22 hours

Last modified: 2 months ago

Integrations

Automation

Output data is redacted

Open

andideng opened this issue

Hi, a lot of the data I extract from PDFs is redacted. Is there a way to get around this?

Jiří Moravčík (jirimoravcik)

Hello, can you be more specific, please? For example, could you provide some examples of text that appears in the PDF but is missing from the extracted data? It's possible that the format of the PDF is just difficult to parse and the internal library struggles with it. Sadly, there's no way around that.
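One way to collect such examples is to check which snippets you can see in the PDF are absent from the extracted text. The following is a minimal sketch, not part of the Actor; the helper names `normalize` and `find_missing` are hypothetical, and it assumes you have copied the extracted text and a few expected snippets into Python strings.

```python
import re


def normalize(s: str) -> str:
    """Collapse whitespace and lowercase so layout differences do not matter."""
    return re.sub(r"\s+", " ", s).strip().lower()


def find_missing(extracted: str, expected: list[str]) -> list[str]:
    """Return the expected snippets that do not occur in the extracted text."""
    haystack = normalize(extracted)
    return [snippet for snippet in expected if normalize(snippet) not in haystack]


if __name__ == "__main__":
    extracted_text = "Invoice  No. 42\nTotal: 10 EUR"
    snippets = ["invoice no. 42", "Customer: ACME"]
    print(find_missing(extracted_text, snippets))
```

The snippets this reports, together with the PDF itself, make a concrete report for the issue.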
