Pricing

$4.00 / 1,000 results

Browserless Scraper Pro

Developed by

DataVoyantLab

Browserless Scraper Pro is designed to automate common web tasks such as web scraping, taking screenshots, and generating PDFs without the need for manual browser interaction.

Runs succeeded

>99%

Last modified

5 months ago

Automation

Lead generation

Simplify Your Web Interactions with Browserless Scraper Pro

Browserless Scraper Pro is inspired by the functionality of Browserless, ScrapingBee ... but is tailored to provide a unique, user-friendly experience. It automates web scraping, screenshot capture, and PDF generation without manual browser interaction.

Challenges in Web Interactions for AI

Building AI applications that interact with the web presents several challenges:

Dynamic Content: Modern websites often use client-side rendering and lazy loading, requiring tools that can execute JavaScript and wait for page hydration to access full content.
Infrastructure Overhead: Managing a fleet of headless browsers for scraping at scale involves complexities related to resource contention, reliability, and cold starts.
Lack of Web APIs: Many sites lack proper API access, forcing developers to create and maintain custom scrapers.

This actor is designed to tackle these challenges head-on, providing a robust solution for automating web interactions.

Key Features

Web Scraping
Effortlessly extract data from websites in multiple formats including HTML, readability-enhanced content, cleaned HTML, and Markdown. This feature is perfect for data collection and analysis, allowing users to choose the format that best suits their needs.
Screenshot Capture
Obtain high-resolution screenshots of entire web pages or specific sections. This feature includes options for capturing the full page or just the viewport, making it ideal for visual documentation, quality assurance testing, and sharing visuals across teams.
PDF Generation
Convert web pages into well-formatted PDF documents with options for custom delays to handle dynamic content. This is suitable for archiving articles, generating reports, or saving web content for offline use.
Flexible Proxy Configuration
Configure proxy settings to manage and rotate IPs during scraping activities to avoid detection and blocking by target websites. This feature supports both custom proxies and Apify's built-in proxy solutions.
Customizable Delays and Timeouts
Set custom delays between requests to manage scraping speed and comply with website rate limits, ensuring reliable data extraction without overloading the website servers. Additionally, specify a maximum timeout for operations to prevent excessive delays.
Comprehensive Output
Receive detailed JSON outputs including HTML content, metadata, and extracted links, which provide insights into the structure and content of the target web pages.

How It Works

Select the Task:
Choose from scraping data, capturing a screenshot, or generating a PDF.
Submit the URLs:
Provide the URLs of the target webpages.
Customize Options:
Set parameters such as page size for PDFs, full-page or viewport-specific screenshots, scraping selectors, optional delay for operations, and maximum timeout.
Proxy Configuration:
Configure proxy settings if necessary. Apify Proxy is the default option. (Special Apify proxies are not supported yet.)
Receive Results:
The tool processes your request and delivers the output in the desired format.
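
The steps above can be sketched as a small helper that assembles an input object before it is submitted. This is a minimal sketch, not the Actor's own validation: the operation and format names come from this page, while the function name `build_input` and the specific checks are assumptions made for illustration.

```python
from typing import Any

# Values listed on this page; the validation rules themselves are assumptions.
OPERATIONS = ("scrape", "screenshot", "pdf")
SCRAPE_FORMATS = ("html", "readability", "cleaned_html", "markdown")


def build_input(operation: str, urls: list[str], **options: Any) -> dict[str, Any]:
    """Assemble an Actor input dict, failing early on obviously bad values."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    if not urls:
        raise ValueError("at least one URL is required")
    for url in urls:
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not an absolute http(s) URL: {url!r}")
    fmt = options.get("format")
    if fmt is not None and fmt not in SCRAPE_FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    # Remaining options (fullPage, delay, maxTimeout, ...) are passed through as-is.
    return {"operation": operation, "urls": list(urls), **options}


screenshot_input = build_input(
    "screenshot", ["https://example.com"], fullPage=True, maxTimeout=30
)
print(screenshot_input)
```

Checking the input locally like this catches typos in the operation or format name before a run is started and billed.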

Usage Examples

Web Scraping Input

{
    "operation": "scrape",
    "urls": ["https://example.com", "https://example2.com"],
    "format": "html", // Optional, defaults to 'html'. Other formats available: 'readability', 'cleaned_html', 'markdown'
    "delay": 5000, // Optional, Delay before scraping (in milliseconds)
    "maxTimeout": 30 // Optional, Maximum timeout for the operation (in seconds)
}
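
The `//` comments in the input above are explanatory only; a strict JSON parser rejects them, and the two time fields use different units (`delay` in milliseconds, `maxTimeout` in seconds). The sketch below shows one way to produce a comment-free payload while keeping the units straight. The helper name `scrape_input` and its parameters are hypothetical, not part of the Actor.

```python
import json


def scrape_input(urls, fmt="html", delay_seconds=0.0, max_timeout_seconds=30):
    """Build a scrape input, converting the delay from seconds to milliseconds."""
    payload = {
        "operation": "scrape",
        "urls": list(urls),
        "format": fmt,
        "maxTimeout": int(max_timeout_seconds),  # seconds, as documented above
    }
    if delay_seconds > 0:
        payload["delay"] = round(delay_seconds * 1000)  # milliseconds
    return payload


payload = scrape_input(
    ["https://example.com", "https://example2.com"], delay_seconds=5
)
strict_json = json.dumps(payload, indent=2)  # plain JSON, no comments
print(strict_json)
```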

Screenshot Capture Input

{
  "operation": "screenshot",
  "urls": ["https://example.com"],
  "fullPage": true, // Optional, defaults to false
  "delay": 3000, // Optional, delay before capturing the screenshot (in milliseconds)
  "maxTimeout": 30 // Optional, maximum timeout for the operation (in seconds)
}

PDF Generation Input

{
  "operation": "pdf",
  "urls": ["https://example.com"],
  "delay": 3000, // Optional, delay before generating the PDF (in milliseconds)
  "maxTimeout": 30 // Optional, maximum timeout for the operation (in seconds)
}
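
Any of the inputs above can also be submitted programmatically. The sketch below only builds the HTTP request and does not send it. It assumes the general Apify API endpoint for running an Actor synchronously and returning its dataset items, and it assumes this Actor writes its results to the default dataset, which this page does not confirm. `USERNAME~ACTOR-NAME` and `APIFY_TOKEN` are placeholders, and `build_run_request` is a hypothetical helper name.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Prepare a POST request that starts a run and waits for its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholders: replace with the real Actor ID and your own API token.
request = build_run_request(
    "USERNAME~ACTOR-NAME",
    "APIFY_TOKEN",
    {"operation": "pdf", "urls": ["https://example.com"], "delay": 3000, "maxTimeout": 30},
)
print(request.get_method(), request.full_url)
# Sending is left out of this sketch; urllib.request.urlopen(request) would perform the call.
```

Synchronous runs suit short jobs with a few URLs. For long URL lists, starting the run asynchronously and polling for results is likely the safer choice.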

Example Output for Web Scraping

Below is an example of the JSON output from a web scraping operation. This output includes the scraped HTML content, metadata about the scrape, and a list of links found on the page.

{
  "content": {
    "html": "<html lang=\"en\" data-theme=\"light\" style=\"color-scheme: light;\"><head>.....</body></html>"
  },
  "metadata": {
    "statusCode": 200,
    "title": "datavoyantlab (DataVoyantLab) · Apify",
    "ogImage": "https://apify.com/og-image/user?username=datavoyantlab",
    "ogTitle": "datavoyantlab (DataVoyantLab) · Apify",
    "urlSource": "https://apify.com/datavoyantlab",
    "description": "🔍 Web Data Extraction Specialist | Building tomorrow's automation tools today | Turning data into decisions 💡",
    "ogDescription": "🔍 Web Data Extraction Specialist | Building tomorrow's automation tools today | Turning data into decisions 💡",
    "language": "en",
    "timestamp": "2025-01-12T22:12:40.497Z"
  },
  "links": [
    {
      "url": "https://apify.com/datavoyantlab#main-content",
      "text": "Skip to content"
    },
    // Additional links omitted for brevity
  ]
}

This output is structured to provide comprehensive details about the scraped page, including the HTML content, response status, and various metadata elements like the page title, description, and the original URL. The links array contains objects representing links found on the page, each with a URL and the link text.
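
As an illustration of consuming this structure, the sketch below summarizes one result and keeps only the links that stay on the same host as the scraped page. The field names (`metadata`, `statusCode`, `title`, `urlSource`, `links`, `url`) come from the example output; the helper names and the second, external link in the sample data are hypothetical.

```python
from urllib.parse import urlsplit


def summarize(result: dict) -> dict:
    """Pull the fields most often needed for logging or quick checks."""
    metadata = result.get("metadata", {})
    return {
        "title": metadata.get("title"),
        "ok": metadata.get("statusCode") == 200,
        "source": metadata.get("urlSource"),
        "link_count": len(result.get("links", [])),
    }


def same_host_links(result: dict) -> list[str]:
    """Return link URLs whose host matches the scraped page's host."""
    host = urlsplit(result["metadata"]["urlSource"]).netloc
    return [
        link["url"]
        for link in result.get("links", [])
        if urlsplit(link["url"]).netloc == host
    ]


# Trimmed copy of the example output, plus one hypothetical external link.
sample = {
    "content": {"html": "<html>...</html>"},
    "metadata": {
        "statusCode": 200,
        "title": "datavoyantlab (DataVoyantLab) · Apify",
        "urlSource": "https://apify.com/datavoyantlab",
    },
    "links": [
        {"url": "https://apify.com/datavoyantlab#main-content", "text": "Skip to content"},
        {"url": "https://example.com/", "text": "Hypothetical external link"},
    ],
}

print(summarize(sample))
print(same_host_links(sample))
```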
