No credit card required

Playwright Scraper

apify/playwright-scraper

Crawls websites with a headless Chromium, Chrome, or Firefox browser and the Playwright library, using server-side Node.js code that you provide. Supports both recursive crawling and a list of URLs. Supports logging in to a website.

You can access the Playwright Scraper programmatically from your own Python applications by using the Apify API. To use the Apify API, you’ll need an Apify account and your API token, which you can find in the Integrations settings in Apify Console.
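Because the API token grants access to your account, it is usually safer to keep it out of source code. The following is a minimal sketch, assuming the token has been exported in an environment variable named `APIFY_TOKEN`; the variable name and the `load_apify_token` helper are illustrative choices and not part of the Apify client.

```python
import os


def load_apify_token(env_var: str = "APIFY_TOKEN") -> str:
    """Return the Apify API token stored in the given environment variable.

    The variable name is an assumption made for this sketch; use whatever
    name your deployment exports.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Apify API token."
        )
    return token
```

The returned string can then be passed to `ApifyClient(...)` in place of the `<YOUR_API_TOKEN>` placeholder.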

from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://crawlee.dev" }],
    "globs": [{ "glob": "https://crawlee.dev/*/*" }],
    "pseudoUrls": [],
    "excludes": [{ "glob": "/**/*.{png,jpg,jpeg,pdf}" }],
    "linkSelector": "a",
    "pageFunction": """async function pageFunction(context) {
    const { page, request, log } = context;
    const title = await page.title();
    log.info(`URL: ${request.url} TITLE: ${title}`);
    return {
        url: request.url,
        title
    };
}""",
    "proxyConfiguration": { "useApifyProxy": True },
    "initialCookies": [],
    "launcher": "chromium",
    "waitUntil": "networkidle",
    "preNavigationHooks": """// We need to return array of (possibly async) functions here.
// The functions accept two arguments: the \"crawlingContext\" object
// and \"gotoOptions\".
[
    async (crawlingContext, gotoOptions) => {
        const { page } = crawlingContext;
        // ...
    },
]""",
    "postNavigationHooks": """// We need to return array of (possibly async) functions here.
// The functions accept a single argument: the \"crawlingContext\" object.
[
    async (crawlingContext) => {
        const { page } = crawlingContext;
        // ...
    },
]""",
    "customData": {},
}

# Run the Actor and wait for it to finish
run = client.actor("apify/playwright-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start
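If you call the Actor from several places, it can help to wrap the input and the run in small functions. The sketch below is one possible arrangement, not an official pattern: `build_run_input` and `scrape_titles` are hypothetical helper names, the input keeps only the non-empty fields from the example above (relying on the Actor's defaults for the rest is an assumption), and the status check assumes that `call()` returns a dictionary with a `status` field next to `defaultDatasetId`.

```python
from __future__ import annotations

from typing import Any

# Same page function as in the example above, shortened to return url and title.
PAGE_FUNCTION = """async function pageFunction(context) {
    const { page, request } = context;
    return { url: request.url, title: await page.title() };
}"""


def build_run_input(start_urls: list[str], glob: str | None = None) -> dict[str, Any]:
    """Build the Actor input for the given start URLs (hypothetical helper)."""
    run_input: dict[str, Any] = {
        "startUrls": [{"url": url} for url in start_urls],
        "linkSelector": "a",
        "pageFunction": PAGE_FUNCTION,
        "proxyConfiguration": {"useApifyProxy": True},
        "launcher": "chromium",
        "waitUntil": "networkidle",
    }
    if glob:
        # Only links matching this glob are followed during recursive crawling.
        run_input["globs"] = [{"glob": glob}]
    return run_input


def scrape_titles(token: str, start_urls: list[str], glob: str | None = None) -> list[dict]:
    """Run the Actor, wait for it to finish, and return the dataset items."""
    # Imported here so build_run_input can be used without the client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor("apify/playwright-scraper").call(
        run_input=build_run_input(start_urls, glob)
    )
    # Assumption: a finished run is a dict whose "status" is "SUCCEEDED" on success.
    if run is None or run.get("status") != "SUCCEEDED":
        raise RuntimeError(f"Actor run did not succeed: {run}")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

For example, `scrape_titles(token, ["https://crawlee.dev"], glob="https://crawlee.dev/*/*")` would start one run and return its results as a list of dictionaries.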

Playwright Scraper API in Python

The Apify API client for Python is the official library for using the Playwright Scraper API in Python. It provides convenience functions and automatic retries on errors.
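To illustrate what automatic retries mean in general, here is a generic retry loop with exponential backoff. It is only a conceptual sketch and not the client's actual implementation; the function name `call_with_retries`, its parameters, and the default values are all assumptions made for this example. When you use the client you do not need to write this yourself.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    max_retries: int = 3,
    base_delay_secs: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying on any exception with exponential backoff.

    The delay doubles after each failed attempt. Once max_retries retries
    have been used up, the last exception is re-raised to the caller.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except Exception:
            if attempt >= max_retries:
                raise
            sleep(base_delay_secs * 2 ** attempt)
            attempt += 1
```

The `sleep` parameter is injectable so the loop can be exercised in tests without actually waiting.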

Install the apify-client

pip install apify-client

Other API clients include:

Playwright Scraper API in JavaScript

Playwright Scraper API through CLI

Playwright Scraper API

Developer

Apify

Actor metrics

63 monthly users
14 stars
99.4% runs succeeded
21 days response time
Created in Aug 2022
Modified 3 months ago

Categories

Developer tools

For creators

Puppeteer Scraper

apify/puppeteer-scraper

Crawls websites with headless Chrome and the Puppeteer library, using server-side Node.js code that you provide. This crawler is an alternative to apify/web-scraper that gives you finer control over the process. Supports both recursive crawling and a list of URLs. Supports logging in to a website.

Apify

4.2k

Facebook Marketplace

shmlkv/facebook-marketplace

This is a simple scraper for Facebook Marketplace. It uses Playwright to scrape the data.

Andre Sh

432

Redfin Fast Scraper

mantisus/redfin-fast-scraper

Redfin: Scrape fast, stay light! Skip bloated browser tools. My Redfin scraper extracts property data in a flash, no heavy lifting is needed. Scrape/monitor listings with ease, all without Puppeteer or Playwright. ⚡️

Maksym Bohomolov

Thefork Fast Scraper

mantisus/thefork-fast-scraper

Scrape TheFork.com quickly and easily! Skip bloated browser tools. This scraper extracts restaurant data in a flash, no heavy lifting is needed. Scrape and monitor data with ease, all without Puppeteer or Playwright. ⚡️

Maksym Bohomolov

Redfin Fast Scraper Per Results

mantisus/redfin-fast-scraper-per-results

Maksym Bohomolov

Thefork Fast Scraper Per Result

mantisus/thefork-fast-scraper-per-result

Maksym Bohomolov

Zoopla.co.uk Fast Scraper

mantisus/zoopla-actor

Zoopla.co.uk: Scrape fast, stay light! Skip bloated browser tools. My Zoopla scraper extracts property data in a flash, no heavy lifting is needed. Scrape/monitor listings with ease, all without Puppeteer or Playwright. ⚡️

Maksym Bohomolov

X Crawler

lumen_limitless/x-crawler

This project is a web scraper designed to extract user data and tweets from X (formerly known as Twitter) using Crawlee and Playwright.

lumen limitless

Zalando Price Comparator

mantisus/zalando-price-comparator

Zalando: scrape without stressing! Skip the bloated browser-based tools. My Zalando scraper extracts price and stock data from all Zalando stores. Scrape prices and remainders, all without Puppeteer or Playwright. ⚡️

Maksym Bohomolov

Website Content Crawler

apify/website-content-crawler

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗LangChain, LlamaIndex, and the wider LLM ecosystem.

Apify

21.2k

472