Web Scraper

Pricing

Pay per usage

Developed by

Apify

Maintained by Apify

Crawls arbitrary websites using a web browser and extracts structured data from web pages using a provided JavaScript function. The Actor supports both recursive crawling and lists of URLs, and automatically manages concurrency for maximum performance.

4.5 (22)

612

Monthly users: 3.6k

Runs succeeded: >99%

Response time: 17 days

Last modified: 3 days ago

received 401 status code

Open

Competent Path (competent_path) opened this issue
2 months ago

I tried this with the following input:

{
    "breakpointLocation": "NONE",
    "browserLog": false,
    "closeCookieModals": false,
    "debugLog": false,
    "downloadCss": false,
    "downloadMedia": false,
    "excludes": [
        {
            "glob": "/**/*.{png,jpg,jpeg,pdf}"
        }
    ],
    "globs": [
        {
            "glob": ""
        }
    ],
    "headless": false,
    "ignoreCorsAndCsp": true,
    "ignoreSslErrors": true,
    "injectJQuery": true,
    "keepUrlFragments": false,
    "pageFunction": "async function pageFunction(context) {\n    const $ = context.jQuery;\n    return {html: $('html').first().html()};\n}",
    "postNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept a single argument: the \"crawlingContext\" object.\n[\n    async (crawlingContext) => {\n        // ...\n    },\n]",
    "preNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept two arguments: the \"crawlingContext\" object\n// and \"gotoOptions\".\n[\n    async (crawlingContext, gotoOptions) => {\n        // ...\n    },\n]\n",
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ]
    },
    "runMode": "PRODUCTION",
    "startUrls": [
        {
            "url": "https://www.wsj.com/livecoverage/stock-market-today-dow-sp500-nasdaq-live-08-07-2024/card/robinhood-reports-record-quarterly-revenue-and-profit-tIlQ0DnKKwNWFeqoRcA2",
            "method": "GET"
        }
    ],
    "useChrome": true,
    "waitUntil": [
        "networkidle2"
    ]
}
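
For readability, the escaped "pageFunction" string in the input above decodes to the function in the sketch below. The stubContext object is a hypothetical stand-in, added only so the snippet can run outside the Actor; it is not part of Web Scraper, and it mimics just the jQuery calls the function makes. On the platform the Actor supplies the real context, with jQuery available because "injectJQuery" is true in the input.

```javascript
// Decoded form of the "pageFunction" string from the input above.
async function pageFunction(context) {
    const $ = context.jQuery;
    return { html: $('html').first().html() };
}

// Hypothetical stand-in for the context object (an assumption for local
// testing only): it mimics the jQuery calls that pageFunction makes.
const stubContext = {
    jQuery: (selector) => ({
        first: () => ({
            html: () => `<!-- inner HTML of ${selector} -->`,
        }),
    }),
};

// Run the function against the stub and print the returned record.
const resultPromise = pageFunction(stubContext);
resultPromise.then((record) => console.log(record));
```

Run locally, this only confirms that the function returns a single record with an "html" field. It does not reproduce the 401 response, which the log below reports as a blocked request for the start URL.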

PuppeteerCrawler: Reclaiming failed request back to the list or queue. Request blocked - received 401 status code. 2025-02-18T22:39:55.764Z {"id":"9nWDjDToDXvA6Ny","url":"https://www.wsj.com/livecoverage/stock-market-today-dow-sp500-nasdaq-live-08-07-2024/card/robinhood-reports-record-quarterly-revenue-and-profit-tIlQ0DnKKwNWFeqoRcA2","retryCount":1}

Please let me know if there is any update on this one.

Pricing

Pricing model

Pay per usage

This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage.