Website Checker Workload

Developed by Lukáš Křivka

Maintained by Community

Creates reasonable workloads for analyzing any website with the Website Checker actor and combines the resulting data. This is the easiest way to analyze any website for compute unit usage and anti-scraping blocking.

Last modified 2 years ago

This actor runs a Website Checker for each combination of proxy group and scraper type (browser/Puppeteer and Cheerio). The checks run in parallel with reasonable default values, and the output of all checkers is combined into a single breakdown. This gives you a good idea of how difficult and costly it will be to scrape the site with different methods, and it can save you the time you would otherwise spend on manual checks.
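The fan-out described above can be sketched as a cross product of scraper types and proxy groups. This is a minimal illustration, not the actor's actual code: the function name `plan_checks` is hypothetical, and the `type/proxyGroup` label format is assumed from the keys in the example output in the Output section.

```python
from itertools import product


def plan_checks(run_browser: bool, run_cheerio: bool, proxy_groups: list[str]) -> list[str]:
    """Return one label per Website Checker run, e.g. 'puppeteer/auto'."""
    # Collect the scraper types enabled by the two boolean inputs.
    checker_types = []
    if run_browser:
        checker_types.append("puppeteer")
    if run_cheerio:
        checker_types.append("cheerio")
    # One run per (scraper type, proxy group) pair.
    return [f"{checker}/{group}" for checker, group in product(checker_types, proxy_groups)]


checks = plan_checks(True, True, ["auto", "BUYPROXIES84958"])
print(checks)
```

With both scraper types enabled and two proxy groups, the sketch plans four runs, which matches the number of checkers in the example in the Output section.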

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| website | String | https://apify.com | Website URL where you want to start checking |
| runBrowser | Boolean | true | Run the checker with a browser |
| runCheerio | Boolean | true | Run the checker with Cheerio |
| proxyGroups | Array | ['auto', 'BUYPROXIES84958'] | List of proxy groups you want to test. Can also be auto to run with all proxies |
| maxPagesPerCheck | Number | 200 | Maximum number of pages per check |
| runInParallel | Boolean | true | Run the checks in parallel |

Output

The output is saved to the default key-value store as the OUTPUT record. It combines the output of all Website Checker runs and adds the compute units spent by each run.

For example, for an input consisting of

{
    "runBrowser": true,
    "runCheerio": true,
    "proxyGroups": ["auto", "BUYPROXIES84958"]
}

the actor will run 4 checkers, one for each combination of scraper type and proxy group:

{
    "puppeteer/auto": {
        "computeUnits": 0.45,
        "pagesPerComputeUnit": 444,
        "timeouted": 0,
        "failedToLoadOther": 9,
        "accessDenied": 0,
        "recaptcha": 0,
        "distilCaptcha": 24,
        "statusCodes": {
            "200": 3,
            "401": 2,
            "403": 5,
            "405": 24
        },
        "total": 43
    },
    "puppeteer/BUYPROXIES84958": {
        "computeUnits": 0.45,
        "pagesPerComputeUnit": 444,
        "timeouted": 0,
        "failedToLoadOther": 9,
        "accessDenied": 0,
        "recaptcha": 0,
        "distilCaptcha": 24,
        "statusCodes": {
            "200": 3,
            "401": 2,
            "403": 5,
            "405": 24
        },
        "total": 43
    },
    "cheerio/auto": {
        "computeUnits": 0.05,
        "pagesPerComputeUnit": 4000,
        "timeouted": 0,
        "failedToLoadOther": 9,
        "accessDenied": 0,
        "recaptcha": 0,
        "distilCaptcha": 24,
        "statusCodes": {
            "200": 3,
            "401": 2,
            "403": 5,
            "405": 24
        },
        "total": 43
    },
    "cheerio/BUYPROXIES84958": {
        "computeUnits": 0.05,
        "pagesPerComputeUnit": 4000,
        "timeouted": 0,
        "failedToLoadOther": 9,
        "accessDenied": 0,
        "recaptcha": 0,
        "distilCaptcha": 24,
        "statusCodes": {
            "200": 3,
            "401": 2,
            "403": 5,
            "405": 24
        },
        "total": 43
    }
}
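
To compare the runs, the combined output can be reduced to a few numbers per configuration. The following sketch assumes the OUTPUT record has already been loaded into a dictionary shaped like the example above. The names `summarize` and `successRate` are hypothetical, and treating HTTP 200 as the only successful status is a simplifying assumption, not something the actor itself defines.

```python
def summarize(output: dict) -> list[dict]:
    """Return one summary row per checker run, cheapest run first."""
    rows = []
    for label, stats in output.items():
        total = stats["total"]
        ok = stats["statusCodes"].get("200", 0)
        rows.append({
            "check": label,
            "computeUnits": stats["computeUnits"],
            # Share of pages that returned HTTP 200 (assumed here to mean "not blocked").
            "successRate": ok / total if total else 0.0,
        })
    # Sort by cost so the cheapest configuration comes first.
    return sorted(rows, key=lambda row: row["computeUnits"])


# Two entries from the example output, trimmed to the fields the sketch reads.
example = {
    "puppeteer/auto": {"computeUnits": 0.45, "statusCodes": {"200": 3, "403": 5}, "total": 43},
    "cheerio/auto": {"computeUnits": 0.05, "statusCodes": {"200": 3, "403": 5}, "total": 43},
}
summary = summarize(example)
for row in summary:
    print(f'{row["check"]}: {row["computeUnits"]} CU, {row["successRate"]:.0%} OK')
```

Sorting by compute units puts the cheapest configuration first, and the success rate shows whether that cheaper option is actually usable against the site's blocking.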

Pricing

Pricing model

Pay per usage

This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage.