Similarweb Quick Scraper
Pay $10.00 for 1,000 results
A quick scraper for Similarweb. Get needed data instantly for domains of your choice. Export accumulated data into formats such as HTML, JSON, or Excel.
I've tried running the scraper, but it returns no results. This is the log:
2023-09-01T14:13:55.960Z ACTOR: Pulling Docker image from repository.
2023-09-01T14:13:57.049Z ACTOR: Creating Docker container.
2023-09-01T14:13:57.239Z ACTOR: Starting Docker container.
2023-09-01T14:14:00.022Z INFO System info {"apifyVersion":"3.1.7","apifyClientVersion":"2.7.1","crawleeVersion":"3.4.0","osType":"Linux","nodeVersion":"v16.20.1"}
2023-09-01T14:14:00.716Z INFO HttpCrawler: Starting the crawl
2023-09-01T14:14:01.001Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden)
2023-09-01T14:14:01.002Z {"id":"TjWtUBjJNrER6WF","url":"https://similarweb.com/website/01.xyz","retryCount":1}
2023-09-01T14:14:01.081Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden)
2023-09-01T14:14:01.082Z {"id":"5wCBWO7EdF4MfO2","url":"https://similarweb.com/website/jan3.com","retryCount":1}
2023-09-01T14:14:04.732Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden)
2023-09-01T14:14:04.733Z {"id":"TjWtUBjJNrER6WF","url":"https://similarweb.com/website/01.xyz","retryCount":2}
2023-09-01T14:14:04.866Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden)
2023-09-01T14:14:04.868Z {"id":"5wCBWO7EdF4MfO2","url":"https://similarweb.com/website/jan3.com","retryCount":2}
2023-09-01T14:14:10.064Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden)
2023-09-01T14:14:10.065Z {"id":"TjWtUBjJNrER6WF","url":"https://similarweb.com/website/01.xyz","retryCount":3}
2023-09-01T14:14:10.219Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden)
2023-09-01T14:14:10.221Z {"id":"5wCBWO7EdF4MfO2","url":"https://similarweb.com/website/jan3.com","retryCount":3}
2023-09-01T14:14:14.532Z ERROR HttpCrawler: Request failed and reached maximum retries. HTTPError: Response code 403 (Forbidden)
2023-09-01T14:14:14.540Z at Request. (/usr/src/app/node_modules/got-cjs/dist/source/as-promise/index.js:91:42)
2023-09-01T14:14:14.542Z at Object.onceWrapper (node:events:628:26)
2023-09-01T14:14:14.542Z at Request.emit (node:events:525:35)
2023-09-01T14:14:14.544Z at Request._onResponseBase (/usr/src/app/node_modules/got-cjs/dist/source/core/index.js:739:22)
2023-09-01T14:14:14.545Z at processTicksAndRejections (node:internal/process/task_queues:96:5)
2023-09-01T14:14:14.546Z at async Request._onResponse (/usr/src/app/node_modules/got-cjs/dist/source/core/index.js:778:13) {"id":"5wCBWO7EdF4MfO2","url":"https://similarweb.com/website/jan3.com","method":"GET","uniqueKey":"https://similarweb.com/website/jan3.com"}
2023-09-01T14:14:15.188Z ERROR HttpCrawler: Request failed and reached maximum retries. HTTPError: Response code 403 (Forbidden)
2023-09-01T14:14:15.193Z at Request. (/usr/src/app/node_modules/got-cjs/dist/source/as-promise/index.js:91:42)
2023-09-01T14:14:15.194Z at Object.onceWrapper (node:events:628:26)
2023-09-01T14:14:15.195Z at Request.emit (node:events:525:35)
2023-09-01T14:14:15.196Z at Request._onResponseBase (/usr/src/app/node_modules/got-cjs/dist/source/core/index.js:739:22)
2023-09-01T14:14:15.197Z at processTicksAndRejections (node:internal/process/task_queues:96:5)
2023-09-01T14:14:15.197Z at async Request._onResponse (/usr/src/app/node_modules/got-cjs/dist/source/core/index.js:778:13) {"id":"TjWtUBjJNrER6WF","url":"https://similarweb.com/website/01.xyz","method":"GET","uniqueKey":"https://similarweb.com/website/01.xyz"}
2023-09-01T14:14:15.198Z INFO HttpCrawler: All requests from the queue have been processed, the crawler will shut down.
2023-09-01T14:14:15.558Z INFO HttpCrawler: Crawl finished. Final request statistics: {"requestsFinished":0,"requestsFailed":2,"retryHistogram":[null,null,null,2],"requestAvgFailedDurationMillis":1014,"requestAvgFinishedDurationMillis":null,"requestsFinishedPerMinute":0,"requestsFailedPerMinute":7,"requestTotalDurationMillis":2028,"requestsTotal":2,"crawlerRuntimeMillis":15054}
2023-09-01T14:14:15.559Z INFO HttpCrawler: Error analysis: {"totalErrors":2,"uniqueErrors":1,"mostCommonErrors":["2x: Response code 403 (Forbidden) (/usr/src/app/node_modules/got-cjs/dist/source/as-promise/index.js:91:42)"]}
Please use a residential US proxy.
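For reference, a rough sketch of what an actor input with a residential US proxy might look like. The `useApifyProxy`, `apifyProxyGroups`, and `apifyProxyCountry` keys follow the usual Apify proxy input shape. The top-level field names `domains` and `proxy`, and the helper `buildInput`, are assumptions for illustration only, so check this scraper's input tab for the real names.

```typescript
// Usual Apify proxy input shape.
interface ProxyInput {
  useApifyProxy: boolean;
  apifyProxyGroups: string[];
  apifyProxyCountry: string;
}

// Hypothetical actor input: the field names "domains" and "proxy"
// are assumptions, not taken from this scraper's input schema.
interface ActorInput {
  domains: string[];
  proxy: ProxyInput;
}

// Hypothetical helper that builds an input using residential US proxies.
function buildInput(domains: string[]): ActorInput {
  return {
    domains,
    proxy: {
      useApifyProxy: true,
      apifyProxyGroups: ["RESIDENTIAL"], // residential proxy group
      apifyProxyCountry: "US",           // two-letter country code
    },
  };
}

// The two domains from the failing run above.
const input: ActorInput = buildInput(["01.xyz", "jan3.com"]);
console.log(JSON.stringify(input, null, 2));
```

The printed JSON can be pasted into the input editor in JSON mode, after adjusting the field names to match the scraper.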
Should I always use the proxy from now on?
I have attached the change.
Yes, you should (in that scraper). Is it working now?
On Fri, 1 Sep 2023 at 17:17, Daniel Jean <topic+uhhznnoc2sttr5np5d@reply.apify.com> wrote:
Yes, it's working. I'm closing this issue.