Similarweb Quick Scraper

Pay $10.00 for 1,000 results

mscraper/similarweb-quick-scraper

A quick scraper for Similarweb. Get needed data instantly for domains of your choice. Export accumulated data into formats such as HTML, JSON, or Excel.

Runs with no result

Closed

electric_harpsichord opened this issue
a year ago

I've tried running the scraper, but it returns no results. This is the log:

2023-09-01T14:13:55.960Z ACTOR: Pulling Docker image from repository.
2023-09-01T14:13:57.049Z ACTOR: Creating Docker container.
2023-09-01T14:13:57.239Z ACTOR: Starting Docker container.
2023-09-01T14:14:00.022Z INFO System info {"apifyVersion":"3.1.7","apifyClientVersion":"2.7.1","crawleeVersion":"3.4.0","osType":"Linux","nodeVersion":"v16.20.1"}
2023-09-01T14:14:00.716Z INFO HttpCrawler: Starting the crawl
2023-09-01T14:14:01.001Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden)
2023-09-01T14:14:01.002Z {"id":"TjWtUBjJNrER6WF","url":"https://similarweb.com/website/01.xyz","retryCount":1}
2023-09-01T14:14:01.081Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden)
2023-09-01T14:14:01.082Z {"id":"5wCBWO7EdF4MfO2","url":"https://similarweb.com/website/jan3.com","retryCount":1}
2023-09-01T14:14:04.732Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden)
2023-09-01T14:14:04.733Z {"id":"TjWtUBjJNrER6WF","url":"https://similarweb.com/website/01.xyz","retryCount":2}
2023-09-01T14:14:04.866Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden)
2023-09-01T14:14:04.868Z {"id":"5wCBWO7EdF4MfO2","url":"https://similarweb.com/website/jan3.com","retryCount":2}
2023-09-01T14:14:10.064Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden)
2023-09-01T14:14:10.065Z {"id":"TjWtUBjJNrER6WF","url":"https://similarweb.com/website/01.xyz","retryCount":3}
2023-09-01T14:14:10.219Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden)
2023-09-01T14:14:10.221Z {"id":"5wCBWO7EdF4MfO2","url":"https://similarweb.com/website/jan3.com","retryCount":3}
2023-09-01T14:14:14.532Z ERROR HttpCrawler: Request failed and reached maximum retries. HTTPError: Response code 403 (Forbidden)
2023-09-01T14:14:14.540Z at Request. (/usr/src/app/node_modules/got-cjs/dist/source/as-promise/index.js:91:42)
2023-09-01T14:14:14.542Z at Object.onceWrapper (node:events:628:26)
2023-09-01T14:14:14.542Z at Request.emit (node:events:525:35)
2023-09-01T14:14:14.544Z at Request._onResponseBase (/usr/src/app/node_modules/got-cjs/dist/source/core/index.js:739:22)
2023-09-01T14:14:14.545Z at processTicksAndRejections (node:internal/process/task_queues:96:5)
2023-09-01T14:14:14.546Z at async Request._onResponse (/usr/src/app/node_modules/got-cjs/dist/source/core/index.js:778:13) {"id":"5wCBWO7EdF4MfO2","url":"https://similarweb.com/website/jan3.com","method":"GET","uniqueKey":"https://similarweb.com/website/jan3.com"}
2023-09-01T14:14:15.188Z ERROR HttpCrawler: Request failed and reached maximum retries. HTTPError: Response code 403 (Forbidden)
2023-09-01T14:14:15.193Z at Request. (/usr/src/app/node_modules/got-cjs/dist/source/as-promise/index.js:91:42)
2023-09-01T14:14:15.194Z at Object.onceWrapper (node:events:628:26)
2023-09-01T14:14:15.195Z at Request.emit (node:events:525:35)
2023-09-01T14:14:15.196Z at Request._onResponseBase (/usr/src/app/node_modules/got-cjs/dist/source/core/index.js:739:22)
2023-09-01T14:14:15.197Z at processTicksAndRejections (node:internal/process/task_queues:96:5)
2023-09-01T14:14:15.197Z at async Request._onResponse (/usr/src/app/node_modules/got-cjs/dist/source/core/index.js:778:13) {"id":"TjWtUBjJNrER6WF","url":"https://similarweb.com/website/01.xyz","method":"GET","uniqueKey":"https://similarweb.com/website/01.xyz"}
2023-09-01T14:14:15.198Z INFO HttpCrawler: All requests from the queue have been processed, the crawler will shut down.
2023-09-01T14:14:15.558Z INFO HttpCrawler: Crawl finished. Final request statistics: {"requestsFinished":0,"requestsFailed":2,"retryHistogram":[null,null,null,2],"requestAvgFailedDurationMillis":1014,"requestAvgFinishedDurationMillis":null,"requestsFinishedPerMinute":0,"requestsFailedPerMinute":7,"requestTotalDurationMillis":2028,"requestsTotal":2,"crawlerRuntimeMillis":15054}
2023-09-01T14:14:15.559Z INFO HttpCrawler: Error analysis: {"totalErrors":2,"uniqueErrors":1,"mostCommonErrors":["2x: Response code 403 (Forbidden) (/usr/src/app/node_modules/got-cjs/dist/source/as-promise/index.js:91:42)"]}
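For anyone reading a log like this one, the pattern to look for is every request being retried and then failing with the same status code. The following is a minimal Python sketch, not part of the Actor, that tallies the highest `retryCount` per URL and the HTTP status codes mentioned in the log text. The function name `summarize` and the regular expressions are illustrative assumptions based only on the log format shown above.

```python
import json
import re
from collections import Counter

# Matches the JSON payload printed after each "Reclaiming failed request" warning
# (format taken from the log above; other Crawlee versions may differ).
RETRY_JSON = re.compile(r'\{"id":"[^"]+","url":"[^"]+","retryCount":\d+\}')
# Matches the status code in "Response code 403 (Forbidden)".
STATUS = re.compile(r"Response code (\d{3})")


def summarize(log_text: str) -> dict:
    """Return the highest retryCount seen per URL and a tally of status codes."""
    max_retries = {}
    for match in RETRY_JSON.finditer(log_text):
        entry = json.loads(match.group(0))
        url = entry["url"]
        max_retries[url] = max(max_retries.get(url, 0), entry["retryCount"])
    status_counts = dict(Counter(STATUS.findall(log_text)))
    return {"max_retries": max_retries, "status_counts": status_counts}


# A shortened excerpt of the log above, kept on one line as in the raw export.
sample = (
    "2023-09-01T14:14:01.001Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden) "
    '2023-09-01T14:14:01.002Z {"id":"TjWtUBjJNrER6WF","url":"https://similarweb.com/website/01.xyz","retryCount":1} '
    "2023-09-01T14:14:01.081Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden) "
    '2023-09-01T14:14:01.082Z {"id":"5wCBWO7EdF4MfO2","url":"https://similarweb.com/website/jan3.com","retryCount":1} '
    "2023-09-01T14:14:04.732Z WARN HttpCrawler: Reclaiming failed request back to the list or queue. Response code 403 (Forbidden) "
    '2023-09-01T14:14:04.733Z {"id":"TjWtUBjJNrER6WF","url":"https://similarweb.com/website/01.xyz","retryCount":2}'
)

summary = summarize(sample)
print(json.dumps(summary, indent=2))
```

If every URL appears in `max_retries` and the only status code is 403, the target site is most likely rejecting the requests outright, which points to blocking rather than a bug in the input.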

mscraper

Please use a residential US proxy.
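As a rough illustration of that advice, the sketch below builds a run input that asks for Apify's residential proxy group with a US exit country. The keys `useApifyProxy`, `apifyProxyGroups`, and `apifyProxyCountry` follow Apify's usual proxy input object; the top-level field names `domains` and `proxyConfiguration` and the helper `build_run_input` are assumptions, since this thread does not show the Actor's input schema. Check the Actor's input tab for the real field names.

```python
import json


def build_run_input(domains):
    """Build a hypothetical run input that requests a residential US proxy."""
    return {
        # Assumed field name for the list of domains to look up.
        "domains": list(domains),
        # Assumed field name; the nested keys follow Apify's common proxy input format.
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
            "apifyProxyCountry": "US",
        },
    }


# Domains taken from the failing run in the log above.
run_input = build_run_input(["01.xyz", "jan3.com"])
print(json.dumps(run_input, indent=2))
```

The resulting JSON could be pasted into the Actor's JSON input editor or passed to an API client, after adjusting the field names to match the actual schema.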

electric_harpsichord

a year ago

Should I always use the proxy from now on?

electric_harpsichord

a year ago

I have attached the change.

mscraper

Yes, you should (for this scraper). Is it working now?

On Fri, 1 Sep 2023 at 17:17, Daniel Jean <topic+uhhznnoc2sttr5np5d@reply.apify.com> wrote:

electric_harpsichord

a year ago

Yes, it's working. I'm closing this issue.

Developer
Maintained by Community
Actor metrics
  • 34 monthly users
  • 8 stars
  • 99.1% runs succeeded
  • 1.7 days response time
  • Created in Jun 2023
  • Modified about 1 month ago