Upwork Scraper Without Stale Job Posts

Pricing

$9.99/month + usage

Developed by

Artur

Maintained by Community

Returns no stale data, unlike many other scrapers. Low cost and efficient. Can filter out duplicates and posts older than 24h.

5.0 (1)

15

Monthly users

45

Runs succeeded

95%

Response time

3.8 days

Last modified

19 days ago

Error? Not working reliably

Closed

OtisB opened this issue
a month ago

Hey, I have been using this for some time now without issues, but now it's not scraping anything and I'm getting errors. See the log below for reference:

2025-03-12T20:25:27.775Z ACTOR: Pulling Docker image of build hGjMuUPHnmr28zfFg from repository.
2025-03-12T20:25:31.878Z ACTOR: Creating Docker container.
2025-03-12T20:25:31.990Z ACTOR: Starting Docker container.
2025-03-12T20:25:33.223Z INFO System info {"apifyVersion":"3.2.6","apifyClientVersion":"2.9.7","crawleeVersion":"3.11.5","osType":"Linux","nodeVersion":"v16.20.2"}
2025-03-12T20:25:34.084Z INFO CheerioCrawler: Starting the crawler.
2025-03-12T20:25:35.454Z WARN CheerioCrawler: Reclaiming failed request back to the list or queue. Request blocked - received 403 status code.
2025-03-12T20:25:35.457Z {"id":"t6O46v6KvQF2Kw0","url":"https://www.upwork.com/nx/search/jobs/?amount=500-&contractor_tier=1,2,3&from_recent_search=true&hourly_rate=20-&payment_verified=1&q=framer%20AND%20%22framer%22&sort=recency&t=0,1","retryCount":1}
2025-03-12T20:25:36.967Z WARN CheerioCrawler: Reclaiming failed request back to the list or queue. Request blocked - received 403 status code.
2025-03-12T20:25:36.969Z {"id":"t6O46v6KvQF2Kw0","url":"https://www.upwork.com/nx/search/jobs/?amount=500-&contractor_tier=1,2,3&from_recent_search=true&hourly_rate=20-&payment_verified=1&q=framer%20AND%20%22framer%22&sort=recency&t=0,1","retryCount":2}
2025-03-12T20:25:41.503Z WARN CheerioCrawler: Reclaiming failed request back to the list or queue. Request blocked - received 403 status code.
2025-03-12T20:25:41.505Z {"id":"t6O46v6KvQF2Kw0","url":"https://www.upwork.com/nx/search/jobs/?amount=500-&contractor_tier=1,2,3&from_recent_search=true&hourly_rate=20-&payment_verified=1&q=framer%20AND%20%22framer%22&sort=recency&t=0,1","retryCount":3}
2025-03-12T20:25:43.468Z ERROR CheerioCrawler: Request failed and reached maximum retries. Error: Request blocked - received 403 status code.
2025-03-12T20:25:43.470Z at CheerioCrawler._throwOnBlockedRequest (/usr/src/app/node_modules/@crawlee/basic/internals/basic-crawler.js:702:19)
2025-03-12T20:25:43.472Z at CheerioCrawler._runRequestHandler (/usr/src/app/node_modules/@crawlee/http/internals/http-crawler.js:333:22)
2025-03-12T20:25:43.475Z at processTicksAndRejections (node:internal/process/task_queues:96:5)
2025-03-12T20:25:43.477Z at async CheerioCrawler._runRequestHandler (/usr/src/app/node_modules/@crawlee/cheerio/internals/cheerio-crawler.js:148:9)
2025-03-12T20:25:43.479Z at async wrap (/usr/src/app/node_modules/@apify/timeout/cjs/index.cjs:54:21) {"id":"t6O46v6KvQF2Kw0","url":"https://www.upwork.com/nx/search/jobs/?amount=500-&contractor_tier=1,2,3&from_recent_search=true&hourly_rate=20-&payment_verified=1&q=framer%20AND%20%22framer%22&sort=recency&t=0,1","method":"GET","uniqueKey":"https://www.upwork.com/nx/search/jobs?amount=500-&contractor_tier=1%2C2%2C3&from_recent_search=true&hourly_rate=20-&payment_verified=1&q=framer+AND+%22framer%22&sort=recency&t=0%2C1"}
2025-03-12T20:25:43.481Z WARN CheerioCrawler: The 'error' property of the crawling context is deprecated, and it is now passed as the second parameter in 'errorHandler' and 'failedRequestHandler'. Please update your code, as this property will be removed in a future version.
2025-03-12T20:25:43.581Z INFO CheerioCrawler: All requests from the queue have been processed, the crawler will shut down.
2025-03-12T20:25:43.658Z INFO CheerioCrawler: All requests from the queue have been processed, the crawler will shut down.
2025-03-12T20:25:43.719Z INFO CheerioCrawler: Final request statistics: {"requestsFinished":0,"requestsFailed":1,"retryHistogram":[null,null,null,1],"requestAvgFailedDurationMillis":1698,"requestAvgFinishedDurationMillis":null,"requestsFinishedPerMinute":0,"requestsFailedPerMinute":6,"requestTotalDurationMillis":1698,"requestsTotal":1,"crawlerRuntimeMillis":9926}
2025-03-12T20:25:43.721Z INFO CheerioCrawler: Error analysis: {"totalErrors":1,"uniqueErrors":1,"mostCommonErrors":["1x: Request blocked - received 403 status code. (/usr/src/app/node_modules/@crawlee/basic/internals/basic-crawler.js:702:19)"]}
2025-03-12T20:25:43.724Z INFO CheerioCrawler: Finished! Total 1 requests: 0 succeeded, 1 failed. {"terminal":true}

OtisB

a month ago

Hmm.. well that's annoying haha. What is wrong exactly? I am in the US, so I figured I would use that proxy code, but it's not consistent. What does the proxy even mean/do? And do I just need to switch proxies all the time until one sticks? Is there anything I can do to make this more reliable? Removing the string "from_recent_search=true" seems like a feeble workaround.. I need something that is reliable.
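
For reference, the workaround mentioned here (dropping `from_recent_search=true` from the search URL) can be sketched as below. This is only an illustration: `stripQueryParams` is a hypothetical helper, not part of the Actor, and nothing in this thread confirms that removing the parameter reliably avoids the 403 responses. It uses the standard `URL` API, which re-serializes the remaining query string (for example, commas become `%2C`).

```typescript
// Hypothetical helper (not part of the Actor): returns the URL without
// the named query parameters, leaving all other parameters in place.
function stripQueryParams(rawUrl: string, paramsToDrop: string[]): string {
  const url = new URL(rawUrl);
  for (const name of paramsToDrop) {
    url.searchParams.delete(name);
  }
  return url.toString();
}

// Search URL copied from the log in this issue.
const original =
  "https://www.upwork.com/nx/search/jobs/?amount=500-&contractor_tier=1,2,3" +
  "&from_recent_search=true&hourly_rate=20-&payment_verified=1" +
  "&q=framer%20AND%20%22framer%22&sort=recency&t=0,1";

const cleaned = stripQueryParams(original, ["from_recent_search"]);
console.log(cleaned);
```

The cleaned URL keeps every other filter (`amount`, `hourly_rate`, `q`, and so on), so the search itself should be unchanged; only the tracking-style flag is removed.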

Artur (arlusm)

a month ago

I assume Upwork has improved their anti-scraping detection, so now more requests get blocked than before. A proxy is just an IP address; some IP addresses get blocked as bots, others don't. At the moment, switching proxies and adjusting request URLs to make them more likely to pass any bot detection is the only workaround. I'm working on a solution to bypass the improved anti-scraping detection, but sadly it probably won't be a quick fix.
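
The "switch proxies until one passes" idea can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not how the Actor is implemented: `fetchWithProxyRotation`, `ProxyFetcher`, and the `proxy-*.example` addresses are hypothetical names made up for illustration, and the actual request is injected as a function so the rotation logic stays independent of any HTTP library.

```typescript
// Outcome of a single attempt; only the status code matters here.
interface AttemptResult {
  status: number;
  body: string;
}

// Hypothetical signature: performs one request through the given proxy.
type ProxyFetcher = (url: string, proxyUrl: string) => Promise<AttemptResult>;

// Tries each proxy in order and returns the first response that is not a 403.
// Throws if every proxy in the list was blocked.
async function fetchWithProxyRotation(
  url: string,
  proxyUrls: string[],
  fetchViaProxy: ProxyFetcher,
): Promise<{ proxyUrl: string; result: AttemptResult }> {
  const blocked: string[] = [];
  for (const proxyUrl of proxyUrls) {
    const result = await fetchViaProxy(url, proxyUrl);
    if (result.status !== 403) {
      return { proxyUrl, result };
    }
    blocked.push(proxyUrl); // remember blocked proxies for the error message
  }
  throw new Error(`All ${blocked.length} proxies were blocked with 403 for ${url}`);
}

// Stub for demonstration only: pretends the first proxy is blocked.
const stubFetcher: ProxyFetcher = async (_url, proxyUrl) =>
  proxyUrl === "http://proxy-a.example:8000"
    ? { status: 403, body: "" }
    : { status: 200, body: "<html>ok</html>" };

fetchWithProxyRotation(
  "https://www.upwork.com/nx/search/jobs/?q=framer&sort=recency",
  ["http://proxy-a.example:8000", "http://proxy-b.example:8000"],
  stubFetcher,
).then(({ proxyUrl }) => console.log(`Succeeded via ${proxyUrl}`));
```

A different IP address alone may not be enough if the block is based on something other than the address, which matches the inconsistent results described in this thread.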

OtisB

a month ago

Ahh I see. Yeah, when I go to Upwork, I notice a Cloudflare test every time. Definitely looks like they have increased their bot protection. That sucks for us haha!

Well thanks anyway my man. I appreciate the swift replies

Artur (arlusm)

a month ago

The thing is, in the past they had their own job notification system via RSS feeds, so you didn't even need to use scrapers like this, and it worked 100% of the time. But for some reason they removed it last year and haven't provided any alternatives, and now they're making it worse again for people who don't want to refresh the main page all day. I wouldn't be surprised if at some point they come out with a 'new' job-feeds-straight-to-email feature, for which you'll have to subscribe to Freelancer Plus or spend Connects lol. Anyway, yeah, I'll try to figure out some way to bypass their increased detection; if I find something solid I'll let you know.

caring_bear

a month ago

Artur, thx a lot for your help! Please keep us updated if you see any viable solution. It would help decide where to go from this point. Best,

larrythegarry6

a month ago

This was working perfectly until recently. Unfortunately, Upwork added Cloudflare and all the scrapers here stopped working. For everyone looking for a solution, at least until other devs here adapt to the Upwork changes: I tested all the scrapers, and the only one working at the moment is https://apify.com/neatrat/upwork-job-scraper . Hope that's helpful.

Pricing

Pricing model

Rental 

To use this Actor, you have to pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for the Apify platform usage.

Free trial

2 hours

Price

$9.99