Booking Scraper

Pay $5.00 for 1,000 results

voyager/booking-scraper

Scrape Booking with this hotels scraper and get data about accommodation on Booking.com. You can crawl by keywords or URLs for hotel prices, ratings, addresses, number of reviews, stars. You can also download all that room and hotel data from Booking.com with a few clicks: CSV, JSON, HTML, and Excel

Crawler not returning full result set

Closed

amazed_hydrometer opened this issue
4 months ago

Hi Team, the result set should contain 14000 results according to the URL, but the run always stops at 10405 with a Cheerio error.

Any help in resolving the issue would be greatly appreciated. Thank you

lukas.prusa

Hi mk1, thanks for opening this issue!

  1. The result count. Booking has a hard limit of 1000 results per search, which we simply can't overcome directly. The way we currently get around it is by subdividing the location into the sub-locations displayed on the page. In your case, Australia was split up into Queensland, New South Wales, etc. This method allows us to scrape many more than 1000 results, but it doesn't guarantee that we get all of them. Alternatively, if you search with check-in and check-out dates, we can use price filtering to get around the limit, and with that we are able to get all the results.

  2. The “cheerio” error is an internal error on our side. We simply retry the request in such cases. For some reason, though, Booking displayed the wrong language and currency 20 times in a row for 4 places in your search, which caused them to hit our retry limit. We will investigate this. Anyway, these are the places that failed, if you want to re-scrape them:

I hope this helps, thanks!

Also, we currently have an issue where users can't comment on or reopen closed issues, so I will keep this open for now. Feel free to close it if everything has been resolved :) Thanks and happy scraping!

amazed_hydrometer

3 months ago

Dear Lukáš Průša, thank you and your team for getting back to me so promptly.

The use case is to get all the locations in Australia with a swimming pool from Booking.com (14112 results [https://www.booking.com/searchresults.en-gb.html?label=gen173nr-1BCAEoggI46AdIM1gEaA-IAQGYAQm4ARfIAQzYAQHoAQGIAgGoAgO4Ari03LQGwAIB0gIkMzZlZTdkNmItOTMyOS00ZGNlLWFkNDgtODZlZDYyMzJjMmY52AIF4AIB&sid=afb02f67542b2b632a179fca5d89800a&aid=304142&ss=Australia&ssne=Australia&ssne_untouched=Australia&efdco=1&lang=en-gb&src=searchresults&dest_id=13&dest_type=country&group_adults=1&no_rooms=1&group_children=0&nflt=hotelfacility%3D433])

I already ran the Actor twice, and both times it errored out at the same place, which cost me extra as I couldn't resurrect the run. With that being said, I am just looking for a way to extract the remaining data: there are 14112 records, but the run keeps erroring out at 10405.

Can you please advise how I can get the remaining data? Your help is greatly appreciated. Thank you once again.

lukas.prusa

Thanks for providing us with additional information!

Hmm, it's interesting that it failed twice on the same locations. I tested them as standalone URLs and that worked, though they used up almost all of their retries, so this is something we will investigate for sure. I'm a little worried, though, that this could be some exception on Booking's side that blocks these hotels more than others.
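For readers wondering what "hitting the retry limit" means here, the following is a minimal Python sketch of that kind of loop, not the scraper's actual code. The names `fetch_with_retries`, `fetch` and `is_valid` are hypothetical, and the default of 20 attempts is only an assumption based on the "20 times in a row" mentioned earlier in this thread.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def fetch_with_retries(
    fetch: Callable[[], T],
    is_valid: Callable[[T], bool],
    max_retries: int = 20,  # assumed value, taken from the thread's "20 times in a row"
) -> Optional[T]:
    """Call fetch() until is_valid(page) is true or the attempts run out.

    fetch and is_valid are hypothetical callables supplied by the caller,
    e.g. "download the page" and "page has the expected language and currency".
    Returns the first valid page, or None if every attempt was invalid.
    """
    for _ in range(max_retries):
        page = fetch()
        if is_valid(page):
            return page
    # Every attempt came back invalid: the caller should record this
    # location as failed so it can be re-scraped later.
    return None
```

A location whose page never comes back in the expected language and currency would return None here, which matches the failed places described above.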

Anyway, as I mentioned before, Booking has a hard limit of 1000 results per search, though we are able to get 1100 due to a bug in their system. Still, the limit is there, and the only way around it is to split your search into smaller ones with fewer than 1100 results each. Since you are not using check-in and check-out dates, we are not able to use price ranges to limit the results automatically and have to rely on much less effective methods.
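To illustrate the price-range idea, here is a rough Python sketch of how a search could be split automatically once prices are available. It is only an illustration under stated assumptions, not the scraper's implementation: `split_price_range` and the `count_results` callback (which would report how many results a search returns for a given price range) are hypothetical names, and the default limit of 1000 is the per-search cap quoted in this thread.

```python
from typing import Callable, List, Tuple

LIMIT = 1000  # per-search cap quoted in this thread


def split_price_range(
    low: int,
    high: int,
    count_results: Callable[[int, int], int],
    limit: int = LIMIT,
) -> List[Tuple[int, int]]:
    """Split the price range [low, high) into sub-ranges of at most `limit` results.

    count_results(low, high) is a hypothetical callback that returns the
    number of search results priced in [low, high).
    """
    if count_results(low, high) <= limit or high - low <= 1:
        # Small enough to scrape in one search, or too narrow to split further.
        return [(low, high)]
    mid = (low + high) // 2
    # Halve the range and handle each half independently.
    return (
        split_price_range(low, mid, count_results, limit)
        + split_price_range(mid, high, count_results, limit)
    )
```

Because the sub-ranges never overlap, each property falls into exactly one search. A range that is one price unit wide and still over the limit cannot be split this way, so it would need a second filter.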

Because of that, the best you can do is manually pick some mutually exclusive filters that get each search below 1100 results, e.g. the “property type” filter shown in the attached screenshot. You can then narrow the results further with the “property rating” filter, and then try the “review score” one. Lastly, there is the “top destinations in Australia” filter, which is the one we use. Still, this might not get you all the results, and it will be tedious to do. Also be careful with the “sort by” options on Booking: most of them are A/B tested and return the results in a different order each time, even though they say otherwise (this only happens to anonymous users, which is how the scraper operates).

I can also recommend the Fast Booking Scraper, which only extracts the hotels from the search page and doesn't enqueue each one for its detail page. This makes it much faster and therefore cheaper.
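If picking the filters by hand gets tedious, the start URLs can be generated instead. The sketch below is a rough illustration: it appends one extra filter to the `nflt` parameter that already appears in the search URL earlier in this thread. Two things are assumptions rather than confirmed Booking behaviour: that several `nflt` filters are joined with ";", and the example values such as `class=4`, which are placeholders. Copy the real values from the URL Booking produces when you tick a filter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_extra_filter(search_url: str, extra_filter: str) -> str:
    """Return search_url with extra_filter added to its nflt query parameter.

    Assumption: Booking accepts several nflt filters joined with ";".
    """
    parts = urlsplit(search_url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Keep any filters already present (e.g. "hotelfacility=433").
    existing = [value for key, value in pairs if key == "nflt" and value]
    others = [(key, value) for key, value in pairs if key != "nflt"]
    combined = ";".join(existing + [extra_filter])
    query = urlencode(others + [("nflt", combined)])
    return urlunsplit(parts._replace(query=query))


def urls_for_filters(search_url: str, filters: list) -> list:
    """Build one start URL per mutually exclusive filter value."""
    return [with_extra_filter(search_url, f) for f in filters]
```

For example, `urls_for_filters(base_url, ["class=3", "class=4", "class=5"])` would give three start URLs, one per placeholder rating value. If the chosen filters turn out not to be fully exclusive, deduplicating the combined dataset by hotel URL afterwards is a sensible safeguard.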

I hope this helps, let me know if this works for you, thanks!

lukas.prusa

Hi, I'm closing this issue now due to inactivity. If you believe that this is still a problem that we can fix, please feel free to reopen this issue, thanks!

amazed_hydrometer

3 months ago

Hi Lukas, Thank you for getting back to me. I tried the Fast Booking Scraper and the data stopped at the exact same number of records (10584).

Can you please explain why, if possible? (run id: XmfJlbV3ZTbcPN8vr) Thank you. Please let me know if you need any further information.

lukas.prusa

Hi, thanks for your patience, and sorry I missed this because the issue was closed. As I mentioned a few times already, the underlying issue is that Booking has a hard limit of 1100 results per query. We are doing all we can to get around it, but there is just no way to do it consistently and efficiently in our scraper.

Developer
Maintained by Apify
Actor metrics
  • 154 monthly users
  • 32 stars
  • 99.2% runs succeeded
  • 10 days response time
  • Created in Aug 2023
  • Modified 1 day ago