Goodreads Scraper
3 days trial then $15.00/month - No credit card required now
Scrape goodreads.com for data on millions of books. Crawl book details for images, ISBN, author, description, title, buy links, number of reviews, page number, language, and all other details. You can specify search terms, filters, and much more.
I am unable to extract all the comments from a single book; I can only extract 29 comments per book. For example, one book has a total of 692 comments (https://console.apify.com/actors/sk1JsDmbderUw0J79/runs/J3Oo7X82rfZ6cONjN#output), but I can extract only 29. Please assist me in extracting all the comments.
Hey,
I’ve created a ticket with our engineering team to investigate the issue. We will get back to you as soon as possible with a resolution.
Best
Hey again,
Thank you very much for reaching out and letting us know about this. We just deployed a new version that resolves the current bug. To retrieve all the reviews properly, you should not use the "End page" and "Max Items" properties; instead, let the actor decide how much data it should retrieve.
Best
I am still facing the same issue. There are a total of 692 comments (https://console.apify.com/actors/sk1JsDmbderUw0J79/runs/0hWysa4Sb6eJSEydx#output), but I can extract only 28. Please assist me in extracting all the comments.
Hello again,
We just checked your run (the link that you shared), and it seems you used the maxItems and endPage properties again, both of which are set to 1.
If you are using the UI, can you please remove these fields completely? You can find them in the "Advanced Options" section under the names "Maximum number of listing items" and "List end page". Just clear the numbers from the input boxes and proceed.
If you are using the API, just remove both of these fields from the input and let the actor decide how much to retrieve.
Please let us know how it goes. Best
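For API users, the advice above about removing both fields can be sketched roughly as follows. This is only an illustration: maxItems and endPage are the field names mentioned in this thread, while the "startUrls" key and its shape are assumptions made for the example and may not match the actor's real input schema.

```python
import json

# Field names taken from this thread; both cap how much the actor retrieves.
LIMITING_FIELDS = ("maxItems", "endPage")


def strip_limits(run_input: dict) -> dict:
    """Return a copy of run_input without the fields that cap retrieval."""
    return {k: v for k, v in run_input.items() if k not in LIMITING_FIELDS}


# Hypothetical input: "startUrls" and its structure are assumptions.
example_input = {
    "startUrls": [
        {"url": "https://www.goodreads.com/book/show/320.One_Hundred_Years_of_Solitude"}
    ],
    "maxItems": 1,
    "endPage": 1,
}

cleaned = strip_limits(example_input)
print(json.dumps(cleaned, indent=2))
```

The cleaned dictionary is what would then be sent as the run input; the original dictionary is left unchanged.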
I deleted the settings for the maximum number and end page. This time, the processing took longer, but when I downloaded the file, it still contained 29 comments. https://console.apify.com/actors/sk1JsDmbderUw0J79/runs/0hWysa4Sb6eJSEydx#output
Hey again,
It is very likely that your input is somehow cached in your browser and you are resending the same input. If you open the Run and click the "Input" tab below, you can see that the maxItems and endPage values still exist. This could be a problem with Apify, your browser, or your PC. Can you please clear your cache, or try the run with a different browser? Unfortunately, this part is Apify's territory and we do not have any control over it.
Best
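One way to double-check a run, as a rough sketch: copy the JSON shown in the run's "Input" tab and test whether the two limiting fields are still in it. The helper name limiting_fields_present and the pasted JSON are made up for this example; only maxItems and endPage come from the thread.

```python
import json


def limiting_fields_present(input_json: str) -> list:
    """List which of the limiting fields still appear in a run's input JSON."""
    run_input = json.loads(input_json)
    return [name for name in ("maxItems", "endPage") if name in run_input]


# Hypothetical JSON, standing in for what the "Input" tab shows.
pasted = '{"maxItems": 1, "endPage": 1}'
leftover = limiting_fields_present(pasted)
print(leftover if leftover else "No limiting fields found")
```

An empty result would suggest the limits were really removed from that run's input.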
Hello, I deleted maxItems and endPage. I’ve tried both Google Chrome and Firefox, but I’m encountering the same issue. How can I download all the comments? I would appreciate your assistance. There are ~12000 comments in this run: https://console.apify.com/actors/sk1JsDmbderUw0J79/runs/Gsm7ZwnqAEdOxfcvX#output
Hey again,
Thank you very much for the reply. It seems like this is a different issue, where the file is too big for Apify to handle. We will add a new mode for CSV-friendly output and let you know as soon as possible.
Best
Hey again,
So the main problem here was the data size, which Apify could not handle. Because of this, we added a new attribute called csvFriendlyOutput (CSV Friendly Output). When this option is enabled, you will retrieve the reviews in a flattened form, as one large dataset. The attribute only works with reviews, and it is the only solution we have found so far. Also, please don't forget to remove the Custom Map Function and Extend Output Function. As an example usage:
https://console.apify.com/view/runs/QufchaSCkr5AxKCwE
Best
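For anyone setting this through the API rather than the UI, a minimal sketch of the same change might look like this. csvFriendlyOutput is the attribute named above; the keys "customMapFunction" and "extendOutputFunction" are only guesses at how the Custom Map Function and Extend Output Function appear in the input, and "startUrls" is likewise an assumption.

```python
CSV_FLAG = "csvFriendlyOutput"  # attribute name given in this thread

# Assumed input keys for the two custom function options; the real
# names in the actor's input schema may differ.
CUSTOM_FUNCTION_FIELDS = ("customMapFunction", "extendOutputFunction")


def prepare_csv_friendly_input(run_input: dict) -> dict:
    """Copy run_input, drop the custom functions, and enable CSV-friendly output."""
    prepared = {
        k: v for k, v in run_input.items() if k not in CUSTOM_FUNCTION_FIELDS
    }
    prepared[CSV_FLAG] = True
    return prepared


# Hypothetical starting input for illustration only.
original = {
    "startUrls": [
        {"url": "https://www.goodreads.com/book/show/320.One_Hundred_Years_of_Solitude"}
    ],
    "customMapFunction": "(object) => { return {...object} }",
    "extendOutputFunction": "($) => { return {} }",
}

prepared = prepare_csv_friendly_input(original)
print(sorted(prepared))
```

The function returns a new dictionary, so the original input can be kept for comparison.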
Hi, I have set up everything according to your instructions. Unfortunately, I wasn’t able to extract all the comments from the book One Hundred Years of Solitude (https://www.goodreads.com/book/show/320.One_Hundred_Years_of_Solitude). While there are approximately 50,000 comments, I only managed to retrieve 30. You can view the details here (https://console.apify.com/actors/runs/YzNpFOS6foKqTyIOJ#output). Could you please assist me with this?
Thank you!
Hey, I passed this information to the Engineering team and will get back to you as soon as possible. Best
Hey there,
We just deployed a new version to resolve this issue. It should be good to go now!
Best
Hi, I am still facing the same issue for one book. The book has ~52000 comments, but I can only get ~25000. https://console.apify.com/actors/runs/9RUBhaWfpiunSPArB#output
Hey,
I passed this information to the Engineering team and will get back to you as soon as possible.
Best
Hey there, We just deployed a new version to resolve this issue. You can get all (~52000) comments now. Best,
Actor Metrics
7 monthly users
4 stars
96% runs succeeded
16 hours response time
Created in Mar 2021
Modified a day ago