Shein Scraper

2 days trial then $30.00/month - No credit card required now

natanielsantos/shein-scraper
Scrape product data from Shein with this reliable tool. Extract price, images, description, sizes, category, shipping price and much more. Download your data as HTML table, JSON, CSV, Excel, XML, and RSS feed.

Difficulty Retrieving Data from Paginated Pages through Shein Scraper

Closed

gtecom opened this issue
a month ago

I am currently working on implementing a "load more content" feature for my application using data from Shein. For example, I am trying to fetch data from the following URL for page 1:

https://www.shein.co.uk/recommend/HOODIES-and-SWEATSHIRTS-sc-100157275.html?adp=26064685&categoryJump=true&ici=uk_tab00navbar07menu01dir06&src_identifier=fc%3DAll%60sc%3DMen%20Clothing%60tc%3DShop%20By%20Category%60oc%3DHoodies%20%26%20Sweatshirts%60ps%3Dtab00navbar07menu01dir06%60jc%3DitemPicking_100157275&src_module=topcat&src_tab_page_id=page_home1723520188127&page=1

However, when attempting to fetch data for page 2 using the following URL, no data is returned:

https://www.shein.co.uk/recommend/HOODIES-and-SWEATSHIRTS-sc-100157275.html?adp=26064685&categoryJump=true&ici=uk_tab00navbar07menu01dir06&src_identifier=fc%3DAll%60sc%3DMen%20Clothing%60tc%3DShop%20By%20Category%60oc%3DHoodies%20%26%20Sweatshirts%60ps%3Dtab00navbar07menu01dir06%60jc%3DitemPicking_100157275&src_module=topcat&src_tab_page_id=page_home1723520188127&page=2

Could you please advise on how to resolve this issue or suggest a better approach to implement pagination?

Additionally, I have integrated Apify into my Node.js application using the Apify client. I have two concerns:

Performance: The data retrieval process takes a considerable amount of time. Is there a way to speed up the response time?

Data Filtering: Currently, I am unable to configure Apify to fetch only the specific fields I need, rather than all available fields. Can you provide guidance on how to optimize this and reduce unnecessary data retrieval?

gtecom

a month ago

Hey! It has been a week since I posted this query, and I have not received a response yet. I would greatly appreciate an answer to the questions above.

natanielsantos

Hi. Thanks for opening this issue. I fixed the bug that occurred when scraping the second page. I also improved the performance significantly, but it's still kind of slow. I will keep working on it, but Shein recently strengthened its anti-scraping measures, so it's very challenging.
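For the pagination part of the question, the two URLs in the original post differ only in their `page` query parameter. A minimal sketch of generating such URLs with the standard `URL` API is shown here. The helper names `withPage` and `buildPageUrls` are hypothetical and not part of the actor or the Apify client, and how the resulting URLs are passed to the actor depends on its input schema, which this thread does not show. Note that `searchParams.set` re-serializes the whole query string, so a space encoded as `%20` may come back as `+`.

```javascript
// Hypothetical helper: returns a copy of `listingUrl` with its `page`
// query parameter set to the given page number.
function withPage(listingUrl, page) {
  const url = new URL(listingUrl);
  url.searchParams.set("page", String(page));
  return url.toString();
}

// Hypothetical helper: builds one listing URL per page in [firstPage, lastPage].
function buildPageUrls(listingUrl, firstPage, lastPage) {
  const urls = [];
  for (let page = firstPage; page <= lastPage; page += 1) {
    urls.push(withPage(listingUrl, page));
  }
  return urls;
}
```

Calling `buildPageUrls(someListingUrl, 1, 3)` would give three URLs that differ only in `page`, which could then be requested one at a time as the user clicks "load more".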

Here's how you can specify the fields you want using the Apify Client:

import { ApifyClient } from "apify-client";

const client = new ApifyClient({
    token: "MY-APIFY-TOKEN",
});

// Starts the actor and waits for the run to finish.
const { defaultDatasetId } = await client
    .actor("natanielsantos/shein-scraper")
    .call();

// Fetches results from the run's default dataset, keeping only the listed fields.
const { items } = await client
    .dataset(defaultDatasetId)
    .listItems({
        fields: ["main_image", "product_id", "sku", "url", "title", "images"],
    });
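The same `listItems` call also accepts `offset` and `limit`, which may help with a "load more content" feature once a run has finished: the application can read the stored dataset in batches instead of all at once. The sketch below assumes that `datasetClient` is the object returned by `client.dataset(defaultDatasetId)`. The helper name `loadMoreItems` and its return shape are hypothetical.

```javascript
// Hypothetical helper: fetches one "load more" batch from an Apify dataset.
// `datasetClient` is assumed to be the result of client.dataset(datasetId).
async function loadMoreItems(datasetClient, offset, batchSize, fields) {
  const { items, total } = await datasetClient.listItems({
    offset,           // number of items to skip
    limit: batchSize, // maximum number of items to return
    fields,           // only these fields are included in each item
  });
  const nextOffset = offset + items.length;
  // `total` is the number of items in the dataset, so more remain
  // while the next offset is still below it.
  return { items, nextOffset, hasMore: nextOffset < total };
}
```

A caller would start with `offset` set to 0 and pass the returned `nextOffset` into the following call until `hasMore` is false. This reads data that was already scraped, so it does not make the scraping run itself any faster.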

May I know what data you need? Maybe I can add an input option that makes the actor scrape only the essential data. Thanks.

gtecom

a month ago

I can now scrape the data from the remaining pages, and the filtering is working properly. The fields I am receiving are satisfactory, and I am successfully refining them to obtain what I need. I am grateful for your assistance; it was incredibly beneficial.

natanielsantos

I'll close this issue. If you run into another problem, please open a new issue or contact me by email. Thank you.

Developer
Maintained by Community
Actor metrics
  • 33 monthly users
  • 4 stars
  • 68.9% runs succeeded
  • 2.3 days response time
  • Created in Apr 2023
  • Modified 4 days ago