Web Scraper

apify/web-scraper
Crawls arbitrary websites using the Chrome browser and extracts data from pages using JavaScript code. The Actor supports both recursive crawling and lists of URLs and automatically manages concurrency for maximum performance. This is Apify's basic tool for web crawling and scraping.

Is there a way to scrape articles on a website by keyword?

Closed

BSD_24 opened this issue
3 months ago

Hi, I was wondering: is there a way to scrape articles on a website that contain certain keywords? For example, from one initial URL, I want to scrape all articles with the keyword "Football". Can this Actor do it? Thanks!

jindrich.bar

Hello and thank you for your interest in this Actor (and sorry for the delay).

I'm not sure I fully understand your use case, but I'll try to share some general tips:

  • In case you want to scrape a single website (e.g., a blog with sports articles) but only want to store results concerning football, you can definitely do it with this Actor.

    • If the blog separates its articles by topic, you can restrict the crawl to pages whose URLs contain /topics/football by using the Glob Patterns input option.
    • If the blog doesn't do this, you can still crawl all the articles on the website and store only the ones that mention football (do a simple substring search in the Page Function and return undefined if you don't find football).
  • If your question regards scraping the Internet for articles with the Football keyword in them, then, no, this Actor is probably not your best bet.
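
To make the second tip more concrete, here is a minimal sketch of a Page Function that stores a page only when its text mentions the keyword. The `KEYWORD` constant, the `pickArticle` helper, and the shape of the returned record are assumptions made for illustration, not part of the Actor. The sketch also assumes the article text can be read from `document.body.innerText`, which may need a narrower selector on a real site.

```javascript
// Hypothetical keyword; change it to whatever you want to filter by.
const KEYWORD = 'football';

// Hypothetical helper: returns a result record if the text mentions the
// keyword (case-insensitive substring search), otherwise undefined so that
// nothing is stored for that page.
function pickArticle({ url, title, text }, keyword = KEYWORD) {
    const found = text.toLowerCase().includes(keyword.toLowerCase());
    return found ? { url, title } : undefined;
}

// Page Function sketch. It runs in the browser context of each crawled page,
// so `document` is available when the Actor calls it.
async function pageFunction(context) {
    return pickArticle({
        url: context.request.url,
        title: document.title,
        text: document.body.innerText,
    });
}
```

The helper is kept separate here only so the filtering logic is easy to test on its own. The Page Function field most likely expects a single function, so when pasting this into the Actor, move `KEYWORD` and `pickArticle` inside `pageFunction`. Note that a plain substring search also matches longer words such as "footballer".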

Imagine Apify Actors as a way of repeating whatever you would do on the Internet (just faster and without so much manual work). You can definitely click all the links on one blog and pick out only pages about football. On the other hand, going through the entire Internet and looking for football articles might be impossible, just due to the sheer size of the problem. You would also need to use e.g. Google Search to find the articles, which adds another layer of complexity.

Does this answer your question? Let us know if you have any additional questions.

Cheers!

BSD_24

a month ago

Thanks for the information!

Developer
Maintained by Apify

Actor Metrics

  • 2.6k monthly users

  • 340 stars

  • >99% runs succeeded

  • 37 days response time

  • Created in Mar 2019

  • Modified 5 months ago

Categories