Website Content Crawler
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗LangChain, LlamaIndex, and the wider LLM ecosystem.
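As a rough illustration of feeding crawl results into an LLM or RAG pipeline, the sketch below starts the Actor through the `apify-client` Python package and reduces each dataset item to a source URL plus text. The Actor ID `apify/website-content-crawler`, the input fields `startUrls` and `maxCrawlPages`, and the output fields `url` and `text` are assumptions here and should be checked against the Actor's input and output schema. The helper names are hypothetical.

```python
from typing import Any, Dict, Iterable, List

# Assumed Actor ID on Apify Store; verify before use.
ACTOR_ID = "apify/website-content-crawler"


def build_run_input(start_urls: List[str], max_crawl_pages: int = 30) -> Dict[str, Any]:
    """Build the Actor input. The field names are assumptions, not a confirmed schema."""
    return {
        "startUrls": [{"url": u} for u in start_urls],
        "maxCrawlPages": max_crawl_pages,
    }


def to_documents(items: Iterable[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Keep only items with non-empty text ('url' and 'text' are assumed output fields)."""
    docs = []
    for item in items:
        text = (item.get("text") or "").strip()
        if text:
            docs.append({"source": item.get("url", ""), "text": text})
    return docs


def crawl(token: str, start_urls: List[str], max_crawl_pages: int = 30) -> List[Dict[str, str]]:
    """Run the Actor and return cleaned documents. Needs network access and a valid API token."""
    # Imported lazily so the pure helpers above work without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=build_run_input(start_urls, max_crawl_pages))
    return to_documents(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Calling `crawl("<YOUR_API_TOKEN>", ["https://example.com"], max_crawl_pages=5)` would return a list of `{"source", "text"}` dictionaries that can be chunked and embedded by whichever vector database or framework is in use.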
apify_client._errors.ApifyApiError: You must rent a paid Actor in order to run it.
Opened 18 hours ago by dresma_apify, last comment 12 hours ago by Jiří Spilka (jiri.spilka)
Are there any settings that help with cookie prompts?
Opened 7 days ago by pumpkin_protractor, last comment 6 days ago by Oscar Rodriguez (Oscardz)
Hi, my run didn't work
Opened 10 days ago by ballerine, last comment 7 days ago by ballerine
request for html output to keep only essential <img> outputs
Opened 10 days ago by confident_socket, last comment 10 days ago by Oscar Rodriguez (Oscardz)
Data not being pushed to Pinecone from WCC
Opened 12 days ago by Custombizio, last comment 10 days ago by Oscar Rodriguez (Oscardz)
My run doesn't enqueue any URLs
Opened 12 days ago by bhupeshchandra, last comment 12 days ago by bhupeshchandra
Limiting scraped pages (e.g. maxCrawlPages = 30) doesn't work
Opened 14 days ago by beaming_gauge, last comment 14 days ago by Jan Buchar (janbuchar)
Crawling subdomains
Opened 22 days ago by gainful_governor, last comment 22 days ago by gainful_governor
Correct Json body (API)
Opened 24 days ago by glovebubble, last comment 24 days ago by glovebubble
Zapier trigger = time out
Opened a month ago by teal_northerner, last comment 9 days ago by Oscar Rodriguez (Oscardz)
Crawler accesses pages and loads data correctly, but status code is 404
Opened a month ago by cirez_d, last comment a month ago by cirez_d
Not able to download any pages or files
Opened a month ago by ollieiq, last comment a month ago by Jindřich Bär (jindrich.bar)
Screenshot got cut off
Opened a month ago by harry_tran, last comment a month ago by harry_tran
Crawler fails crawling nike.com
Opened a month ago by ballerine, last comment a month ago by Jindřich Bär (jindrich.bar)
Download RSS Feeds
Opened a month ago by carlson, last comment a month ago by Jindřich Bär (jindrich.bar)
Configuring Crawler Settings for Crawling Image URLs
Opened a month ago by glovebubble, last comment a month ago by Jindřich Bär (jindrich.bar)
.json file issues via API
Opened a month ago by glovebubble, last comment a month ago by Jan Buchar (janbuchar)
How to crawl behind a login?
Opened a month ago by Visualife, last comment a month ago by Jan Buchar (janbuchar)
Unable to get any data from certain websites
Opened 2 months ago by ollieiq, last comment a month ago by ollieiq
Requests are failing often
Opened 2 months ago by cirez_d, last comment a month ago by cirez_d
- 3k monthly users
- 465 stars
- 99.9% runs succeeded
- 3.1 days response time
- Created in Mar 2023
- Modified 10 days ago