Website Content Crawler

apify/website-content-crawler
Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.

Request for Assistance: Actor Timeout Issue & Custom API Output

Open

RaulDC opened this issue
3 days ago

Dear Support,

I encountered an issue where the Actor timed out, and I am unable to continue my task. Could you please advise how to extend the timeout or assist in resuming the process from where it left off?

Additionally, is there a way to configure the crawler to return predefined content in the page content HTML, instead of an empty response, when no data is found during an API request? This is essential to prevent my automation in Zapier from failing.

Thank you for your support.

Best regards, Raul

RaulDC

3 days ago

a predefined content in a page content HTML, such as:

jiri.spilka

Hi, thank you for using Website Content Crawler.

I'm not very familiar with Zapier, but based on the logs, it seems there is a 30-second timeout. This is most likely a setting in Zapier. Could you please check? By default, the crawler uses a timeout of 360,000 seconds.

  1. Apify API: Run Actor Synchronously
    The Apify API has an endpoint that runs an Actor synchronously with the given input and returns its dataset items:
    https://api.apify.com/v2/acts/:actorId/run-sync-get-dataset-items
    You can set the run timeout, in seconds, with the timeout query parameter of this endpoint.

  2. Keep HTML Elements:
    There is an option to Keep HTML elements using a CSS selector. Please refer to the documentation and set up a selector for your specific content.
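
The two suggestions above could be combined roughly as follows. This is only a minimal sketch, not an official client. It assumes that the token can be passed as a `token` query parameter, that the Actor ID can be written in the tilde form `apify~website-content-crawler`, and that the "Keep HTML elements" option corresponds to an input field named `keepElementsCssSelector`; please verify these against the API reference and the Actor's input schema. The `with_fallback` helper, its placeholder text, and the field names in the placeholder record are hypothetical. It is a client-side workaround for the empty-response case, not a crawler setting.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
# Assumption: tilde form of apify/website-content-crawler for use in the URL path.
ACTOR_ID = "apify~website-content-crawler"


def build_run_sync_url(actor_id: str, token: str, timeout_secs: int) -> str:
    """Build the run-sync-get-dataset-items URL with a run timeout in seconds."""
    query = urllib.parse.urlencode({"token": token, "timeout": timeout_secs})
    return f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items?{query}"


def with_fallback(items, placeholder="<p>No content found</p>"):
    """Return items unchanged, or one placeholder record if the list is empty.

    Hypothetical helper: the field names below are illustrative only.
    """
    if items:
        return items
    return [{"url": None, "text": "", "html": placeholder}]


def crawl(start_url, token, timeout_secs=120, keep_selector=None):
    """Run the Actor synchronously and return dataset items (never an empty list)."""
    run_input = {"startUrls": [{"url": start_url}]}
    if keep_selector:
        # Assumed input field name for the "Keep HTML elements" option.
        run_input["keepElementsCssSelector"] = keep_selector
    request = urllib.request.Request(
        build_run_sync_url(ACTOR_ID, token, timeout_secs),
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Give the HTTP call slightly longer than the run timeout itself.
    with urllib.request.urlopen(request, timeout=timeout_secs + 30) as response:
        return with_fallback(json.loads(response.read().decode("utf-8")))
```

Calling `crawl("https://example.com", token="<YOUR_TOKEN>", timeout_secs=120)` would then always return at least one record. Note that a 30-second limit on the Zapier side would still cut the request short regardless of the timeout set here.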

I hope this helps. Please let me know whether you are able to solve it in Zapier. Jiri

Developer
Maintained by Apify

Actor Metrics

  • 4.1k monthly users

  • 854 stars

  • >99% runs succeeded

  • 24 hours response time

  • Created in Mar 2023

  • Modified 16 hours ago