Website Content Crawler

Pricing

Pay per usage

Developed by

Apify

Maintained by Apify

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.

4.6 (38)

1199

Monthly users

6.3k

Runs succeeded

>99%

Response time

3.5 days

Last modified

2 days ago

Issue with crawling HTML table data

Closed

sprouto_net opened this issue
a month ago

I have a webpage at https://www.robofy.ai/ai-chatbot-pricing that contains pricing information in a table. When I extract the content as Markdown and send it to the LLM behind my AI chatbot, the HTML table is not converted to a Markdown table and comes back as plain text instead. As a result, the chatbot's responses are inaccurate. Could you please advise on how to resolve this?

jakub.kopecky replied

Hi, thank you for using Website Content Crawler.

Please try setting HTML processing -> HTML transformer to None to see whether the LLM can handle the table that way, or enable Output settings -> Save HTML to key-value store and extract the table from the raw HTML yourself.

Let me know if that helps,

Jakub
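
For the second suggestion (extracting the table from the saved raw HTML), the following is a minimal sketch of one way to do it, not part of the Actor itself. It assumes you have already loaded the saved HTML into a string, and it uses only Python's standard-library `html.parser`. The names `TableParser` and `html_table_to_markdown` are hypothetical helpers, and the sample table is made-up data, not the contents of the pricing page. The sketch treats the first row as the header and merges the rows of every table on the page into one, so a page with several tables would need extra handling.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect table rows as lists of cell strings (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row currently being parsed
        self._cell = None   # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse whitespace and escape pipes so the cell stays on one line.
            text = " ".join("".join(self._cell).split())
            self._row.append(text.replace("|", "\\|"))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Text outside a cell (for example whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)


def html_table_to_markdown(html):
    """Return a Markdown table built from the table rows found in `html`."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return ""
    # Pad short rows so every row has the same number of columns.
    width = max(len(row) for row in parser.rows)
    rows = [row + [""] * (width - len(row)) for row in parser.rows]
    lines = ["| " + " | ".join(rows[0]) + " |",
             "| " + " | ".join(["---"] * width) + " |"]
    lines += ["| " + " | ".join(row) + " |" for row in rows[1:]]
    return "\n".join(lines)


# Made-up sample data for illustration only.
sample = """
<table>
  <tr><th>Plan</th><th>Price</th></tr>
  <tr><td>Starter</td><td>$10</td></tr>
  <tr><td>Pro</td><td>$20</td></tr>
</table>
"""
print(html_table_to_markdown(sample))
```

The resulting Markdown table can then be substituted for the flattened plain text before the page content is sent to the LLM.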

Pricing

Pricing model

Pay per usage

This Actor is paid per platform usage: the Actor itself is free to use, and you pay only for Apify platform usage.