Website Content Crawler

apify/website-content-crawler

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗LangChain, LlamaIndex, and the wider LLM ecosystem.

Data not being pushed to Pinecone from WCC

Closed

Custombizio opened this issue
12 days ago

After the crawl succeeds, the WCC data is not getting pushed to Pinecone. I set up the WCC and Pinecone integration and selected auto push to Pinecone. I'm getting errors and there is no data in Pinecone.

janbuchar

Hello, we investigated this issue and it seems like it's an Apify console issue and we're working on fixing it. Thank you for reporting it!

Oscardz

The fix for this issue was deployed today. Let us know if you have any other problems.

Custombizio

10 days ago

This is what I see under run. But when I check Pinecone it shows zero records.

Custombizio

10 days ago

This is what Pinecone shows.

Custombizio

10 days ago

Hello, still no data in Pinecone. See the comments I added. It looks like it's not triggering.

Custombizio

10 days ago

No data.

Oscardz

Hello, the problem is that Pinecone stores all the text in the record metadata, so this can happen when the content is large. We recommend chunking the data: set 'performChunking' to true in the Pinecone integration, which will fix the issue.
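
To illustrate why chunking helps, here is a minimal Python sketch of the general idea. It is not the integration's actual implementation. The function names, the chunk size and overlap, and the 40,960-byte metadata limit are all assumptions made for illustration; check Pinecone's documentation for the current limit.

```python
from typing import List

# Assumed per-record metadata limit, for illustration only.
ASSUMED_METADATA_LIMIT_BYTES = 40_960


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> List[str]:
    """Split text into character windows of at most chunk_size,
    where consecutive windows share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def fits_in_metadata(chunk: str, limit: int = ASSUMED_METADATA_LIMIT_BYTES) -> bool:
    """Check whether the UTF-8 encoded text stays within the assumed limit."""
    return len(chunk.encode("utf-8")) <= limit


# Hypothetical stand-in for the text of one large crawled page.
page_text = "word " * 20_000

chunks = chunk_text(page_text)
print(fits_in_metadata(page_text))                # whole page as one record
print(all(fits_in_metadata(c) for c in chunks))   # each chunk as its own record
```

With chunking enabled, each record carries only one small piece of the page in its metadata instead of the full text, which is why a large page that fails as a single record can succeed once it is split.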

Developer
Maintained by Apify
Actor metrics
  • 3k monthly users
  • 465 stars
  • 99.9% runs succeeded
  • 3.1 days response time
  • Created in Mar 2023
  • Modified 10 days ago