Y Combinator Scraper | With Emails | $4.5 / 1K

Pricing

$4.49 / 1,000 companies

Developed by

Fatih Tahta

Maintained by Community

Scrape the entire Y Combinator directory with verified founder emails. This high-speed scraper delivers rich company and founder data from any YC list (e.g., by batch or industry). Get a clean, structured dataset of the world's top startups for deal sourcing and lead generation.

Last modified

11 days ago

Slug: fatihtahta/y-combinator-directory-scraper
Price: $4.50 per 1,000 scraped companies

This actor extracts comprehensive company and founder data from the Y Combinator directory. It scrapes detailed company profiles and founder information, then enriches the data by finding and verifying founder email addresses.

Whether you're sourcing deals, generating leads, or conducting market research, this scraper provides a clean, structured dataset of the world's most promising startups, ready for your workflow.


🚀 Features

  • 🎯 Scrape Any YC View: Target the entire directory or provide a pre-filtered YC URL (e.g., by batch, industry, or status) to get a highly specific dataset.
  • 📧 Built-in Email Enrichment: Automatically finds and verifies professional email addresses for founders. The results are tagged as "verified" or "risky" (for catch-all domains) so you can prioritize your outreach.
  • ⚡️ Optimized for Speed & Reliability: The tool is optimized for both comprehensive discovery and efficient data collection.
  • 📊 Rich Company & Founder Data: Extracts a deep dataset for each company, including their batch, status, description, social links, team size, and detailed information for up to four founders.
  • ♾️ Full Pagination Support: The actor automatically scrolls through the entire company list to discover every startup within your target view, stopping only when your maxCompanies limit is reached or the list ends.
  • 📋 Ready-to-Use Output: Download your data in clean JSON, CSV, Excel, or HTML, perfect for importing into your CRM or any analysis tool.
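
The stopping rule described under Full Pagination Support can be pictured with a short Python sketch. This only illustrates the documented behavior and is not the actor's actual code: `collect_companies` and `pages` are hypothetical names, and each page is assumed to be a plain list of company records.

```python
from itertools import chain, islice
from typing import Iterable, Iterator, Optional


def collect_companies(pages: Iterable[list], max_companies: Optional[int] = None) -> Iterator:
    """Return an iterator that stops at the limit or when the pages run out.

    Hypothetical helper: `pages` stands in for whatever the actor loads
    while scrolling the directory; None means "no limit".
    """
    companies = chain.from_iterable(pages)
    # islice with stop=None consumes everything, mirroring an empty maxCompanies
    return islice(companies, max_companies)


pages = [["A", "B"], ["C", "D"], ["E"]]
print(list(collect_companies(pages, max_companies=3)))  # ['A', 'B', 'C']
print(list(collect_companies(pages)))                   # ['A', 'B', 'C', 'D', 'E']
```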

📊 What Data Can I Extract?

This Y Combinator Scraper extracts the following data points for each company:

Company Details

  • 🏢 Company Name, Logo URL & Location
  • 🔗 YC Page URL & Company Website
  • 🔗 LinkedIn & Twitter URLs
  • 📝 Tagline & Long Description
  • 🏷️ Industry Tags (e.g., "B2B", "Fintech")
  • 🚀 YC Batch (e.g., "W24") & Status (Active, Public, etc.)
  • 📅 Year Founded & Team Size
  • 💼 Hiring Status & Number of Open Jobs

Founder Details (for up to 4 founders)

  • πŸ§‘β€πŸ’Ό Full Name & Founder ID
  • πŸ”— LinkedIn & Twitter URLs
  • πŸ“§ Email Address (if enrichment is enabled)
  • βœ… Email Status ("verified" or "risky")

💡 Why Scrape Y Combinator?

The YC Directory is a goldmine of information on high-growth startups. Scraping this data provides invaluable insights for:

  • Investment & Deal Sourcing: Identify promising startups that match your investment thesis by filtering on industry, batch, and company status.
  • Sales & Lead Generation: Build targeted lead lists of potential B2B customers based on their industry, team size, and hiring status.
  • Market Research: Analyze trends across YC batches, identify emerging industries, and track the growth of innovative companies.
  • Recruitment: Discover fast-growing startups that are actively hiring for key roles.
  • Academic Analysis: Study the characteristics of successful startups and the evolution of the YC ecosystem.

📖 How to Use This Scraper

  1. Create a free Apify account if you don't have one.
  2. Open the Y Combinator Scraper and click "Try actor".
  3. Configure Your Scrape in the Input tab:
    • Provide a YC Directory URL. Use the default or a filtered one.
    • Set the Maximum companies to scrape.
    • Enable Find founder email addresses if you need emails.
  4. Click "Start" and wait for the data to be extracted.
  5. Download your data from the Storage tab.

📥 Input Configuration

  • startUrls (string, required): The YC directory URL to start scraping from. You can use a filtered URL from your browser to target specific company types.
  • maxCompanies (number, optional): The maximum number of companies you want to find. If left empty, it will scrape all available companies.
  • enrichWithEmail (boolean, default: false): If true, the scraper will find founder emails.
  • includeRiskyEmails (boolean, default: true): If true, the scraper will include emails that are marked as "risky" and might bounce. Only applies when email enrichment is enabled.
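
As a rough illustration of how these four fields fit together, the Python sketch below assembles an input object with the documented defaults. `build_run_input` and its argument names are hypothetical helpers, not part of the actor; the `industry` and `status` query parameters are the ones that appear in this README's example URL, and any other filter name should be copied from a filtered directory URL in your browser rather than guessed.

```python
from typing import Optional
from urllib.parse import urlencode

YC_DIRECTORY = "https://www.ycombinator.com/companies"


def build_run_input(
    filters: Optional[dict] = None,
    max_companies: Optional[int] = None,
    enrich_with_email: bool = False,    # documented default
    include_risky_emails: bool = True,  # documented default
) -> dict:
    """Assemble an input object using the field names listed above."""
    if max_companies is not None and max_companies < 1:
        raise ValueError("max_companies must be a positive integer")
    url = YC_DIRECTORY
    if filters:
        # {"industry": "B2B", "status": "Active"} -> ?industry=B2B&status=Active
        url += "?" + urlencode(filters)
    run_input = {
        "startUrls": url,
        "enrichWithEmail": enrich_with_email,
        "includeRiskyEmails": include_risky_emails,
    }
    if max_companies is not None:
        # Omitting the key corresponds to "scrape all available companies"
        run_input["maxCompanies"] = max_companies
    return run_input


example = build_run_input(
    {"industry": "B2B", "status": "Active"},
    max_companies=500,
    enrich_with_email=True,
)
print(example["startUrls"])
# https://www.ycombinator.com/companies?industry=B2B&status=Active
```

The resulting dictionary can be serialized with `json.dumps` and pasted into the Input tab.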

📦 Input and Output Examples

Example Input

This example will scrape up to 500 active B2B companies and find their founders' emails.

{
  "startUrls": "https://www.ycombinator.com/companies?industry=B2B&status=Active",
  "maxCompanies": 500,
  "enrichWithEmail": true,
  "includeRiskyEmails": true
}

Example Output Dataset Item

{
  "company_name": "Innovate AI",
  "company_image": "https://bookface-images.s3.amazonaws.com/small_logos/company_logo.png",
  "company_location": "San Francisco, CA",
  "batch": "W25",
  "status": "Active",
  "short_description": "AI-powered analytics for enterprise teams.",
  "long_description": "Innovate AI provides a cutting-edge platform that leverages machine learning to automate data analysis and provide actionable insights for B2B companies.",
  "website": "https://www.innovateai.com",
  "company_linkedin": "https://www.linkedin.com/company/innovate-ai",
  "company_x": "https://twitter.com/innovateai",
  "year_founded": 2025,
  "team_size": 8,
  "is_hiring": true,
  "tags": [
    "B2B",
    "AI and Machine Learning",
    "Analytics"
  ],
  "founders/0/name": "Jane Doe",
  "founders/0/linkedin": "https://www.linkedin.com/in/janedoe",
  "founders/0/x": "https://twitter.com/janedoe",
  "founders/0/email": "jane.doe@innovateai.com",
  "founders/0/email_status": "verified",
  "founders/1/name": "John Smith",
  "founders/1/linkedin": "https://www.linkedin.com/in/johnsmith",
  "founders/1/x": "https://twitter.com/johnsmith",
  "founders/1/email": "john@innovateai.com",
  "founders/1/email_status": "risky"
}
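
The founder fields in the item above use flattened keys of the form founders/<index>/<field>. A minimal Python sketch for regrouping them and picking outreach addresses by email status follows. It assumes your export uses exactly this flattened layout; `extract_founders` and `outreach_list` are hypothetical helper names, and if your JSON export already nests founders as a list the regrouping step is unnecessary.

```python
import re
from collections import defaultdict

FOUNDER_KEY = re.compile(r"^founders/(\d+)/(\w+)$")


def extract_founders(item: dict) -> list:
    """Group flattened "founders/<index>/<field>" keys into one dict per founder."""
    grouped = defaultdict(dict)
    for key, value in item.items():
        match = FOUNDER_KEY.match(key)
        if match:
            grouped[int(match.group(1))][match.group(2)] = value
    return [grouped[i] for i in sorted(grouped)]


def outreach_list(item: dict, allow_risky: bool = False) -> list:
    """Return founder emails, verified first; risky ones only if allowed."""
    allowed = {"verified", "risky"} if allow_risky else {"verified"}
    founders = [f for f in extract_founders(item) if f.get("email_status") in allowed]
    # False sorts before True, so verified addresses come first (sort is stable)
    founders.sort(key=lambda f: f["email_status"] != "verified")
    return [f["email"] for f in founders if f.get("email")]


item = {
    "company_name": "Innovate AI",
    "founders/0/name": "Jane Doe",
    "founders/0/email": "jane.doe@innovateai.com",
    "founders/0/email_status": "verified",
    "founders/1/name": "John Smith",
    "founders/1/email": "john@innovateai.com",
    "founders/1/email_status": "risky",
}
print(outreach_list(item))                    # ['jane.doe@innovateai.com']
print(outreach_list(item, allow_risky=True))  # ['jane.doe@innovateai.com', 'john@innovateai.com']
```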

💰 How Much Will It Cost?

The actor is priced at $4.50 per 1,000 successfully scraped companies.

All infrastructure costs (proxies, compute) are bundled into this price. You only pay for the results you get. For example, scraping 5,000 company profiles will cost (5,000 / 1,000) * $4.50 = $22.50.
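
The same arithmetic can be wrapped in a small Python helper for budgeting. This is a sketch that assumes billing is strictly pro rata at the $4.50 rate stated in this section; `estimate_cost` is a hypothetical name, and the constant should be adjusted if the rate shown on the listing differs.

```python
from decimal import Decimal

PRICE_PER_1000 = Decimal("4.50")  # USD per 1,000 companies, as stated in this section


def estimate_cost(companies: int) -> Decimal:
    """Pro-rata cost estimate in USD (assumption: no minimum charge or rounding up)."""
    if companies < 0:
        raise ValueError("companies must not be negative")
    return (Decimal(companies) / 1000 * PRICE_PER_1000).quantize(Decimal("0.01"))


print(estimate_cost(5000))  # 22.50
print(estimate_cost(500))   # 2.25
```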

This scraper is designed to be ethical and only extracts publicly available data from the website. However, you should be aware that your results will contain personal data (e.g., founder names and emails). Personal data is protected by regulations like GDPR, and you should not scrape it unless you have a legitimate reason to do so. If you are unsure, it's best to consult with a legal professional.

❓ Support

If you encounter any problems or have suggestions for improvement, please open an issue in the Issues tab of the actor page on the Apify Console.

Happy Scraping! Fatih