Apify Youtube Comment Scraper

Under maintenance

Pricing

Pay per usage

Developed by

Akash Kumar Naik

Maintained by Community

0.0 (0)

0

Total users: 3

Monthly users: 3

Runs succeeded: >99%

Last modified: 10 days ago

YouTube Comment Scraper

An Apify actor that extracts comment data and video metadata from YouTube videos, with input validation and error handling for disabled comments, age restrictions, and network issues.

Features

  • Multi-URL Support: Process multiple YouTube videos in a single run
  • Advanced Comment Data: Extract author info, timestamps, like counts, reply counts, and pinned and channel-owner flags
  • Video Metadata: Get video title, channel info, view count, and description
  • Comment Sorting: Choose between top comments or newest first
  • Error Handling: Graceful handling of disabled comments, age restrictions, and network issues
  • Performance Optimized: Smart scrolling and request management
  • Input Validation: Comprehensive URL validation and parameter sanitization

Input Parameters

Required

  • Start URLs (startUrls): Array of YouTube video URLs to scrape comments from, each given as an object with a url field

Optional

  • Max Comments per Video (maxComments): Maximum number of comments to scrape per video (default: 0, meaning all available comments with no upper limit)
  • Sort By (sortBy): Comment order, either "top" or "newest" (default: "top")
  • Include Replies: Reserved for future functionality (default: false)
  • Wait Time (waitTime): Milliseconds to wait between scrolls (default: 2000)
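
The defaults listed above could be applied to a raw input object roughly as follows. This is a minimal sketch, not the actor's actual code: the keys startUrls, maxComments, sortBy, and waitTime are taken from the usage examples in this README, while the key includeReplies and the function name normalizeInput are assumptions made for illustration.

```javascript
// Hypothetical helper: applies the documented defaults to a raw input object.
// `includeReplies` is an assumed key name; the other keys appear in the
// usage examples of this README.
function normalizeInput(raw = {}) {
  const startUrls = Array.isArray(raw.startUrls) ? raw.startUrls : [];
  if (startUrls.length === 0) {
    throw new Error('startUrls is required and must contain at least one URL');
  }

  // 0 means "no limit"; negative or non-integer values fall back to 0.
  const maxComments = Number.isInteger(raw.maxComments) && raw.maxComments > 0
    ? raw.maxComments
    : 0;

  // Anything other than "newest" falls back to the default "top".
  const sortBy = raw.sortBy === 'newest' ? 'newest' : 'top';

  // Non-positive or non-numeric wait times fall back to 2000 ms.
  const waitTime = Number.isFinite(raw.waitTime) && raw.waitTime > 0
    ? raw.waitTime
    : 2000;

  return {
    startUrls,
    maxComments,
    sortBy,
    includeReplies: raw.includeReplies === true,
    waitTime,
  };
}
```

Falling back to a default instead of rejecting a malformed optional value is a design choice of this sketch; a stricter actor could fail the run instead.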

Output Data

Each comment is returned as a separate item with the following fields:

Video Information

  • videoUrl: Original video URL
  • videoTitle: Video title
  • videoDescription: Video description (truncated to 500 chars)
  • channelName: Channel name
  • channelUrl: Channel URL
  • viewCount: View count string
  • uploadDate: Upload date string

Comment Information

  • author: Comment author name
  • authorChannelUrl: Author's channel URL
  • text: Comment text content
  • textLength: Length of comment text
  • postedAt: When the comment was posted
  • likes: Number of likes on the comment
  • replyCount: Number of replies to the comment
  • isPinned: Whether the comment is pinned
  • isChannelOwner: Whether the comment is from the channel owner
  • position: Position in the comment list (1-based)
  • scrapedAt: ISO timestamp when the comment was scraped
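
To make the item shape concrete, the sketch below assembles one output item from values that have already been extracted. The function name buildCommentItem and the shapes of its video and comment arguments are hypothetical; only the output field names, the 500-character truncation, and the 1-based position come from this README.

```javascript
// Hypothetical helper: assembles one output item from values already scraped
// off the page. The shapes of `video` and `comment` are assumptions.
function buildCommentItem(video, comment, index, now = new Date()) {
  const text = comment.text ?? '';
  return {
    videoUrl: video.url,
    videoTitle: video.title,
    // The README states that the description is truncated to 500 characters.
    videoDescription: (video.description ?? '').slice(0, 500),
    channelName: video.channelName,
    channelUrl: video.channelUrl,
    viewCount: video.viewCount,
    uploadDate: video.uploadDate,
    author: comment.author,
    authorChannelUrl: comment.authorChannelUrl,
    text,
    // Assumption: length is measured with String.prototype.length.
    textLength: text.length,
    postedAt: comment.postedAt,
    likes: comment.likes,
    replyCount: comment.replyCount,
    isPinned: Boolean(comment.isPinned),
    isChannelOwner: Boolean(comment.isChannelOwner),
    position: index + 1, // 1-based position in the comment list
    scrapedAt: now.toISOString(),
  };
}
```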

Error Handling

The actor handles various error conditions gracefully:

  • Invalid URLs: Skips invalid YouTube URLs with warnings
  • Disabled Comments: Reports when comments are disabled
  • Age Restrictions: Automatically handles age verification prompts
  • Network Issues: Implements retry mechanisms
  • Rate Limiting: Uses single concurrency to avoid rate limits
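
One common way to implement a retry mechanism like the one listed above is a small wrapper around an async task. The sketch below is an illustration under stated assumptions, not the actor's implementation: the name withRetries and the fixed delay between attempts are hypothetical, and the limit of 3 retries is taken from the Technical Details section.

```javascript
// Hypothetical helper: runs an async task and retries it on failure.
// With maxRetries = 3 the task is attempted at most 4 times in total.
async function withRetries(task, { maxRetries = 3, delayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt += 1) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) {
        // Assumption: a fixed pause between attempts.
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```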

Usage Examples

Basic Usage

{
  "startUrls": [
    {"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"}
  ],
  "maxComments": 50
}

Advanced Usage

{
  "startUrls": [
    {"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"},
    {"url": "https://youtu.be/oHg5SJYRHA0"}
  ],
  "maxComments": 200,
  "sortBy": "newest",
  "waitTime": 3000
}

Supported URL Formats

  • https://www.youtube.com/watch?v=VIDEO_ID
  • https://youtube.com/watch?v=VIDEO_ID
  • https://youtu.be/VIDEO_ID
  • https://m.youtube.com/watch?v=VIDEO_ID

Performance Notes

  • Memory: 4 GB of RAM is recommended for optimal performance
  • Timeout: Allow enough run time for large comment counts; the per-video timeout is 10 minutes, sized for up to 500 comments
  • Rate Limiting: The actor processes one video at a time to avoid YouTube rate limits
  • Scrolling: Automatic intelligent scrolling to load all available comments
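
The scrolling behavior can be pictured as a loop that keeps scrolling until the comment count stops growing or the requested maximum is reached. The sketch below is an assumption about how such a loop might be written, not the actor's code: loadComments, scrollOnce, countComments, and maxIdleRounds are hypothetical names, and the two callbacks are injected so the logic runs without a browser.

```javascript
// Hypothetical sketch of a scroll-until-stable loop.
// `scrollOnce` scrolls the page; `countComments` returns how many comments
// are currently loaded. Both are supplied by the caller.
async function loadComments({
  scrollOnce,
  countComments,
  maxComments = 0, // 0 means no limit
  waitTime = 2000, // milliseconds to wait after each scroll
  maxIdleRounds = 3, // assumed: stop after this many scrolls without growth
}) {
  let loaded = await countComments();
  let idleRounds = 0;

  while (idleRounds < maxIdleRounds && (maxComments === 0 || loaded < maxComments)) {
    await scrollOnce();
    await new Promise((resolve) => setTimeout(resolve, waitTime));

    const current = await countComments();
    // Reset the idle counter whenever new comments appeared.
    idleRounds = current > loaded ? 0 : idleRounds + 1;
    loaded = current;
  }

  // Number of comments the caller should read from the page.
  return maxComments === 0 ? loaded : Math.min(loaded, maxComments);
}
```

With Playwright, scrollOnce could wrap page.mouse.wheel(0, 2000) and countComments could wrap a locator's count() call; the selector needed to match comment elements depends on YouTube's current markup and is not specified in this README.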

Troubleshooting

No Comments Found

  • Video may have comments disabled
  • Video may be age-restricted
  • Video may be private or removed

Incomplete Data

  • Fields that are not available on the page may be returned as "Unknown"
  • YouTube's dynamic loading may require longer wait times; try increasing Wait Time (waitTime) if comments are missing
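
When post-processing results, it can be convenient to turn the "Unknown" placeholder into a proper missing value. The helper below is a hypothetical consumer-side sketch, not part of the actor; it assumes the placeholder is exactly the string "Unknown".

```javascript
// Hypothetical post-processing helper: returns a copy of an output item in
// which every field equal to the string "Unknown" is replaced with null.
function normalizeUnknown(item) {
  return Object.fromEntries(
    Object.entries(item).map(([key, value]) => [key, value === 'Unknown' ? null : value]),
  );
}
```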

Rate Limiting

  • The actor is designed to respect YouTube's rate limits
  • If you encounter issues, try reducing the number of concurrent runs

Technical Details

  • Browser: Uses Playwright with the Chrome browser
  • Concurrency: Single request processing to avoid rate limits
  • Retries: Up to 3 retries for failed requests
  • Timeout: 10 minutes per video (sized for up to 500 comments)
  • Memory: Optimized memory usage with streaming data processing
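
If the actor were built on Crawlee's PlaywrightCrawler (an assumption; this README names only Playwright), the limits above would map onto crawler options roughly as shown here. The option names are Crawlee's; the variable name crawlerOptions is illustrative.

```javascript
// Assumed mapping of the limits above onto Crawlee's PlaywrightCrawler
// options. Whether the actor uses Crawlee at all is not stated in this README.
const crawlerOptions = {
  maxConcurrency: 1, // process one request at a time
  maxRequestRetries: 3, // up to 3 retries for failed requests
  requestHandlerTimeoutSecs: 10 * 60, // 10-minute timeout per video
};

// With Crawlee installed, these could be passed along with a handler:
//   new PlaywrightCrawler({ ...crawlerOptions, requestHandler })
```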

Deployment

Local Development

  1. Install dependencies: npm install
  2. Run locally: npm start

Apify Platform

  1. Create a new actor in Apify Console
  2. Upload these files: main.js, package.json, INPUT_SCHEMA.json, actor.json
  3. Build and run the actor

License

Apache-2.0 License

Support

For issues, feature requests, or questions, please contact the developer or check the actor's documentation on the Apify platform.