X Crawler

3 days trial then $3.00/month - No credit card required now

lumen_limitless/x-crawler

X (Twitter) Crawler

This project is a web scraper designed to extract user data and tweets from X (formerly known as Twitter) using Crawlee and Playwright.

Purpose

The main purpose of this scraper is to:

  1. Navigate to a specified X user profile
  2. Extract user information
  3. Collect the 100 most-liked tweets from the user's timeline

This tool can be useful for various applications, such as:

  • Social media analysis
  • User behavior research
  • Content aggregation
  • Sentiment analysis

Features

  • Utilizes Playwright for browser automation
  • Implements Crawlee for efficient web crawling
  • Extracts user profile data
  • Collects recent tweets from the user's timeline
  • Handles X's dynamic content loading

Usage

  1. Set the target user's profile URL in the startUrls array in the input configuration.
  2. Adjust the maxRequestsPerCrawl value to limit the number of requests if needed.
  3. Run the scraper to collect data.
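
The steps above can be sketched as a small helper that builds the input object. This is only an illustration: build_input is a hypothetical name, and the list-of-{"url": ...} shape for startUrls is an assumption based on the common Apify convention, not something this page confirms. Only the startUrls and maxRequestsPerCrawl field names come from the text above.

```python
import json


def build_input(profile_url: str, max_requests: int = 50) -> dict:
    """Build an input object for the scraper (field shapes are assumed)."""
    if not profile_url.startswith(("https://x.com/", "https://twitter.com/")):
        raise ValueError(f"not an X profile URL: {profile_url}")
    if max_requests < 1:
        raise ValueError("max_requests must be at least 1")
    return {
        # Assumption: each start URL is wrapped as {"url": ...}.
        "startUrls": [{"url": profile_url}],
        # Upper bound on the number of requests the crawl may make.
        "maxRequestsPerCrawl": max_requests,
    }


if __name__ == "__main__":
    # Print the JSON that would be pasted into the input configuration.
    print(json.dumps(build_input("https://x.com/elonmusk", 20), indent=2))
```

Validating the URL and the request limit before a run is a cheap way to avoid spending a crawl on a malformed input.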

Output

The scraper outputs two main types of data:

  1. User information
  2. Recent tweets from the user's timeline

Both types are returned as structured JSON objects, so they can be processed directly for further analysis or integrated into other systems.
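
As a rough sketch of how the two record types might be separated after a run, the helper below splits a list of result items by their __typename field. It assumes the results are already loaded as a list of dictionaries, that tweets carry "__typename": "Tweet", and that a user record may arrive wrapped under a "user" key. The function name split_records is hypothetical.

```python
from typing import Any


def split_records(items: list[dict[str, Any]]) -> tuple[list[dict], list[dict]]:
    """Separate result items into (users, tweets) using __typename."""
    users: list[dict] = []
    tweets: list[dict] = []
    for item in items:
        # Assumption: a user record may be wrapped as {"user": {...}}.
        record = item.get("user", item)
        kind = record.get("__typename")
        if kind == "User":
            users.append(record)
        elif kind == "Tweet":
            tweets.append(record)
        # Items with any other type are ignored.
    return users, tweets


if __name__ == "__main__":
    items = [
        {"user": {"__typename": "User", "rest_id": "44196397"}},
        {"__typename": "Tweet", "rest_id": "1519480761749016577"},
    ]
    users, tweets = split_records(items)
    print(len(users), "user record(s),", len(tweets), "tweet record(s)")
```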

Output Example

Here's an example of the structured output you can expect from this scraper:

User Object

"user": {
  "__typename": "User",
  "id": "VXNlcjo0NDE5NjM5Nw==",
  "rest_id": "44196397",
  "affiliates_highlighted_label": {
    "label": {
      "url": {
        "url": "https://twitter.com/X",
        "urlType": "DeepLink"
      },
      "badge": {
        "url": "https://pbs.twimg.com/profile_images/1683899100922511378/5lY42eHs_bigger.jpg"
      },
      "description": "X",
      "userLabelType": "BusinessLabel",
      "userLabelDisplayType": "Badge"
    }
  },
  "is_blue_verified": true,
  "profile_image_shape": "Circle",
  "legacy": {
    "created_at": "Tue Jun 02 20:12:29 +0000 2009",
    "default_profile": false,
    "default_profile_image": false,
    "description": "",
    "entities": {
      "description": {
        "urls": []
      }
    },
    "fast_followers_count": 0,
    "favourites_count": 60807,
    "followers_count": 189827332,
    "friends_count": 662,
    "has_custom_timelines": true,
    "is_translator": false,
    "listed_count": 152087,
    "location": "",
    "media_count": 2308,
    "name": "Elon Musk",
    "normal_followers_count": 189827332,
    "pinned_tweet_ids_str": [
      "1813310196506349995"
    ],
    "possibly_sensitive": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/44196397/1690621312",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1780044485541699584/p78MCn3B_normal.jpg",
    "profile_interstitial_type": "",
    "screen_name": "elonmusk",
    "statuses_count": 47242,
    "translator_type": "none",
    "verified": false,
    "withheld_in_countries": []
  },
  "professional": {
    "rest_id": "1679729435447275522",
    "professional_type": "Creator",
    "category": []
  },
  "tipjar_settings": {
    "is_enabled": false,
    "bandcamp_handle": "",
    "bitcoin_handle": "",
    "cash_app_handle": "",
    "ethereum_handle": "",
    "gofundme_handle": "",
    "patreon_handle": "",
    "pay_pal_handle": "",
    "venmo_handle": ""
  },
  "legacy_extended_profile": {},
  "is_profile_translatable": false,
  "has_hidden_subscriptions_on_profile": false,
  "verification_info": {
    "is_identity_verified": false,
    "reason": {
      "description": {
        "text": "This account is verified because it's an affiliate of @X on X. Learn more",
        "entities": [
          {
            "from_index": 54,
            "to_index": 56,
            "ref": {
              "url": "https://twitter.com/X",
              "url_type": "ExternalUrl"
            }
          },
          {
            "from_index": 63,
            "to_index": 73,
            "ref": {
              "url": "https://help.twitter.com/en/rules-and-policies/profile-labels",
              "url_type": "ExternalUrl"
            }
          }
        ]
      },
      "verified_since_msec": "-156836000000000",
      "override_verified_year": -3000
    }
  },
  "highlights_info": {
    "can_highlight_tweets": true,
    "highlighted_tweets": "265"
  },
  "user_seed_tweet_count": 0,
  "business_account": {},
  "creator_subscriptions_count": 151
},
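
To illustrate how a user object like the one above might be flattened for analysis, here is a minimal sketch. The helper name summarize_user and the choice of fields are hypothetical. The date format string is inferred from the created_at value in the example, and the decoding step relies on the observation that the example's id field is Base64 text; whether every id follows that pattern is an assumption.

```python
import base64
from datetime import datetime

# Format inferred from the example: "Tue Jun 02 20:12:29 +0000 2009".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def summarize_user(user: dict) -> dict:
    """Pick a few commonly needed fields out of a user object."""
    legacy = user["legacy"]
    created = datetime.strptime(legacy["created_at"], CREATED_AT_FORMAT)
    return {
        "rest_id": user["rest_id"],
        # Assumption: "id" is Base64 text, as it is in the example above.
        "decoded_id": base64.b64decode(user["id"]).decode("utf-8"),
        "screen_name": legacy["screen_name"],
        "name": legacy["name"],
        "followers": legacy["followers_count"],
        "created_at": created.isoformat(),
    }


if __name__ == "__main__":
    # Trimmed copy of the example user object shown above.
    sample_user = {
        "id": "VXNlcjo0NDE5NjM5Nw==",
        "rest_id": "44196397",
        "legacy": {
            "created_at": "Tue Jun 02 20:12:29 +0000 2009",
            "followers_count": 189827332,
            "name": "Elon Musk",
            "screen_name": "elonmusk",
        },
    }
    print(summarize_user(sample_user))
```

Note that most profile fields live under the nested legacy key rather than at the top level, so lookups need to go through it.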

Tweet Object

{
  "__typename": "Tweet",
  "rest_id": "1519480761749016577",
  "unmention_data": {},
  "is_translatable": false,
  "views": {
    "state": "Enabled"
  },
  "source": "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>",
  "legacy": {
    "bookmark_count": 21256,
    "bookmarked": false,
    "created_at": "Thu Apr 28 00:56:58 +0000 2022",
    "conversation_id_str": "1519480761749016577",
    "display_text_range": [
      0,
      52
    ],
    "entities": {
      "hashtags": [],
      "symbols": [],
      "timestamps": [],
      "urls": [],
      "user_mentions": []
    },
    "favorite_count": 4468299,
    "favorited": false,
    "full_text": "Next I’m buying Coca-Cola to put the cocaine back in",
    "is_quote_status": false,
    "lang": "en",
    "quote_count": 166677,
    "reply_count": 182762,
    "retweet_count": 625073,
    "retweeted": false,
    "user_id_str": "44196397",
    "id_str": "1519480761749016577"
  },
  "quick_promote_eligibility": {
    "eligibility": "IneligibleUserUnauthorized"
  }
},
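
A tweet object like the one above can be ranked or aggregated with a few lines of code. The sketch below is illustrative only: top_liked, engagement and visible_text are hypothetical helper names, the engagement total is one possible metric rather than anything the scraper defines, and treating display_text_range as character offsets into full_text is an assumption.

```python
def top_liked(tweets: list[dict], n: int = 100) -> list[dict]:
    """Return up to n tweets, most-liked first."""
    return sorted(
        tweets, key=lambda t: t["legacy"]["favorite_count"], reverse=True
    )[:n]


def engagement(tweet: dict) -> int:
    """Sum likes, retweets, replies and quotes (an illustrative metric)."""
    legacy = tweet["legacy"]
    keys = ("favorite_count", "retweet_count", "reply_count", "quote_count")
    return sum(legacy[key] for key in keys)


def visible_text(tweet: dict) -> str:
    """Slice full_text by display_text_range (assumed character offsets)."""
    legacy = tweet["legacy"]
    start, end = legacy["display_text_range"]
    return legacy["full_text"][start:end]


if __name__ == "__main__":
    # Trimmed copy of the example tweet object shown above.
    sample_tweet = {
        "legacy": {
            "display_text_range": [0, 52],
            "favorite_count": 4468299,
            "full_text": "Next I’m buying Coca-Cola to put the cocaine back in",
            "quote_count": 166677,
            "reply_count": 182762,
            "retweet_count": 625073,
        }
    }
    print(visible_text(sample_tweet))
    print("engagement:", engagement(sample_tweet))
```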

Note

Please ensure you comply with X's terms of service and respect rate limits when using this scraper.

Developer
Maintained by Community
Actor metrics
  • 10 monthly users
  • 1 star
  • 100.0% runs succeeded
  • Created in Jul 2024
  • Modified about 1 month ago