Pricing

Pay per usage

Go to Store

Fast Instagram Profile Posts Scraper 🚀

Try for free

Developed by

Instagram Scraper

Instagram Post Scraper. Post data: hashtags, comment_count, like_count, usertags, images, videos, shortcode and etc. Scrape Instagram Posts with Ease and Speed

3.8 (3)

Pricing

Pay per usage

Total users

190

Monthly users

Runs succeeded

97%

Last modified

2 months ago

Social media

Developer tools

Back to issues Create new issue

FEATURE REQUEST / View count

Closed

Labsed opened this issue

Hey,

Is it possible to add view_count to the post item?

Best

Labsed

And also the video duration.

Instagram Scraper (instagram-scraper)

view_count has been added, but it looks like it's basically null, and I need to make sure that the video duration is available via the video_dash_manifest info.

Labsed

Thanks for considering the feedback. Attaching my crawler in case it can help you with the manifest manipulation, or any other means.

import {
  createHttpRouter,
  HttpCrawler,
  HttpCrawlerOptions,
  HttpCrawlingContext,
  ProxyConfiguration,
} from "crawlee";
import jp from "jsonpath";
import * as cheerio from "cheerio";
import Video from "../../entities/video";
import Channel from "../../entities/channel";
import ChannelStats from "../../entities/channel-stats";
import VideoStats from "../../entities/video-stats";
import { CrawlerInterface, Platform } from "../../types";

export class InstagramCrawler
  extends HttpCrawler<any>
  implements CrawlerInterface
{
  constructor(options?: HttpCrawlerOptions) {
    const router = createHttpRouter();

    router.addDefaultHandler(async ({ request, json }: HttpCrawlingContext) => {
      const nodes: Record<string, any>[] = jp.query(
        json,
        "$.data.user.edge_owner_to_timeline_media.edges[?(@.node.__typename=='GraphVideo')].node",
      );

      for (const node of nodes) {
        const video = Video.create({
          channel: request.userData.channel,
          videoId: node.shortcode,
          title: node.edge_media_to_caption?.edges[0].node.text ?? "",
          duration: extractDurationFromManifest(
            node.dash_info.video_dash_manifest,
          ),
          publishedAt: new Date(node.taken_at_timestamp * 1000),
        });

        await Video.upsert(video, ["channel", "videoId"]);

        await VideoStats.save({
          video,
          views: node.video_view_count,
          comments: node.edge_media_to_comment.count,
          reactions: node.edge_media_preview_like.count,
          topReactions: [
            { name: "Like", count: node.edge_media_preview_like.count },
          ],
        });
      }
    });

    router.addHandler(
      "stats",
      async ({ json, request }: HttpCrawlingContext) => {
        const data: Record<string, any> = jp.value(
          json,
          "$..data.xdt_shortcode_media",
        );

        await Video.update(
          { id: request.userData.video.id },
          {
            title: data.edge_media_to_caption?.edges[0].node.text ?? "",
            duration: parseInt(data.video_duration),
            publishedAt: new Date(data.taken_at_timestamp * 1000),
          },
        );

        await VideoStats.save({
          video: request.userData.video,
          views: data.video_view_count,
          comments: data.edge_media_to_parent_comment.count,
          reactions: data.edge_media_preview_like.count,
          topReactions: [
            { name: "Like", count: data.edge_media_preview_like.count },
          ],
        });
      },
    );

    router.addHandler("cadd", async ({ log, json }: HttpCrawlingContext) => {
      const data: Record<string, any> = jp.value(json, "$..data.user");

      log.info("Got channel data", data);

      const channel = Channel.create({
        platform: Platform.INSTAGRAM,
        channelId: data.id,
        username: data.username,
        name: data.full_name,
      });

      await Channel.upsert(channel, ["platform", "channelId"]);

      log.info("Channel saved", channel);
    });

    router.addHandler(
      "cstats",
      async ({ request, json }: HttpCrawlingContext) => {
        await ChannelStats.save({
          channel: request.userData.channel,
          followers: json.data.user.edge_followed_by.count,
        });
      },
    );

    let proxyConfiguration: ProxyConfiguration | undefined;

    if (process.env.IG_PROXY_URL) {
      proxyConfiguration = new ProxyConfiguration({
        proxyUrls: [process.env.IG_PROXY_URL],
      });
    }

    super({
      ...options,
      proxyConfiguration,
      requestHandler: router,
      preNavigationHooks: [
        async (_: HttpCrawlingContext, gotOptions: any) => {
          gotOptions.headers = {
            "X-IG-App-ID": "936619743392459",
          };
        },
      ],
    });
  }

  addChannelAddRequest(url: string) {
    return this.addRequests([{ url, label: "cadd" }]);
  }

  addChannelStatsRequests(channels: Channel[]) {
    return this.addRequests(
      channels.map((channel) => ({
        url: `https://www.instagram.com/api/v1/users/web_profile_info/?username=${channel.username}`,
        label: "cstats",
        userData: { channel },
      })),
    );
  }

  addChannelVideoRequests(channels: Channel[]) {
    return this.addRequests(
      channels.map((channel) => ({
        url: `https://www.instagram.com/api/v1/users/web_profile_info/?username=${channel.username}`,
        userData: { channel },
      })),
    );
  }

  addVideoStatsRequests(videos: Video[]) {
    return this.addRequests(
      videos.map((video) => ({
        url: "https://www.instagram.com/graphql/query",
        method: "POST",
        headers: { "Content-Type": "application/x-www-form-urlencoded" },
        payload: new URLSearchParams({
          doc_id: "8845758582119845",
          variables: JSON.stringify({ shortcode: video.videoId }),
        }).toString(),
        useExtendedUniqueKey: true,
        label: "stats",
        userData: { video },
      })),
    );
  }
}

function extractDurationFromManifest(manifestXml: string): number {
  const match = cheerio
    .load(manifestXml, { xmlMode: true })("MPD")
    .attr("mediaPresentationDuration")
    ?.match(/^PT([\d.]+)S$/);

  return match?.[1] ? parseInt(match[1]) : 0;
}

Instagram Scraper (instagram-scraper)

Thanks a lot, I found the video_duration parameter, I grabbed the post collection and the parameter was not as comprehensive as the single page information

Add comment

Instagram Posts Scraper

lazyscraper/instagram-posts-scraper

Instagram Profile & Post Analytics Scraper: Extract Public Data with Engagement Metrics

Lazy Scraper

Instagram Post Details Scraper 📸

powerful_bachelor/instagram-post-details-scraper

📸 Extract detailed Instagram post data effortlessly! Get likes, comments, captions, hashtags & more. Perfect for 📊 analytics, 🎯 marketing research & 📈 performance tracking. Easy setup, powerful results! Supports multiple formats: JSON, CSV, Excel. Start scraping Instagram posts today! 🚀✨

Powerful Bachelor

2.0

Instagram User Posts details Scraper Pro

getdataforme/instagram-user-posts-details-scraper-pro

We've built an Instagram user post scraper that helps you fetch user posts along with key insights like comment count, image, like count, and post URL. Use it to track influencer posts, analyze engagement, and gain valuable Instagram analytics. Instagram user post scraper

GetDataForMe

1.0

Instagram Posts Scraper

pratikdani/instagram-posts-scraper

The Instagram Post Scraper extracts detailed data from any valid post URL, including content, engagement metrics, user details, and media. It delivers structured output for social media analysis, monitoring, and research.

Pratik Dani

161

Instagram User's Post Scraper

devil_port369-owner/instagram-posts-scraper

Instagram Post Scraper lets you extract public posts from any Instagram profile. Simply input usernames and download posts in formats like Excel, CSV, or JSON for data analysis, reports, and visualizations. Ideal for market research, audience engagement, AI training, and sentiment analysis.

DataFusionX

Instagram Posts Scraper

easyapi/instagram-posts-scraper

Effortlessly scrape Instagram posts data from multiple users. Extract comprehensive details including likes, comments, captions, hashtags, and more. Perfect for social media analysis, market research, and content monitoring.