instagram scraped profile feeds don't include video mp4 URLs #123

Open
snarfed opened this issue Dec 7, 2017 · 2 comments

Comments

@snarfed
Owner

snarfed commented Dec 7, 2017

...because instagram's embedded JSON in profile pages doesn't have the video_url field with the mp4 link. we'd need to fetch each video post permalink individually to get it.

totally doable, but i probably won't prioritize this until someone asks for it. just tracking for now. cc @aaronpk.

scraped news feeds (with session cookie) and individual permalinks do include the mp4 URLs.
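A rough sketch of that workaround, not anything from the codebase: walk the profile nodes and make the extra permalink request only for videos that lack video_url. The names fill_video_urls and fetch_post are hypothetical; fetch_post stands in for whatever code fetches a post's permalink and returns its parsed media object.

```python
from typing import Callable, Iterable, List, Optional


def fill_video_urls(nodes: Iterable[dict],
                    fetch_post: Callable[[dict], Optional[dict]]) -> List[dict]:
    """Return copies of profile nodes, adding video_url where it can be found.

    fetch_post is a hypothetical callable: given a profile node, it fetches
    that post's permalink and returns the parsed media object, or None.
    """
    filled = []
    for node in nodes:
        node = dict(node)  # shallow copy so the caller's data is untouched
        # Only videos that lack the mp4 link need the extra request.
        if node.get("is_video") and not node.get("video_url"):
            media = fetch_post(node) or {}
            if media.get("video_url"):
                node["video_url"] = media["video_url"]
        filled.append(node)
    return filled
```

Photos and nodes that already carry video_url are passed through without a request, so the cost is one extra fetch per video post.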

@snarfed
Owner Author

snarfed commented Dec 7, 2017

for comparison, here's a video node in the embedded JSON on https://www.instagram.com/thejohnnysmith/ , entry_data.ProfilePage[0].user.media.nodes[0]:

{
  "__typename": "GraphVideo",
  "id": "1663943750965504332",
  "comments_disabled": false,
  "dimensions": {
    "height": 750,
    "width": 750
  },
  "gating_info": null,
  "media_preview": "ACoqseeudoGece9RvbleTgA9v61pMiEiReD6ggA/X2rFurkzthTtUcZ9fc+3tWt+xFgaZEOOv05oW6jJ5Bx+FZztIrZbBB74q2sKtGJFIyc5U9seh70uYfKWWvFxlVwfc5H5cVGbpz6D2xVR3Cdsk+nT/PtSBZz0/nRcLD0uioMbHAPcc/p2p0Ww8B/wxj+fWmPHuHHBB/OoUjLcYz6dqi4y9dWwyFJIJA6emc9/eowmeMlQOABj8/xovhIjAMdwwMfl0qNpQeOoxwT+ePwpjJVjCduPbOf8aeLkY/8AriqkUpPyHj0NWTGhOT1/D/CkAjZxx1xTEYZycKQep7j+vrUbMQvXtTk+Yc849fpSQjRu4xJGOecja3qf8Md/aspSA2G4HGP8981pE/uh7Fv/AEGsQn+lUwLhhQ4I98il2L/z0x+FQA8f59KtKBgVLGf/2Q==",
  "owner": {
    "id": "654594"
  },
  "thumbnail_src": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-15/s640x640/e15/24839044_1500287820061417_8285790670526873600_n.jpg",
  "thumbnail_resources": [
    {
      "src": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-15/s150x150/e15/24839044_1500287820061417_8285790670526873600_n.jpg",
      "config_width": 150,
      "config_height": 150
    },
    "..."
  ],
  "is_video": true,
  "code": "BcXhAaKh6lM",
  "date": 1512577610,
  "display_src": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-15/e15/24839044_1500287820061417_8285790670526873600_n.jpg",
  "video_views": 3790,
  "caption": "Murica. Animated. #imagemanipulation #digitalart #animation #republicans",
  "comments": {
    "count": 10
  },
  "likes": {
    "count": 1024
  }
}

...and here's the same video node in the permalink, https://www.instagram.com/p/BcXhAaKh6lM/ , entry_data.PostPage[0].graphql.shortcode_media. so different it's not even worth diffing.

{
  "__typename": "GraphVideo",
  "id": "1663943750965504332",
  "shortcode": "BcXhAaKh6lM",
  "dimensions": {
    "height": 750,
    "width": 750
  },
  "gating_info": null,
  "media_preview": "ACoqseeudoGece9RvbleTgA9v61pMiEiReD6ggA/X2rFurkzthTtUcZ9fc+3tWt+xFgaZEOOv05oW6jJ5Bx+FZztIrZbBB74q2sKtGJFIyc5U9seh70uYfKWWvFxlVwfc5H5cVGbpz6D2xVR3Cdsk+nT/PtSBZz0/nRcLD0uioMbHAPcc/p2p0Ww8B/wxj+fWmPHuHHBB/OoUjLcYz6dqi4y9dWwyFJIJA6emc9/eowmeMlQOABj8/xovhIjAMdwwMfl0qNpQeOoxwT+ePwpjJVjCduPbOf8aeLkY/8AriqkUpPyHj0NWTGhOT1/D/CkAjZxx1xTEYZycKQep7j+vrUbMQvXtTk+Yc849fpSQjRu4xJGOecja3qf8Md/aspSA2G4HGP8981pE/uh7Fv/AEGsQn+lUwLhhQ4I98il2L/z0x+FQA8f59KtKBgVLGf/2Q==",
  "display_url": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-15/e15/24839044_1500287820061417_8285790670526873600_n.jpg",
  "display_resources": [
    {
      "src": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-15/s640x640/e15/24839044_1500287820061417_8285790670526873600_n.jpg",
      "config_width": 640,
      "config_height": 640
    },
    "..."
  ],
  "dash_info": {
    "is_dash_eligible": false,
    "video_dash_manifest": null,
    "number_of_qualities": 0
  },
  "video_url": "https://instagram.frir1-1.fna.fbcdn.net/vp/9073fc67f141289491118de09540bb9f/5A2C4B4D/t50.2886-16/24222831_239068156630392_3793391278482259968_n.mp4",
  "video_view_count": 3798,
  "is_video": true,
  "should_log_client_event": false,
  "tracking_token": "eyJ2ZXJzaW9uIjo1LCJwYXlsb2FkIjp7ImlzX2FuYWx5dGljc190cmFja2VkIjp0cnVlLCJ1dWlkIjoiNWM4M2I4NDYwY2I0NDczNmIxYmY2MjgzMWRiMDk3YmExNjYzOTQzNzUwOTY1NTA0MzMyIn0sInNpZ25hdHVyZSI6IiJ9",
  "edge_media_to_tagged_user": {
    "edges": []
  },
  "edge_media_to_caption": {
    "edges": [
      {
        "node": {
          "text": "Murica. Animated. #imagemanipulation #digitalart #animation #republicans"
        }
      }
    ]
  },
  "caption_is_edited": true,
  "edge_media_to_comment": {
    "count": 10,
    "page_info": {
      "has_next_page": false,
      "end_cursor": null
    },
    "edges": [
      {
        "node": {
          "id": "17897454409126928",
          "text": "@marina...",
          "created_at": 1512577690,
          "owner": {
            "id": "1039315351",
            "profile_pic_url": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-19/s150x150/23970161_157151168230471_1728751850200498176_n.jpg",
            "username": "bysk..."
          }
        }
      },
      "..."
    ]
  },
  "comments_disabled": false,
  "taken_at_timestamp": 1512577610,
  "edge_media_preview_like": {
    "count": 1025,
    "edges": [
      {
        "node": {
          "id": "1370321247",
          "profile_pic_url": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-19/s150x150/12523811_801579049964891_270399277_a.jpg",
          "username": "pxn..."
        }
      },
      "..."
    ]
  },
  "edge_media_to_sponsor_user": {
    "edges": []
  },
  "location": null,
  "viewer_has_liked": false,
  "viewer_has_saved": false,
  "viewer_has_saved_to_collection": false,
  "owner": {
    "id": "654594",
    "profile_pic_url": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-19/s150x150/24175492_141781853142224_1588976035087515648_n.jpg",
    "username": "thejohnnysmith",
    "blocked_by_viewer": false,
    "followed_by_viewer": false,
    "full_name": "Johnny Smith",
    "has_blocked_viewer": false,
    "is_private": false,
    "is_unpublished": false,
    "is_verified": false,
    "requested_by_viewer": false
  },
  "is_ad": false,
  "edge_web_media_to_related_media": {
    "edges": []
  }
}
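For reference, a minimal sketch of reading both shapes once the embedded JSON has been parsed into a dict. The two paths are the ones quoted above; the helper names (dig, profile_nodes, permalink_media, mp4_url) are made up for illustration, and the paths are only known to match the December 2017 pages shown here.

```python
from typing import Any, Optional


def dig(data: Any, *path: Any) -> Any:
    """Follow dict keys / list indexes, returning None if any step is missing."""
    for step in path:
        try:
            data = data[step]
        except (KeyError, IndexError, TypeError):
            return None
    return data


def profile_nodes(page_json: dict) -> list:
    """Media nodes from a profile page's embedded JSON (no video_url here)."""
    return dig(page_json, "entry_data", "ProfilePage", 0,
               "user", "media", "nodes") or []


def permalink_media(page_json: dict) -> Optional[dict]:
    """The single media object from a post permalink's embedded JSON."""
    return dig(page_json, "entry_data", "PostPage", 0,
               "graphql", "shortcode_media")


def mp4_url(media: Optional[dict]) -> Optional[str]:
    """video_url if the object is a video and carries one, else None."""
    if media and media.get("is_video"):
        return media.get("video_url")
    return None
```

With the two objects above, mp4_url returns None for the profile node and the .mp4 link for the permalink object.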

@JorgeCastilloPrz

JorgeCastilloPrz commented Dec 31, 2018

Do you have any idea how to request it using the info provided in the profile JSON (the one you get from links like https://www.instagram.com/loganpaul/?__a=1)? I can see it doesn't include the video URL, so how do I relate an item to its video permalink in order to download the actual video?

OK, found it. The video ID is the code (or shortcode) field on the item, which is the last path segment of the post's permalink.
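As a small illustration of that mapping, assuming the https://www.instagram.com/p/<code>/ permalink form seen earlier in this thread (permalink_for is a hypothetical helper name):

```python
def permalink_for(item: dict) -> str:
    """Build a post permalink from a media item's code / shortcode field."""
    # Profile nodes in this thread use "code"; the permalink JSON uses "shortcode".
    code = item.get("code") or item.get("shortcode")
    if not code:
        raise ValueError("item has neither 'code' nor 'shortcode'")
    return "https://www.instagram.com/p/%s/" % code


print(permalink_for({"code": "BcXhAaKh6lM"}))
# https://www.instagram.com/p/BcXhAaKh6lM/
```

Fetching that URL and reading video_url from its embedded JSON is then the per-post request described at the top of the issue.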
