ig-profie-scraper

Fetch and save real-time data anonymously from any Instagram profile without using the official API.

Table of Contents

Prerequisites
Installation and Setup
Features
License

Prerequisites

Before you continue, ensure you meet the following requirements.

You are using a Linux or Windows machine.
You have installed the latest versions of Python, Firefox, and Geckodriver.
You have installed the latest version of Tor, and it is running and listening on SOCKSPort 9050.
You have installed xvfb (Linux only).
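
Before starting the scraper, it can help to confirm that something is actually listening on the Tor SOCKS port. The following is a minimal sketch using only the standard library; the helper name `is_port_open` and the `127.0.0.1` host are assumptions for illustration and are not part of this project. It only confirms that a TCP listener exists on the port, not that the listener is Tor.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError if nothing accepts the connection
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 9050 is the SOCKSPort named in the prerequisites; the host is assumed
    if is_port_open("127.0.0.1", 9050):
        print("A service is listening on port 9050.")
    else:
        print("Nothing is listening on port 9050; is Tor running?")
```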

Installation and Setup

Detailed step-by-step installation instructions for both Windows and Linux are available here.

Clone or download this project, then run the following command in the project directory.
```
pip install -r requirements.txt
```
Open config.py in your favourite text editor and make the following changes:
- Set the timezone for your country or state.
```
TIMEZONE = timezone("Asia/Kolkata")
```
- Add the credentials of your temporary Instagram accounts to the ids dictionary.
```
ids = {
    "<USERNAME_OR_EMAIL_HERE>" : "<PASSWORD_HERE>",
    "<USERNAME_OR_EMAIL_HERE>" : "<PASSWORD_HERE>"
}
```
- Add the usernames of the profiles you want to scrape to the usernames list.
```
usernames = ["<USERNAME1>", "<USERNAME2>"]
```
- Add your Slack webhook URL to get notified about errors and exceptions while running this scraper.
```
slack = Slack(url = "<<ADD_YOUR_SLACK_WEBHOOK_URL_HERE>>")
```
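
A quick way to catch a half-edited config is to look for values that still carry the `<...>` placeholder markers. The sketch below is only an illustration: the function `unfilled_placeholders` is hypothetical and not part of this project, and the sample values are made up. Note also that `ids` is a Python dictionary, so each username or email must be unique; a repeated key silently overwrites the earlier entry.

```python
def unfilled_placeholders(ids: dict, usernames: list) -> list:
    """Return every config value that still looks like a <PLACEHOLDER>."""

    def is_placeholder(value: str) -> bool:
        return value.startswith("<") and value.endswith(">")

    leftovers = []
    for login, password in ids.items():
        leftovers += [v for v in (login, password) if is_placeholder(v)]
    leftovers += [u for u in usernames if is_placeholder(u)]
    return leftovers


# Hypothetical values: the login was filled in, the password was forgotten
example_ids = {"temp_account@example.com": "<PASSWORD_HERE>"}
example_usernames = ["example_user"]
print(unfilled_placeholders(example_ids, example_usernames))
```

An empty list means no placeholder text was left behind in `ids` or `usernames`.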

Congratulations! You are ready to go. Now run scraper.py. Ping me if you run into any errors.

Features

Profile Scraping
- Full name and biography (both encoded as UTF-8)
- Followers and following
- Number of public posts and owned media
- Whether the account is private, business, verified, has a channel, or joined recently
- Profile page ID
- Connected FB page
- External URL
Saves data to a unique CSV file in the output folder.
Checks for an existing CSV file and creates a new one if it doesn't exist.
Random sleep time (to create a little randomness).
Automatic login and logout (to switch ids every 8 hours).
Automatic browser screenshots in the ss_log/browser folder.
Slack webhook integration for error notifications.
Tor connectivity and public IP check.
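
Three of the features above (the CSV existence check, the random sleep, and the 8-hour account switch) can be sketched with the standard library alone. This is a rough illustration and not the code in scraper.py: the column names in `FIELDS`, the helper names, the file name, and the 30 to 90 second delay range are all assumptions.

```python
import csv
import random
from pathlib import Path

# Hypothetical columns; the real output may use different ones
FIELDS = ["timestamp", "username", "followers", "following"]


def append_row(csv_path: Path, row: dict) -> None:
    """Append one row, writing the header first if the file is new."""
    is_new = not csv_path.exists()
    csv_path.parent.mkdir(parents=True, exist_ok=True)
    with csv_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)


def random_delay(low: float = 30.0, high: float = 90.0) -> float:
    """Pick a random pause length in seconds between low and high."""
    return random.uniform(low, high)


def account_index(elapsed_hours: float, n_accounts: int,
                  shift_hours: float = 8.0) -> int:
    """Index of the account that should be in use after elapsed_hours."""
    return int(elapsed_hours // shift_hours) % n_accounts
```

A scraping loop could call `time.sleep(random_delay())` between requests, pass something like `Path("output") / "example_user.csv"` to `append_row`, and use `account_index` to decide when to log out and switch to the next entry in `ids`.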

License

The project license can be found here.

MIT © Rahul Meena
