# twitter-scraping
## Introduction

Today, data is scattered everywhere in the world, especially across social media platforms such as Facebook, Instagram, YouTube, and Twitter. YouTube and Instagram hold mostly pictures and videos, while Twitter carries a large amount of text, which makes it a good source of real facts. This project scrapes tweet data from Twitter, including the date, id, url, tweet content, user, reply count, retweet count, language, source, and like count.

## snscrape

This project scrapes Twitter using the snscrape library in Python. snscrape is a scraper for social networking services (SNS); for Twitter it can scrape user profiles, hashtags, and search results, with filters such as date ranges and likes.

## Requirements

snscrape requires Python 3.8 or higher. The Python package dependencies are installed automatically when you install snscrape:

    pip install snscrape

Install the necessary libraries:

- snscrape: to scrape Twitter data.
- pandas: to create dataframes.
- pymongo: to interact with MongoDB.
- Streamlit: to create a GUI.
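All four are available on PyPI and can be installed in one command:

    pip install snscrape pandas pymongo streamlit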

Create a function to scrape Twitter data using snscrape. The function should take the following inputs:

- Keyword or hashtag to search for.
- Start and end date range.
- Number of tweets to scrape.

The function should return a list of dictionaries, where each dictionary represents a tweet and contains the following fields:

- Date
- ID
- URL
- Tweet content
- User
- Reply count
- Retweet count
- Language
- Source
- Like count

A sketch of such a function is shown below.
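Here is a minimal sketch of the scraping step using snscrape's Twitter search module. The function name scrape_tweets is illustrative, and the tweet attribute names match snscrape releases from around the time of writing (newer releases rename content to rawContent):

    import snscrape.modules.twitter as sntwitter

    def scrape_tweets(keyword, start_date, end_date, limit):
        """Scrape up to `limit` tweets matching `keyword` between two dates."""
        # since:/until: follow Twitter's search syntax and expect YYYY-MM-DD dates.
        query = f"{keyword} since:{start_date} until:{end_date}"
        tweets = []
        for i, tweet in enumerate(sntwitter.TwitterSearchScraper(query).get_items()):
            if i >= limit:
                break
            tweets.append({
                "date": str(tweet.date),
                "id": tweet.id,
                "url": tweet.url,
                "content": tweet.content,  # renamed to rawContent in newer snscrape releases
                "user": tweet.user.username,
                "reply_count": tweet.replyCount,
                "retweet_count": tweet.retweetCount,
                "language": tweet.lang,
                "source": tweet.sourceLabel,
                "like_count": tweet.likeCount,
            })
        return tweets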

Create a function to insert the scraped data into MongoDB. The function should take the following inputs:

- Collection name
- List of scraped tweets

The function should insert each tweet as a document into the specified collection, as in the sketch below.
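A minimal sketch using pymongo, assuming a local MongoDB instance and a database named twitter_scraping (both assumptions; adjust the connection string and database name to your setup):

    from pymongo import MongoClient

    def insert_tweets(collection_name, tweets):
        """Insert each scraped tweet as one document into the named collection."""
        client = MongoClient("mongodb://localhost:27017")  # assumed local MongoDB instance
        collection = client["twitter_scraping"][collection_name]  # assumed database name
        if tweets:
            # Copy each dict so insert_many's _id injection doesn't mutate the caller's list.
            collection.insert_many([dict(t) for t in tweets])
        client.close()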

Create a function to download the scraped data in CSV and JSON format. The function should take the following inputs:

- Collection name
- Output file path

The function should retrieve all the documents from the specified collection and write them to the output file in the desired format; a sketch follows.
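One way to implement this with pandas, reusing the assumed MongoDB setup from the previous sketch; the function name and fmt parameter are illustrative. The projection {"_id": 0} drops MongoDB's ObjectId, which is not JSON-serializable:

    import pandas as pd
    from pymongo import MongoClient

    def download_collection(collection_name, output_path, fmt="csv"):
        """Fetch every document from the collection and write it to output_path."""
        client = MongoClient("mongodb://localhost:27017")  # assumed local MongoDB instance
        # Exclude the _id field so the output contains only the tweet fields.
        docs = list(client["twitter_scraping"][collection_name].find({}, {"_id": 0}))
        client.close()
        df = pd.DataFrame(docs)
        if fmt == "csv":
            df.to_csv(output_path, index=False)
        else:
            df.to_json(output_path, orient="records", indent=2)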

Create a Streamlit app with the following features:

- Text input for the keyword or hashtag to search for.
- Date range input using Streamlit's st.date_input widget.
- Number input for the number of tweets to scrape.
- Button to start the scraping process.
- Table to display the scraped data.
- Button to upload the data to MongoDB.
- Buttons to download the data in CSV and JSON format.

When the user clicks the "Scrape" button, call the scraping function with the specified inputs and display the scraped data in a table.

When the user clicks on the "Upload" button, call the insert function with the collection name and scraped data.

When the user clicks on the "Download CSV" or "Download JSON" button, call the download function with the collection name and output file path. Putting it all together, a minimal app might look like the sketch below.
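This sketch reuses scrape_tweets, insert_tweets, and download_collection from the earlier sketches; the collection name "scraped_tweets", the default dates, and the output file names are all assumptions. Because Streamlit reruns the whole script on every button click, the scraped tweets are kept in st.session_state:

    import datetime
    import pandas as pd
    import streamlit as st

    st.title("Twitter Scraping")

    keyword = st.text_input("Keyword or hashtag to search for")
    start_date = st.date_input("Start date", datetime.date(2023, 1, 1))  # assumed default
    end_date = st.date_input("End date", datetime.date(2023, 1, 31))     # assumed default
    limit = st.number_input("Number of tweets to scrape", min_value=1, value=100, step=1)

    if st.button("Scrape"):
        # Persist results across Streamlit reruns triggered by later button clicks.
        st.session_state["tweets"] = scrape_tweets(keyword, start_date, end_date, int(limit))

    tweets = st.session_state.get("tweets", [])
    if tweets:
        st.dataframe(pd.DataFrame(tweets))
        if st.button("Upload"):
            insert_tweets("scraped_tweets", tweets)  # "scraped_tweets" is an assumed name
            st.success("Uploaded to MongoDB")
        if st.button("Download CSV"):
            download_collection("scraped_tweets", "tweets.csv", fmt="csv")
            st.success("Wrote tweets.csv")
        if st.button("Download JSON"):
            download_collection("scraped_tweets", "tweets.json", fmt="json")
            st.success("Wrote tweets.json")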
