This project is a portfolio of my work on data scraping from Reddit using Python. Read the README file to learn more.

AbdulDevHub/Reddit-Data-Scrapping

Reddit Data Scraping With Python

Welcome to my repository! This project is a portfolio of my work on data scraping from Reddit using Python.

Project Overview

This project scrapes data from Reddit with the Python Reddit API Wrapper (PRAW), analyzes it, and visualizes the results. The repository contains two Jupyter notebooks with all of the code, a PDF and a PowerPoint presentation of the findings, and a folder of the images and figures produced.
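
As a rough illustration of the scrape-then-analyze flow described above, the sketch below flattens submission-like objects into plain rows. It is not taken from the notebooks: `submission_to_row` is a hypothetical helper, and the `SimpleNamespace` object is only a stand-in for a PRAW submission so the example runs offline. The attributes it reads (`title`, `score`, `num_comments`, `created_utc`) are ones that PRAW submissions expose.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def submission_to_row(submission):
    """Flatten one submission-like object into a plain dict (hypothetical helper)."""
    return {
        "title": submission.title,
        "score": submission.score,
        "num_comments": submission.num_comments,
        # created_utc is a Unix timestamp in seconds; convert it to an aware datetime.
        "created": datetime.fromtimestamp(submission.created_utc, tz=timezone.utc),
    }


# Stand-in for a PRAW submission so the sketch runs without network access.
fake_post = SimpleNamespace(title="Example post", score=42, num_comments=7, created_utc=0)
rows = [submission_to_row(fake_post)]
print(rows[0]["created"].isoformat())
```

A list of such rows can then be handed to `pandas.DataFrame(rows)` for analysis.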

Dependencies

This project uses the following Python libraries:

  • praw: For interacting with the Reddit API.
  • pandas: For data manipulation and analysis.
  • datetime: For working with dates and times (part of the Python standard library, so no installation is needed).
  • nltk: For natural language processing tasks.
  • gensim: For topic modelling and document similarity analysis.
  • matplotlib: For creating static, animated, and interactive visualizations in Python.
  • pyLDAvis: For interactive topic model visualization.
  • wordcloud: For creating word cloud images.
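
Before opening the notebooks, it can help to confirm that these libraries are available in the current environment. The snippet below is a minimal sketch of one way to check; `missing_modules` and `REQUIRED_MODULES` are names introduced here for illustration and do not come from the repository.

```python
from importlib.util import find_spec

# Top-level import names for the third-party libraries listed above.
# datetime is omitted because it ships with Python.
REQUIRED_MODULES = ["praw", "pandas", "nltk", "gensim", "matplotlib", "pyLDAvis", "wordcloud"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All listed libraries are importable.")
```

Anything reported as missing can usually be installed with pip under the same name.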

Getting Started

To run the code, either open the two notebooks directly in Jupyter Notebook or extract the code into a separate Python file.

You will also need your own Reddit API credentials. Enter them in the notebooks at the point where the PRAW instance is initialized.
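
One way to supply credentials without hard-coding them is to read them from environment variables. The sketch below assumes that approach; it is not how the notebooks are necessarily written. The environment variable names and the helpers `load_reddit_credentials` and `make_reddit` are hypothetical, while `client_id`, `client_secret`, and `user_agent` are the keyword arguments `praw.Reddit` accepts for read-only access.

```python
import os


def load_reddit_credentials(env=None):
    """Collect PRAW keyword arguments from environment variables.

    The variable names below are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    required = {
        "client_id": "REDDIT_CLIENT_ID",
        "client_secret": "REDDIT_CLIENT_SECRET",
        "user_agent": "REDDIT_USER_AGENT",
    }
    # Treat unset and empty variables the same way.
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in required.items()}


def make_reddit(env=None):
    """Create a read-only praw.Reddit instance from the loaded credentials."""
    import praw  # imported lazily so the helper above works without praw installed

    return praw.Reddit(**load_reddit_credentials(env))
```

With the three variables set, `reddit = make_reddit()` would give an instance that can be used for calls such as `reddit.subreddit("python").hot(limit=5)`.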

Contact

If you have any questions or feedback, feel free to open an issue or submit a pull request.

Enjoy exploring the repository!

