Welcome to my repository! This project is a portfolio of my work on data scraping from Reddit using Python.
It involves scraping data from Reddit with the Python Reddit API Wrapper (PRAW), analyzing the data, and visualizing the results. The repository contains two Jupyter notebooks with all of the code, a PDF and a PowerPoint presentation of the findings, and a folder with the generated images and figures.
This project uses the following Python libraries:
praw: For interacting with the Reddit API.
pandas: For data manipulation and analysis.
datetime: For working with dates and times.
nltk: For natural language processing tasks.
gensim: For topic modelling and document similarity analysis.
matplotlib: For creating static, animated, and interactive visualizations in Python.
pyLDAvis: For interactive topic model visualization.
wordcloud: For creating word cloud images.
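Before opening the notebooks, it can help to confirm that the libraries listed above are importable in your environment. The snippet below is a minimal sketch, not part of the notebooks: the helper name `missing_libraries` is hypothetical, and the import names are assumed to match the library names in the list (`datetime` ships with Python, so it should always be found).

```python
import importlib.util

# Import names assumed to match the libraries listed in this README.
PROJECT_LIBRARIES = [
    "praw", "pandas", "datetime", "nltk",
    "gensim", "matplotlib", "pyLDAvis", "wordcloud",
]


def missing_libraries(names):
    """Return the names from `names` that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_libraries(PROJECT_LIBRARIES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All listed libraries are available.")
```

Anything reported as not installed can then be added with your usual package manager before running the notebooks.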
To run the code, you can either extract it into a separate Python file or run the two notebooks directly in Jupyter Notebook.
You'll also need to set up PRAW with your own Reddit API credentials. Enter them in each notebook at the point where the PRAW instance is initialized.
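As a rough illustration of the credentials step, here is one way a PRAW instance could be created without hard-coding secrets in a notebook. This is a sketch under assumptions, not the code used in the notebooks: the environment variable names (`REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, `REDDIT_USER_AGENT`) and both helper functions are hypothetical. Only `praw.Reddit` and its `client_id`, `client_secret`, and `user_agent` arguments come from PRAW itself.

```python
import os


def load_reddit_credentials(env=os.environ):
    """Collect PRAW settings from environment variables (variable names are hypothetical)."""
    required = {
        "client_id": "REDDIT_CLIENT_ID",
        "client_secret": "REDDIT_CLIENT_SECRET",
        "user_agent": "REDDIT_USER_AGENT",
    }
    # Fail early with a readable message if any credential is unset or empty.
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in required.items()}


def make_reddit_client(env=os.environ):
    """Build a read-only PRAW client from the credentials above."""
    import praw  # imported here so the helper above works even without PRAW installed

    return praw.Reddit(**load_reddit_credentials(env))
```

With the three variables set, `reddit = make_reddit_client()` gives a read-only client, and something like `reddit.subreddit("python").hot(limit=5)` can be iterated to check that the credentials work. Pasting the values directly into the notebook cell also works, but keeping them in the environment avoids committing them to the repository by accident.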
If you have any questions or feedback, feel free to open an issue or submit a pull request.
Enjoy exploring the repository!