ayoola-babatunde/flairs

reddit flairs

User flairs on r/soccer


IN PROGRESS


Introduction

Questions to answer

Challenges

Introduction

I am doing data mining, exploratory analysis, and data visualization of user flairs on the subreddit r/soccer. Flairs are small badges that a user can display beside their username to show which team they support.

Flairs on r/soccer

Flairs are a way of telling other Reddit users where you stand, which gives context to your comment. They remove the need for each individual to type "I support x team, btw" or "As a x fan, ..."

Using PRAW, The Python Reddit API Wrapper, I extracted information about posts and comments and answered questions I was curious about.
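As a rough illustration of this extraction step, here is a minimal sketch rather than the code actually used in this repository. It assumes each comment is reduced to three fields (author, flair text, score). The function names `comment_rows` and `fetch_comments` are hypothetical, and the stand-in comments at the bottom are made-up values, not real data.

```python
from types import SimpleNamespace


def comment_rows(comments):
    """Turn comment-like objects into plain dicts with the fields used here."""
    rows = []
    for c in comments:
        rows.append({
            "author": str(c.author) if c.author else None,  # None for deleted accounts
            "flair": c.author_flair_text or None,           # None or "" means no flair
            "score": c.score,
        })
    return rows


def fetch_comments(submission_id, client_id, client_secret, user_agent):
    """Fetch all comments of one post. Needs network access and Reddit API credentials."""
    import praw  # imported lazily so comment_rows works without PRAW installed

    reddit = praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
    )
    submission = reddit.submission(id=submission_id)
    submission.comments.replace_more(limit=0)  # drop "load more comments" stubs
    return comment_rows(submission.comments.list())


# Offline demonstration with stand-in objects (made-up values, not real data).
fake_comments = [
    SimpleNamespace(author="user_a", author_flair_text="Arsenal", score=12),
    SimpleNamespace(author=None, author_flair_text="", score=3),
]
demo_rows = comment_rows(fake_comments)
```

Keeping the row-building step separate from the network call means the rest of the analysis can be developed and checked on saved or synthetic comments.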

The 1% rule of internet culture says that only about 1 percent of the users on a website create new content. r/soccer illustrates the gap: there are 1.8 million users subscribed, but an average post has only 500 upvotes and the top posts of the month have 35,000 upvotes. The results of this analysis therefore apply only to commenters on the subreddit, not to all subscribers.
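The figures quoted above can be turned into shares of the subscriber base with a few lines of arithmetic. This sketch treats upvotes as a rough proxy for active users, which is an assumption: one upvote is not necessarily one distinct subscriber.

```python
# Figures quoted in the text above.
subscribers = 1_800_000
average_post_upvotes = 500
top_post_upvotes = 35_000

# Upvotes as a fraction of subscribers (a rough proxy for participation).
average_share = average_post_upvotes / subscribers
top_share = top_post_upvotes / subscribers

print(f"average post: {average_share:.3%} of subscribers")  # average post: 0.028% of subscribers
print(f"top post: {top_share:.2%} of subscribers")          # top post: 1.94% of subscribers
```

Even the top posts of the month reach only about 2 percent of subscribers by this measure.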

Questions to answer

  • General exploration

    • Does everyone have a flair?

      I took a random 5 percent sample of the top 5000 posts of the month on r/soccer. For each sampled post, I recorded every comment and its author's flair, and found that 28 percent of users have not chosen a flair. That is, about 7 out of 10 users have chosen a team or country to represent them. The data is available in a CSV file here.
    • Which flairs get the most upvotes?
    • Which flairs are the most plentiful?
  • Hypothesis testing questions

    • Is there a cyclic pattern to flair frequency? Can we see teams grow and lose popularity?
    • Is there a 'siloing' effect where only similar users comment on a post?
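The general exploration questions above can be sketched in a few small functions. This is an illustrative sketch, not the analysis code behind the reported 28 percent. It assumes each comment is a dict with hypothetical keys `author`, `flair`, and `score`, and that the unflaired share is counted over distinct users. The demo rows are made-up values.

```python
import random
from collections import Counter, defaultdict


def sample_posts(post_ids, fraction=0.05, seed=0):
    """Pick a reproducible random sample of post ids (at least one)."""
    k = max(1, round(len(post_ids) * fraction))
    return random.Random(seed).sample(post_ids, k)


def unflaired_share(rows):
    """Share of distinct users who have no flair."""
    flair_by_user = {}
    for r in rows:
        if r["author"] is not None:  # skip deleted accounts
            flair_by_user[r["author"]] = r["flair"]
    if not flair_by_user:
        return 0.0
    unflaired = sum(1 for flair in flair_by_user.values() if not flair)
    return unflaired / len(flair_by_user)


def flair_summary(rows):
    """(flair, comment count, mean score) per flair, most frequent first."""
    counts = Counter()
    totals = defaultdict(int)
    for r in rows:
        if r["flair"]:
            counts[r["flair"]] += 1
            totals[r["flair"]] += r["score"]
    return [(flair, n, totals[flair] / n) for flair, n in counts.most_common()]


# Made-up rows for demonstration only.
demo = [
    {"author": "a", "flair": "Arsenal", "score": 10},
    {"author": "b", "flair": None, "score": 2},
    {"author": "c", "flair": "Arsenal", "score": 4},
    {"author": "d", "flair": "Celtic", "score": 6},
]
sampled = sample_posts(list(range(5000)))  # 5 percent of 5000 posts
share = unflaired_share(demo)
summary = flair_summary(demo)
```

Fixing the seed keeps the sample reproducible, so the same posts are analyzed each time the script is run.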

Challenges

  • Producing useful data visualizations and good writing, instead of just Jupyter notebooks
  • Using SQL to manage the database
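One possible starting point for the SQL challenge is a single comments table in SQLite, using Python's built-in `sqlite3` module. The table and column names below are assumptions made for illustration, not the schema of this project, and the inserted rows are made-up values.

```python
import sqlite3

# In-memory database for the sketch; a file path would persist the data.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE comments (
        comment_id TEXT PRIMARY KEY,
        post_id    TEXT NOT NULL,
        author     TEXT,              -- NULL for deleted accounts
        flair      TEXT,              -- NULL when the user has no flair
        score      INTEGER NOT NULL
    )
    """
)

# Made-up rows for demonstration only.
demo_comments = [
    ("c1", "p1", "user_a", "Arsenal", 12),
    ("c2", "p1", "user_b", None, 3),
    ("c3", "p2", "user_c", "Arsenal", 4),
    ("c4", "p2", "user_d", "Celtic", 6),
]
conn.executemany("INSERT INTO comments VALUES (?, ?, ?, ?, ?)", demo_comments)

# Comment count and mean score per flair, most frequent first.
flair_counts = conn.execute(
    """
    SELECT flair, COUNT(*) AS n, AVG(score) AS mean_score
    FROM comments
    WHERE flair IS NOT NULL
    GROUP BY flair
    ORDER BY n DESC, flair
    """
).fetchall()
```

With the data in a table, questions such as "which flairs are the most plentiful" become short `GROUP BY` queries instead of notebook loops.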
