Welcome to my Reddit Repo

Here you'll find PySpark code to:

read and process Reddit submission data
calculate the frequencies of words from various collections in select posts and subreddits

and Python code that uses t-SNE to plot the relationships between subreddits as described by word collection frequencies, as well as a PowerPoint presentation explaining t-SNE and its application to a few other domains.
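
As a rough illustration of the word collection frequency idea, here is a minimal plain-Python sketch. It is not the PySpark code in this repo. It assumes each input line is one JSON submission with `subreddit`, `title`, and `selftext` fields, and the function name `collection_frequencies` and the tiny word set `SAMPLE_COLLECTION` are hypothetical stand-ins for the real collections.

```python
import json
import re
from collections import Counter

# Hypothetical stand-in for a word collection (assumption, not the real dictionary).
SAMPLE_COLLECTION = {"always", "never", "completely", "nothing"}

# Simple tokenizer: runs of lowercase letters and apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


def collection_frequencies(lines, collection):
    """Return {subreddit: share of tokens that appear in `collection`}."""
    hits = Counter()    # tokens found in the collection, per subreddit
    totals = Counter()  # all tokens, per subreddit
    for line in lines:
        post = json.loads(line)
        # Field names are assumed; missing fields fall back to empty text.
        text = f"{post.get('title', '')} {post.get('selftext', '')}".lower()
        tokens = TOKEN_RE.findall(text)
        sub = post.get("subreddit", "unknown")
        totals[sub] += len(tokens)
        hits[sub] += sum(1 for token in tokens if token in collection)
    # Skip subreddits with no tokens to avoid dividing by zero.
    return {sub: hits[sub] / totals[sub] for sub in totals if totals[sub]}


# Two made-up submissions, one JSON object per line.
sample = [
    '{"subreddit": "a", "title": "I always lose", "selftext": "nothing works"}',
    '{"subreddit": "b", "title": "Nice day", "selftext": "went for a walk"}',
]
freqs = collection_frequencies(sample, SAMPLE_COLLECTION)
```

A vector of such frequencies per subreddit, one entry per word collection, is the kind of input that could then be handed to t-SNE for plotting.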

This represents my (Kendra Chalkley's) coursework and pet project from portions of my MS CS at Oregon Health and Science University's Center for Spoken Language Understanding. (It's important to mention this because I'm currently looking for my first job since completing this degree, and hopefully someone has made it this far as a result of my resume...)

Important citations for this work include:

files.pushshift.io, which hosts compressed collections of Reddit data, pre-harvested from the Reddit API
"In an Absolute State" by Al-Mosaiwi and Johnstone was the initial inspiration for the project. Their absolutist dictionary is one of the word collections used throughout the project. The others are from LIWC collections.
t-SNE visualization was one of the most successful aspects of the project, for a variety of timing reasons. I gave a presentation explaining the algorithm, which is available in the tsne subfolder of this repo, but it lacks narrative, which is instead available from the author's Google Tech Talk.

Presentation notes and notebook improvements are the next anticipated updates to this folder, and will be delayed only by a competing need to write cover letters.
