Detecting Sarcasm in Reddit Comments

This was a small project I worked on, with Rubini and Vikram, during my 2020 Summer Internship at Carnegie Mellon University.

Aim

The aim is to detect sarcasm in comments found on Reddit, using the Sarcasm on Reddit dataset available from Kaggle. Through this, we also aim to identify features that are indicative of sarcasm, and explain our models' predictions.

Methodology and Results

We experimented with TF-IDF and BERT sentence embeddings to extract features from the text. We tried various combinations of features, drawing on the comment itself, its characteristics, and its parent comment (which provides context). We also tried PCA for dimensionality reduction.
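To illustrate the TF-IDF weighting mentioned above, here is a minimal standard-library sketch. It is not the code from the notebooks: the whitespace tokenizer, the smoothed IDF formula, the helper comment_with_context (which simply joins the parent and the comment into one string), and the example texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def tfidf_vectors(documents):
    """Return (vocabulary, vectors), with one TF-IDF vector per document.

    TF is the raw term count. IDF is smoothed (an assumption of this sketch):
    idf(t) = ln((1 + N) / (1 + df(t))) + 1
    where N is the number of documents and df(t) the number containing t.
    """
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(documents)
    # Count each term once per document to get document frequencies.
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocabulary}
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[t] * idf[t] for t in vocabulary])
    return vocabulary, vectors


def comment_with_context(comment, parent):
    """Hypothetical helper: join parent and comment so both contribute tokens."""
    return parent + " " + comment


# Made-up (comment, parent) pairs, not taken from the dataset.
pairs = [
    ("oh great another monday", "the week starts tomorrow"),
    ("thanks that fixed it", "try restarting the server"),
]
texts = [comment_with_context(comment, parent) for comment, parent in pairs]
vocabulary, vectors = tfidf_vectors(texts)
```

Terms that occur in every document (here, "the") receive the lowest weight, while terms unique to one comment are weighted more heavily. The resulting vectors could then be passed to a classifier or reduced with PCA.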

The classifiers we used include the Random Forest Classifier, Gradient Boosting Classifier and the Multi-Layer Perceptron, among others.

Our best-performing model was a Random Forest Classifier trained on TF-IDF features extracted from the raw text (comment and parent), together with the comment's characteristics, such as the subreddit and author. It obtained an F1-score of 0.66 on the validation set. The models we built treated the comment's characteristics as very important features.
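For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall. The sketch below computes it for a binary task, assuming that label 1 means "sarcastic". The labels are made up for illustration and are not results from this project; the averaging scheme used in the notebooks is not stated here.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 1 = sarcastic, 0 = not sarcastic.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
score = f1_score(y_true, y_pred)
```

In this toy example there are 3 true positives, 1 false positive and 1 false negative, so precision and recall are both 0.75 and the F1-score is 0.75.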

Code

The code is available as three Jupyter Notebook files: SarcasmEDAPreprocessing.ipynb, SarcasmTFIDF.ipynb and SarcasmBERT.ipynb. To run it, start a Jupyter Notebook server and open the notebooks. Make sure the dependencies are installed first by executing this command in the terminal:

pip install -r requirements.txt

Presentation

Our presentation (Sarcasm.pdf) is also available in this repository and provides more information.

