manikandan-ravikiran/complex-assignments

Complex Assignments

Binder

Introduction

This repository contains the code for our paper "Finding Black Cat in a Coal Cellar - Keyphrase Extraction & Keyphrase-Rubric Relationship Classification from Complex Assignments", an empirical study of keyphrase extraction and generic/specific keyphrase-rubric relationship classification.

Dependencies

  • Python3 (Tested on python 3.7)
  • Jupyter Notebook
  • Scikit-learn
  • Transformers(Hugging Face)
  • pke
  • eli5

Installation

First, clone the repository:

git clone https://github.com/manikandan-ravikiran/complex-assignments.git

Then install the requirements:

pip install -r requirements.txt
pip install git+https://github.com/boudinfl/pke.git
python -m nltk.downloader stopwords
python -m nltk.downloader universal_tagset
pip install spacy
python -m spacy download en # download the English model
pip install en-core-web-sm
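
After installing, a quick import check can confirm that the environment is ready before opening the notebooks. The following is a minimal sketch, not part of the repository: the module names in `REQUIRED_MODULES` are assumptions based on the usual import names of the packages listed under Dependencies, and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Assumed import names for the packages listed under Dependencies
# (for example, Scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ["sklearn", "transformers", "pke", "eli5", "nltk", "spacy"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that cannot be found by this interpreter."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

If any module is reported as missing, rerunning the matching `pip install` line above is the likely fix.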

NOTE: The datasets are already processed; features have been extracted and pickled for replication purposes. Due to privacy restrictions, we do not release any datasets in raw format. If you need the data for research purposes, please send an email to mravikiran3@gatech.edu with details of your research.
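
Since the features ship as pickles, it can help to inspect a file before running a notebook on it. The sketch below assumes only that each file holds a single pickled object; the file name in the usage comment is hypothetical, and `load_features` and `describe` are illustrative helpers rather than functions from this repository. Only unpickle files that come from a source you trust.

```python
import pickle
from pathlib import Path


def load_features(path):
    """Load one pickled object from `path` and return it unchanged."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def describe(obj):
    """Summarize an object: its type name plus shape or length when available."""
    shape = getattr(obj, "shape", None)
    if shape is not None:
        return f"{type(obj).__name__} with shape {shape}"
    if hasattr(obj, "__len__"):
        return f"{type(obj).__name__} with {len(obj)} items"
    return type(obj).__name__


# Usage (hypothetical file name, replace with a pickle from the code folders):
# features = load_features("features.pkl")
# print(describe(features))
```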

The code is provided as IPython notebooks, which you can launch directly in Binder by clicking the Binder build icon. (Please note that, because the experiments require a GPU and the datasets are private, Binder can run only a few of the RQ2.2 experiments. For a full run, use a separate GPU machine.)

Code Organization

Result Reproducibility & Execution

Results from Paper | Code Folder
Table 4 (KEA/WINGNUS) | Execute this
Table 4 (KPMINER/YAKE) | Execute this
Table 4 (Ranking) | Execute this
Table 4 (KEA) | Execute this
Table 4 (Multipartite) | Execute this
Table 7 (K-Means) | Execute this
Table 8 (Agglomerative) | Execute this
Table 9 (Spectral) | Execute this
Table 10 (Latent Dirichlet Allocation) | Execute this
Table 12 (BOW/TF-IDF) | Execute this
Table 12 (Language Models) | Execute this
Table 14 (Interpretability - BERT) | Execute this
Table 15 (Interpretability - SVM+TFIDF) | Execute this
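
For readers unfamiliar with the SVM+TFIDF setup named in the table, the following is a minimal sketch of a TF-IDF plus linear SVM text classifier in Scikit-learn. It is an illustration under stated assumptions, not the notebooks' actual code: the phrases and the "generic"/"specific" labels below are invented toy examples, and the n-gram range is an arbitrary choice.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy phrases and labels invented for illustration only;
# they are not taken from the paper's datasets.
texts = [
    "related work",
    "project timeline",
    "literature review",
    "intelligent tutoring system for algebra",
    "peer feedback in online writing courses",
    "automatic grading of programming assignments",
]
labels = ["generic", "generic", "generic", "specific", "specific", "specific"]

# TF-IDF over unigrams and bigrams feeding a linear SVM.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(texts, labels)

# Predict the label of an unseen phrase.
predictions = model.predict(["review of related literature"])
print(predictions[0])
```

With six training phrases the prediction itself carries no weight; the point is only the shape of the pipeline, which the notebooks apply to the pickled features.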

Cite

If you find this repo useful in your research, please consider citing the following papers:

@article{Ravikiran2020FindingBC,
  title={Finding Black Cat in a Coal Cellar - Keyphrase Extraction & Keyphrase-Rubric Relationship Classification from Complex Assignments},
  author={Manikandan Ravikiran},
  journal={ArXiv},
  year={2020},
  volume={abs/2004.01549}
}

@article{Ravikiran2020KeyPC,
  title={Key Phrase Classification in Complex Assignments},
  author={Manikandan Ravikiran},
  journal={ArXiv},
  year={2020},
  volume={abs/2003.07019}
}