This repository is the official implementation of my paper BigGreen at SemEval-2021 Task 1: Lexical Complexity Prediction with Assembly Models.
Lexical complexity prediction is a fascinating problem on which anyone can practice applying modern machine learning techniques…that is, given a solid feature set and computational resources. In this documentation, you’ll learn how to: extract the diverse set of features I used in competition to attain near state-of-the-art results; train a model to make predictions from these features; fine-tune an alternative deep learning model that sacrifices interpretability (…or does it? Section 6.3 of the paper investigates this) to produce predictions from BERT-based embeddings; and finally, use a simple assembly technique to harness the predictive capabilities of both models.
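To make the assembly step concrete, here is a minimal sketch of one common way to combine two regressors' outputs: a convex (weighted-average) blend. The function name, the weight, and the example scores are all hypothetical illustrations, not the exact scheme from the paper.

```python
import numpy as np

def assemble_predictions(pred_a, pred_b, weight_a=0.5):
    """Blend two models' complexity scores with a convex weight.

    `pred_a` and `pred_b` are hypothetical arrays of per-token
    complexity scores in [0, 1]; `weight_a` controls how much the
    first model contributes. The paper's actual assembly technique
    may differ from this simple average.
    """
    pred_a = np.asarray(pred_a, dtype=float)
    pred_b = np.asarray(pred_b, dtype=float)
    return weight_a * pred_a + (1.0 - weight_a) * pred_b

# Illustrative scores from a feature-based model and a BERT-based model.
feature_preds = [0.20, 0.60, 0.35]
bert_preds = [0.30, 0.50, 0.45]
blended = assemble_predictions(feature_preds, bert_preds)  # 50/50 blend
```

In practice the blend weight would be tuned on a held-out validation split rather than fixed at 0.5.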
The importance of proper computational resources for speeding up modeling cannot be overstated, and I learned this firsthand over the course of this research. It took place during the peak of the COVID-19 pandemic, when I didn’t have as much funding or access to lab resources as would’ve been ideal. Nonetheless, we adapt! I distributed my work across a local machine (referred to as `macbook` from here on out), a Google Colab environment (`colab`), and Dartmouth Research Computing’s Discovery Cluster (`discovery`). Now, I’m assuming you aren’t taking this convoluted route of navigating three different platforms, and are instead working largely on a single platform (ideally with multi-GPU access). I’ll therefore explain my code as succinctly as possible so that you can leverage it within your own setup.