This repository contains the code used for the paper "Linguistic Features for Readability Assessment" (Deutsch, Jasbi, and Shieber 2020). The "han" folder contains code from the [Hierarchical-attention-networks-pytorch repository](https://github.com/uvipen/Hierarchical-attention-networks-pytorch). The "bert" folder is based on code from Hugging Face, as modified by the authors of "Supervised and unsupervised neural approaches to text readability" (Martinc, Pollak, and Robnik-Šikonja 2020). The classes in "runner.py" are forked from code by Brian Yu.
The code used for generating the linguistic features themselves can be found at https://bitbucket.org/tovly/complexity-features-ling-features/src/master/ and is based on code by Sowmya Vajjala Balakrishna.
The corpora used are not publicly available and are therefore not included in this repository. They can be obtained by contacting the original corpus creators.