Grammatical Profiling for Semantic Change Detection

This is the repository for the CoNLL-2021 paper "Grammatical Profiling for Semantic Change Detection" by Mario Giulianelli, Andrey Kutuzov, and Lidia Pivovarova.

The full_profiling_pipeline.sh script can be used right out of the box.

Run the script as

./full_profiling_pipeline.sh TARGETS CORPUS0 CORPUS1

TARGETS is a file listing the target words, one per line.
CORPUS0 and CORPUS1 are your earlier and later corpora, tagged and parsed into the CoNLL-U format. You can use any tagger and parser, for example UDPipe or Stanza.
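
Before running the pipeline, it can help to check that both kinds of input are well formed. The sketch below is not part of the repository; the function names `read_targets` and `check_conllu` are hypothetical. It assumes only the standard CoNLL-U layout: ten tab-separated columns per token line, `#` comment lines, and blank lines between sentences.

```python
import io


def read_targets(handle):
    """Return the target words, one per non-empty line."""
    return [line.strip() for line in handle if line.strip()]


def check_conllu(handle):
    """Return the numbers of token lines that do not have 10 columns."""
    bad = []
    for number, line in enumerate(handle, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # sentence boundary or comment line
        if len(line.split("\t")) != 10:
            bad.append(number)
    return bad


# Toy inputs held in memory; real use would pass open file handles instead.
targets = read_targets(io.StringIO("plane\ngraft\n"))
sample = (
    "# text = Planes land.\n"
    "1\tPlanes\tplane\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tland\tland\tVERB\tVBP\tMood=Ind\t0\troot\t_\t_\n"
    "\n"
)
problems = check_conllu(io.StringIO(sample))
print(targets, problems)
```

An empty list from `check_conllu` means every token line has the expected number of columns; it does not validate the tags themselves.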

The script processes your CoNLL-U files and generates morphosyntax-based predictions of graded and binary semantic change for each target word. The aggregated change scores predicted by our algorithm are written to the output directory. The script already uses the best hyperparameters from the paper.

Do not forget to first install dependencies:

pip3 install -r requirements.txt

The output/tsv directory contains separate predictions from morphological and syntactic features.

More details

stanza_process.py produces parsed CoNLL-U files from raw texts using Stanza;
collect_ling_stats.py reads a CoNLL-U file and dumps frequencies for morphological and syntactic properties of the target words into JSON files;
compare_ling.py produces TSV files with binary and graded change predictions from two JSON files;
merge.py reads two JSON files (e.g. with morphological and syntactic properties) and produces combined predictions.
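
To make the idea behind these steps concrete, here is a minimal sketch of counting morphological features for target words in two periods and comparing the resulting profiles. It does not reproduce the repository's scripts: `token_line`, `morph_profile`, and `cosine_distance` are hypothetical names, matching targets on the LEMMA column is an assumption, and cosine distance is used only as one plausible way to compare two frequency profiles. Check compare_ling.py for the measure the authors actually use.

```python
import json
import math
from collections import Counter


def token_line(lemma, feats):
    """Build a toy CoNLL-U token line (only LEMMA and FEATS matter here)."""
    return "\t".join(["1", lemma, lemma, "NOUN", "_", feats, "0", "root", "_", "_"])


def morph_profile(conllu_lines, targets):
    """Count FEATS key=value pairs for each target lemma."""
    profiles = {target: Counter() for target in targets}
    for line in conllu_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip sentence boundaries and comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # skip malformed lines
        lemma, feats = cols[2], cols[5]
        if lemma in profiles and feats != "_":
            profiles[lemma].update(feats.split("|"))
    return profiles


def cosine_distance(a, b):
    """Cosine distance between two feature-count dictionaries."""
    dot = sum(a.get(key, 0) * b.get(key, 0) for key in set(a) | set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return float("nan")  # undefined when a profile is empty
    return 1.0 - dot / (norm_a * norm_b)


# Toy corpora: "plane" is mostly singular earlier and mostly plural later.
corpus0 = [token_line("plane", "Number=Sing")] * 3 + [token_line("plane", "Number=Plur")]
corpus1 = [token_line("plane", "Number=Sing")] + [token_line("plane", "Number=Plur")] * 3

profile0 = morph_profile(corpus0, ["plane"])
profile1 = morph_profile(corpus1, ["plane"])

# Counters serialise like plain dictionaries, so profiles can be stored as JSON.
stored = json.loads(json.dumps(profile0))

distance = cosine_distance(profile0["plane"], profile1["plane"])
print(stored, round(distance, 3))
```

A larger distance means the word's distribution over grammatical features differs more between the two periods, which is the signal the profiling approach relies on.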

Use eval.py to evaluate the resulting TSVs against gold scores.
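
This README does not state which metrics eval.py reports. As an illustration only, the sketch below assumes that graded predictions are scored with Spearman rank correlation and binary predictions with accuracy, which is a common setup for semantic change benchmarks. All function names and the toy scores are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; NaN if either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")
    return cov / (sd_x * sd_y)


def spearman(predicted, gold):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(average_ranks(predicted), average_ranks(gold))


def accuracy(predicted, gold):
    """Share of binary predictions that match the gold labels."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Toy graded scores per word (invented for this example).
gold_graded = {"plane": 0.9, "graft": 0.6, "tree": 0.1, "bit": 0.3}
pred_graded = {"plane": 0.4, "graft": 0.5, "tree": 0.05, "bit": 0.2}
words = sorted(gold_graded)

rho = spearman([pred_graded[w] for w in words], [gold_graded[w] for w in words])
acc = accuracy([1, 0, 1, 0], [1, 1, 1, 0])
print(round(rho, 3), acc)
```

Aligning predictions and gold scores by word before ranking, as done with `words` above, avoids silently comparing scores for different targets.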

The raw profiles (feature counts) from the paper are in the features directory.
