Code for the experiments in: "More than just Frequency? Demasking Unsupervised Hypernymy Prediction Methods." Thomas Bott, Dominik Schlechtweg and Sabine Schulte im Walde. Accepted to appear in Findings of ACL 2021.
- pickle (to save and load files)
- docopt (for command line interface)
- tqdm (for progress bars)
- sklearn (for classification)
- graphviz (for drawing trees)
- numpy
- tabulate (for creating tables)
The packages listed above can be installed by running pip install -r requirements.txt
in the terminal. pickle is part of the Python standard library and does not need to be installed separately.
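After installation, it can be useful to confirm that every dependency is importable before starting a long run. The following is a minimal sketch, not part of the repository: the helper name missing_modules and the constant REQUIRED_MODULES are hypothetical, and the import names are simply taken from the dependency list above.

```python
from importlib.util import find_spec

# Import names taken from the dependency list above (assumption: each package
# is imported under the name shown there). pickle ships with Python itself.
REQUIRED_MODULES = [
    "pickle", "docopt", "tqdm", "sklearn", "graphviz", "numpy", "tabulate",
]


def missing_modules(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

If any module is reported as missing, rerunning the pip command above is the first thing to try.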
The scripts should be run with Python 3. Each script contains a usage pattern that shows how to call it. Further information about the arguments and options can be obtained with the -h (--help) option. For successful usage, it is recommended to execute the scripts in the following order:
dsm_creation.py
dsm_combine.py
rowSums.py
read_dataset.py
filter_dataset.py
freqDiff_freqBias.py
plmi.py
weedsPrec.py
invCL.py
slqsRow.py
slqs.py
evaluation.py
classification.py
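Since each script documents its own arguments, one way to get an overview of the pipeline is to print the help text of every script in the recommended order. The sketch below is an illustration only: the function names help_commands and show_usage are hypothetical, it assumes the scripts are in the current working directory, and it relies on nothing but the -h option mentioned above. It does not run the experiments themselves, because the required arguments differ per script.

```python
import subprocess
import sys

# Script names in the recommended order, copied from the list above.
PIPELINE = [
    "dsm_creation.py",
    "dsm_combine.py",
    "rowSums.py",
    "read_dataset.py",
    "filter_dataset.py",
    "freqDiff_freqBias.py",
    "plmi.py",
    "weedsPrec.py",
    "invCL.py",
    "slqsRow.py",
    "slqs.py",
    "evaluation.py",
    "classification.py",
]


def help_commands(scripts=PIPELINE, python=sys.executable):
    """Build one `<python> <script> -h` command per script, preserving order."""
    return [[python, script, "-h"] for script in scripts]


def show_usage(scripts=PIPELINE):
    """Print each script's usage pattern (assumes scripts are in the cwd)."""
    for command in help_commands(scripts):
        # check=False: keep going even if one script is missing or fails.
        subprocess.run(command, check=False)
```

Calling show_usage() from the repository root prints the usage patterns one after another; the actual runs still have to be started by hand with the arguments each script expects.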
https://wacky.sslmit.unibo.it/doku.php
Gertrud Faaß and Kerstin Eckart. SdeWaC – A Corpus of Parsable Sentences from the Web. In Proceedings of the International Conference of the German Society for Computational Linguistics and Language Technology, pages 61–68, Darmstadt, Germany, 2013.
Adriano Ferraresi, Eros Zanchetta, Marco Baroni, and Silvia Bernardini. Introducing and evaluating ukWaC, a very large web-derived corpus of English. In Proceedings of the 4th Web as Corpus Workshop (WAC-4) Can we beat Google, pages 47–54, 2008.
https://uni-tuebingen.de/en/142806
Birgit Hamp and Helmut Feldweg. GermaNet - a lexical-semantic net for German. In Automatic information extraction and building of lexical semantic resources for NLP applications, pages 9–15, 1997.
Christiane Fellbaum. WordNet. The encyclopedia of applied linguistics, 1998.
Can be obtained from: https://github.com/vered1986/UnsupervisedHypernymy
Marco Baroni and Alessandro Lenci. How we BLESSed distributional semantic evaluation. In Proceedings of the GEMS 2011 Workshop on GEometrical Models of Natural Language Semantics, pages 1–10. Association for Computational Linguistics, 2011.
Enrico Santus, Frances Yung, Alessandro Lenci, and Chu-Ren Huang. EVALution 1.0: an evolving semantic dataset for training and evaluation of distributional semantic models. In Proceedings of the 4th Workshop on Linked Data in Linguistics: Resources and Applications, pages 64–69, 2015.
Alessandro Lenci and Giulia Benotto. Identifying hypernyms in distributional semantic spaces. In *SEM 2012: The First Joint Conference on Lexical and Computational Semantics–Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012), pages 75–79, 2012.
Julie Weeds, Daoud Clarke, Jeremy Reffin, David Weir, and Bill Keller. Learning to distinguish hypernyms and co-hyponyms. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pages 2249–2259. Dublin City University and Association for Computational Linguistics, 2014.