UnMasked
Score masked language models on grammatical knowledge test suites.

Test Suites

  • Zorro
  • BLiMP

Models

  • BabyBERTa
  • RoBERTa-base

Scoring Methods

  • holistic scoring (i.e., the sum of cross-entropy errors at every token)
  • MLM-scoring (i.e., the sum of pseudo-log-likelihoods)

MLM-scoring was proposed by Salazar et al., 2019. This method computes pseudo-log-likelihoods, which requires masking each word in the input one at a time. In contrast, the holistic scoring procedure proposed by Zaczynska et al., 2020 does not use mask symbols; instead, it sums the cross-entropy errors over every token in the input.
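
Below is a minimal sketch of both scoring methods, assuming Hugging Face transformers and a RoBERTa-style checkpoint. The model name and function names are illustrative, not this repository's actual code:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# illustrative checkpoint; any masked language model works the same way
tokenizer = AutoTokenizer.from_pretrained('roberta-base')
model = AutoModelForMaskedLM.from_pretrained('roberta-base')
model.eval()


def holistic_score(sentence: str) -> float:
    # no mask symbols: feed the intact sentence and sum the log-probability
    # (negative cross-entropy) assigned to each token actually present
    input_ids = tokenizer(sentence, return_tensors='pt')['input_ids']
    with torch.no_grad():
        logits = model(input_ids).logits  # [1, seq_len, vocab_size]
    log_probs = torch.log_softmax(logits, dim=-1)
    token_log_probs = log_probs[0].gather(1, input_ids[0].unsqueeze(1))
    return token_log_probs.sum().item()


def mlm_score(sentence: str) -> float:
    # pseudo-log-likelihood: mask one position at a time and sum the
    # log-probability of the true token at the masked position
    input_ids = tokenizer(sentence, return_tensors='pt')['input_ids'][0]
    total = 0.0
    for i in range(1, len(input_ids) - 1):  # skip <s> and </s>
        masked = input_ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits
        total += torch.log_softmax(logits[0, i], dim=-1)[input_ids[i]].item()
    return total
```

Under either method, higher scores indicate sentences the model considers more probable; on a minimal pair from a test suite, the model is credited when the grammatical sentence receives the higher score.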

These two methods can produce very different results. The holistic method favors models that were trained without predicting unmasked tokens, and handicaps those that were trained to predict unmasked tokens (all RoBERTa models except BabyBERTa). The MLM method does not handicap models trained to predict unmasked tokens, because it uses mask symbols to compute scores, which ensures that a model never has access to information in the input about which word it should predict.
