GenDef: PyTorch Implementation

Requirements: Python 3.6, PyTorch 1.1.0

This repository contains the official PyTorch implementation of the following paper:

"What Does This Word Mean? Explaining Contextualized Embeddings with Natural Language Definition", EMNLP-IJCNLP 2019
Ting-Yun Chang, Yun-Nung Chen

Abstract: Contextualized word embeddings have boosted many NLP tasks compared with classic word embeddings. However, the word with a specific sense may have different contextualized embeddings due to its various contexts. To further investigate what contextualized word embeddings capture, this paper analyzes whether they can indicate the corresponding sense definitions and proposes a general framework that is capable of explaining word meanings given contextualized word embeddings for better interpretation. The experiments show that both ELMo and BERT embeddings can be well interpreted via a readable textual form, and the findings may benefit the research community for better understanding what the embeddings capture.

Demo website: http://140.112.29.239:5000/

Download the Pre-trained Network: https://miulab.myDS.me:5001/sharing/nkV8UPN2s

Paper: https://www.aclweb.org/anthology/D19-1627.pdf

Long Version: https://www.airitilibrary.com/Publication/alDetailedMesh1?DocID=U0001-1902202000330000

Before Training

Encode the Definitions

https://tfhub.dev/google/universal-sentence-encoder-large/3
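The pipeline expects each dictionary definition pre-encoded as a fixed-size vector by the Universal Sentence Encoder linked above. A minimal sketch of that step, assuming a list of definition strings and an output file of our own naming (USE v3 uses the TF1-style `hub.Module` API; the `batched` helper is illustrative, not part of this repository):

```python
def batched(items, size):
    """Yield successive chunks of `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

if __name__ == "__main__":
    # Heavy dependencies only needed when actually encoding (hypothetical usage).
    import numpy as np
    import tensorflow as tf
    import tensorflow_hub as hub

    embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder-large/3")
    definitions = ["a domesticated carnivorous mammal", "a financial institution"]  # placeholder data
    with tf.Session() as sess:
        sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
        chunks = [sess.run(embed(chunk)) for chunk in batched(definitions, 64)]
    np.save("def_vectors.npy", np.concatenate(chunks))  # hypothetical output path
```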

Extract Features from BERT

https://pypi.org/project/pytorch-pretrained-bert/
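`pytorch-pretrained-bert` returns one hidden-state tensor per layer, from which the contextual vector of a target word can be taken. A hedged sketch of the extraction (the `wordpiece_span` helper, model choice, and target position are our own illustration; WordPiece splits a word into a head token plus "##"-prefixed continuations):

```python
def wordpiece_span(tokens, start):
    """Return the end index (exclusive) of the word whose first WordPiece
    token sits at `start` ('##' marks continuation pieces)."""
    end = start + 1
    while end < len(tokens) and tokens[end].startswith("##"):
        end += 1
    return end

if __name__ == "__main__":
    # Download-triggering part: a sketch of the library's API, not the repo's script.
    import torch
    from pytorch_pretrained_bert import BertTokenizer, BertModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased").eval()

    tokens = ["[CLS]"] + tokenizer.tokenize("an embedding of a word") + ["[SEP]"]
    ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
    with torch.no_grad():
        layers, _ = model(ids)            # list of 12 (1, seq_len, 768) tensors
    span = wordpiece_span(tokens, 2)      # hypothetical target word position
    vec = layers[-1][0, 2:span].mean(0)   # average the word's pieces -> 768-dim
```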

Extract Features from ELMo

https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md
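ELMo produces three 1024-dim layers per token, which is why the ELMo training command below passes `--n_feats 3`. A sketch using allennlp's `ElmoEmbedder`; the `mix_layers` helper is only our pure-Python illustration of what a scalar layer mix computes, not this repository's code:

```python
def mix_layers(layer_vecs, weights):
    """Weighted sum of per-layer vectors for one token (illustration of
    scalar layer mixing over ELMo's three layers)."""
    dim = len(layer_vecs[0])
    return [sum(w * v[i] for w, v in zip(weights, layer_vecs)) for i in range(dim)]

if __name__ == "__main__":
    # Hypothetical usage; downloads the default ELMo weights on first run.
    from allennlp.commands.elmo import ElmoEmbedder

    elmo = ElmoEmbedder()
    vectors = elmo.embed_sentence(["The", "bank", "was", "closed"])
    # vectors has shape (3, 4, 1024): 3 layers, 4 tokens, 1024 dims
    token_layers = [vectors[layer][1].tolist() for layer in range(3)]  # token "bank"
    mixed = mix_layers(token_layers, [1 / 3] * 3)
```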

Get Pretrained Static Word Embeddings

https://fasttext.cc/docs/en/english-vectors.html
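The fastText `.vec` files linked above are plain text: a `count dim` header line, then one token and its float components per line. A small loader sketch (the function name and `limit` parameter are our own, not part of this repository):

```python
def load_vectors(path, limit=None):
    """Parse a fastText .vec file into {word: [float, ...]}."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        count, dim = map(int, f.readline().split())  # header: vocab size, dimension
        for i, line in enumerate(f):
            if limit is not None and i >= limit:
                break
            word, *values = line.rstrip().split(" ")
            vectors[word] = [float(x) for x in values]
    return vectors
```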

Train

BERT-base
$ python3 main.py --model_type BERT_base --emb1_dim 768 --train_ctxVec YOUR_PATH --val_ctxVec YOUR_PATH
BERT-large
$ python3 main.py --model_type BERT_large --emb1_dim 1024 --train_ctxVec YOUR_PATH --val_ctxVec YOUR_PATH
ELMo
$ python3 main.py --model_type ELMo --emb1_dim 1024 --n_feats 3 --train_ctxVec YOUR_PATH --val_ctxVec YOUR_PATH
Baseline
$ python3 main.py --model_type baseline --emb1_dim 812 --train_ctxVec YOUR_PATH --val_ctxVec YOUR_PATH

Evaluation

Test and view the mapping result

BERT-base
$ python3 main.py --test --model_type BERT_base --emb1_dim 768 --test_ctxVec YOUR_PATH --visualize
BERT-large
$ python3 main.py --test --model_type BERT_large --emb1_dim 1024 --test_ctxVec YOUR_PATH --visualize
ELMo
$ python3 main.py --test --model_type ELMo --emb1_dim 1024 --n_feats 3 --test_ctxVec YOUR_PATH --visualize
Baseline
$ python3 main.py --test --model_type baseline --emb1_dim 812 --test_ctxVec YOUR_PATH --visualize

Test Online

$ python3 online_inference.py --auto --model_type [baseline, ELMo, BERT_base, BERT_large]

Sort the result

$ python3 sort_result.py logs/YOUR_FILENAME.txt

Get Average Scores

$ python3 avg_score.py logs/YOUR_FILENAME.txt

Get BLEU / ROUGE Scores

$ python3 get_bleu_rouge.py logs/YOUR_FILENAME.txt
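`get_bleu_rouge.py` scores generated definitions against reference ones. For intuition, the core building block of BLEU is clipped n-gram precision; a pure-Python unigram sketch (our illustration, not the script's actual implementation):

```python
from collections import Counter

def unigram_precision(candidate, reference):
    """Clipped unigram precision: each candidate word counts at most as
    often as it appears in the reference."""
    cand = Counter(candidate)
    ref = Counter(reference)
    clipped = sum(min(count, ref[word]) for word, count in cand.items())
    return clipped / max(sum(cand.values()), 1)
```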
