GenDef: PyTorch Implementation

Requirements: Python 3.6, PyTorch 1.1.0

This repository contains the official PyTorch implementation of the following paper:

"What Does This Word Mean? Explaining Contextualized Embeddings with Natural Language Definition", EMNLP-IJCNLP 2019
Ting-Yun Chang, Yun-Nung Chen

Abstract: Contextualized word embeddings have boosted many NLP tasks compared with classic word embeddings. However, the word with a specific sense may have different contextualized embeddings due to its various contexts. To further investigate what contextualized word embeddings capture, this paper analyzes whether they can indicate the corresponding sense definitions and proposes a general framework that is capable of explaining word meanings given contextualized word embeddings for better interpretation. The experiments show that both ELMo and BERT embeddings can be well interpreted via a readable textual form, and the findings may benefit the research community for better understanding what the embeddings capture.

Demo website: http://140.112.29.239:5000/

Download the Pre-trained Network: https://miulab.myDS.me:5001/sharing/nkV8UPN2s

Paper: https://www.aclweb.org/anthology/D19-1627.pdf

Long Version: https://www.airitilibrary.com/Publication/alDetailedMesh1?DocID=U0001-1902202000330000

Before Training

Encode the Definitions

https://tfhub.dev/google/universal-sentence-encoder-large/3
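The pipeline expects each dictionary definition pre-encoded as a fixed-size vector by the Universal Sentence Encoder linked above. A minimal sketch of that step, assuming a list of definition strings and an output file of our own naming (USE v3 uses the TF1-style `hub.Module` API; the `batched` helper is illustrative, not part of this repository):

```python
def batched(items, size):
    """Yield successive chunks of `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

if __name__ == "__main__":
    # Heavy dependencies only needed when actually encoding (hypothetical usage).
    import numpy as np
    import tensorflow as tf
    import tensorflow_hub as hub

    embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder-large/3")
    definitions = ["a domesticated carnivorous mammal", "a financial institution"]  # placeholder data
    with tf.Session() as sess:
        sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
        chunks = [sess.run(embed(chunk)) for chunk in batched(definitions, 64)]
    np.save("def_vectors.npy", np.concatenate(chunks))  # hypothetical output path
```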

Extract Features from BERT

https://pypi.org/project/pytorch-pretrained-bert/
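`pytorch-pretrained-bert` returns one hidden-state tensor per layer, from which the contextual vector of a target word can be taken. A hedged sketch of the extraction (the `wordpiece_span` helper, model choice, and target position are our own illustration; WordPiece splits a word into a head token plus "##"-prefixed continuations):

```python
def wordpiece_span(tokens, start):
    """Return the end index (exclusive) of the word whose first WordPiece
    token sits at `start` ('##' marks continuation pieces)."""
    end = start + 1
    while end < len(tokens) and tokens[end].startswith("##"):
        end += 1
    return end

if __name__ == "__main__":
    # Download-triggering part: a sketch of the library's API, not the repo's script.
    import torch
    from pytorch_pretrained_bert import BertTokenizer, BertModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased").eval()

    tokens = ["[CLS]"] + tokenizer.tokenize("an embedding of a word") + ["[SEP]"]
    ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
    with torch.no_grad():
        layers, _ = model(ids)            # list of 12 (1, seq_len, 768) tensors
    span = wordpiece_span(tokens, 2)      # hypothetical target word position
    vec = layers[-1][0, 2:span].mean(0)   # average the word's pieces -> 768-dim
```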

Extract Features from ELMo

https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md
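ELMo produces three 1024-dim layers per token, which is why the ELMo training command below passes `--n_feats 3`. A sketch using allennlp's `ElmoEmbedder`; the `mix_layers` helper is only our pure-Python illustration of what a scalar layer mix computes, not this repository's code:

```python
def mix_layers(layer_vecs, weights):
    """Weighted sum of per-layer vectors for one token (illustration of
    scalar layer mixing over ELMo's three layers)."""
    dim = len(layer_vecs[0])
    return [sum(w * v[i] for w, v in zip(weights, layer_vecs)) for i in range(dim)]

if __name__ == "__main__":
    # Hypothetical usage; downloads the default ELMo weights on first run.
    from allennlp.commands.elmo import ElmoEmbedder

    elmo = ElmoEmbedder()
    vectors = elmo.embed_sentence(["The", "bank", "was", "closed"])
    # vectors has shape (3, 4, 1024): 3 layers, 4 tokens, 1024 dims
    token_layers = [vectors[layer][1].tolist() for layer in range(3)]  # token "bank"
    mixed = mix_layers(token_layers, [1 / 3] * 3)
```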

Get Pretrained Static Word Embeddings

https://fasttext.cc/docs/en/english-vectors.html
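The fastText `.vec` files linked above are plain text: a `count dim` header line, then one token and its float components per line. A small loader sketch (the function name and `limit` parameter are our own, not part of this repository):

```python
def load_vectors(path, limit=None):
    """Parse a fastText .vec file into {word: [float, ...]}."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        count, dim = map(int, f.readline().split())  # header: vocab size, dimension
        for i, line in enumerate(f):
            if limit is not None and i >= limit:
                break
            word, *values = line.rstrip().split(" ")
            vectors[word] = [float(x) for x in values]
    return vectors
```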

Train

BERT-base
$ python3 main.py --model_type BERT_base --emb1_dim 768 --train_ctxVec YOUR_PATH --val_ctxVec YOUR_PATH
BERT-large
$ python3 main.py --model_type BERT_large --emb1_dim 1024 --train_ctxVec YOUR_PATH --val_ctxVec YOUR_PATH
ELMo
$ python3 main.py --model_type ELMo --emb1_dim 1024 --n_feats 3 --train_ctxVec YOUR_PATH --val_ctxVec YOUR_PATH
Baseline
$ python3 main.py --model_type baseline --emb1_dim 812 --train_ctxVec YOUR_PATH --val_ctxVec YOUR_PATH

Evaluation

Test and view the mapping result

BERT-base
$ python3 main.py --test --model_type BERT_base --emb1_dim 768 --test_ctxVec YOUR_PATH --visualize
BERT-large
$ python3 main.py --test --model_type BERT_large --emb1_dim 1024 --test_ctxVec YOUR_PATH --visualize
ELMo
$ python3 main.py --test --model_type ELMo --emb1_dim 1024 --n_feats 3 --test_ctxVec YOUR_PATH --visualize
Baseline
$ python3 main.py --test --model_type baseline --emb1_dim 812 --test_ctxVec YOUR_PATH --visualize

Test Online

$ python3 online_inference.py --auto --model_type [baseline, ELMo, BERT_base, BERT_large]

Sort the result

$ python3 sort_result.py logs/YOUR_FILENAME.txt

Get Average Scores

$ python3 avg_score.py logs/YOUR_FILENAME.txt

Get BLEU / ROUGE Scores

$ python3 get_bleu_rouge.py logs/YOUR_FILENAME.txt
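`get_bleu_rouge.py` scores generated definitions against reference ones. For intuition, the core building block of BLEU is clipped n-gram precision; a pure-Python unigram sketch (our illustration, not the script's actual implementation):

```python
from collections import Counter

def unigram_precision(candidate, reference):
    """Clipped unigram precision: each candidate word counts at most as
    often as it appears in the reference."""
    cand = Counter(candidate)
    ref = Counter(reference)
    clipped = sum(min(count, ref[word]) for word, count in cand.items())
    return clipped / max(sum(cand.values()), 1)
```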
