Skip to content

IAmS4n/TextGenerationEvaluationMetrics

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Jointly Measuring Diversity and Quality in Text Generation Models

This is the implementation of metrics for measuring Diversity and Quality, which are introduced in this paper. Besides, some other metrics exist.

For BLEU and Self-BLEU, this hyperformance implementation is used.

Sample Usage

Multiset distances

Here is an example to compute MS-Jaccard distance. The input of these metrics is a list of tokenized sentences.

from multiset_distances import MultisetDistances

ref1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'that', 'ensures', 'that', 'the', 'military', 'will', 'forever', 'heed', 'Party', 'commands']
ref2 = ['It', 'is', 'the', 'guiding', 'principle', 'which', 'guarantees', 'the', 'military', 'forces', 'always', 'being', 'under', 'the', 'command', 'of', 'the', 'Party']
ref3 = ['It', 'is', 'the', 'practical', 'guide', 'for', 'the', 'army', 'always', 'to', 'heed', 'the', 'directions', 'of', 'the', 'party']
sen1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'which', 'ensures', 'that', 'the', 'military', 'always', 'obeys', 'the', 'commands', 'of', 'the', 'party']
sen2 = ['he', 'read', 'the', 'book', 'because', 'he', 'was', 'interested', 'in', 'world', 'history']

references = [ref1, ref2, ref3]
sentences = [sen1, sen2]

msd = MultisetDistances(references=references)
msj_distance = msd.get_jaccard_score(sentences=sentences)

The value of msj_distance is {3: 0.17, 4: 0.13, 5: 0.09}, which shows MS-Jaccard for 3-gram, 4-garm and 5-gram, respectively.

BERT based distances

Here is an example to compute FBD and EMBD distance. The input of these metrics is a list of strings, and BERT tokenizer is used in the code.

from bert_distances import FBD, EMBD
references = ["that is very good", "it is great"]
sentences1 = ["this is nice", "that is good"]
sentences2 = ["it is bad", "this is very bad"]

fbd = FBD(references=references, model_name="bert-base-uncased", bert_model_dir="/tmp/Bert/")
fbd_distance_sentences1 = fbd.get_score(sentences=sentences1)
fbd_distance_sentences2 = fbd.get_score(sentences=sentences2)
# fbd_distance_sentences1 = 17.8, fbd_distance_sentences2 = 22.0

embd = EMBD(references=references, model_name="bert-base-uncased", bert_model_dir="/tmp/Bert/")
embd_distance_sentences1 = embd.get_score(sentences=sentences1)
embd_distance_sentences2 = embd.get_score(sentences=sentences2)
# embd_distance_sentences1 = 10.9, embd_distance_sentences2 = 20.4

Resources

Citation

Please cite our paper if it helps with your research.

@misc{montahaei2019jointly,
    title={Jointly Measuring Diversity and Quality in Text Generation Models},
    author={Ehsan Montahaei and Danial Alihosseini and Mahdieh Soleymani Baghshah},
    year={2019},
    eprint={1904.03971},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}