# EMNLP 2017 submission

This repository contains the dataset and statistical analysis code released with the EMNLP 2017 paper "Why We Need New Evaluation Metrics for NLG".

File descriptions:

- `emnlp_data_individual_hum_scores.csv` – the dataset with system outputs and the evaluation ratings of 3 crowd-workers for each output
- `emnlp_data.csv` – the dataset with system outputs, original human references, scores of automatic metrics, and medians of human ratings
- `analysis_emnlp.R` – R code with the statistical analysis discussed in the paper; a minimal sketch for loading the data files is shown below
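
As a starting point, a minimal R sketch for loading the two CSV files (assuming they are in the working directory; the column layouts are not documented in this README, so the snippet only inspects them):

```r
# Load the released data files named in this README.
individual <- read.csv("emnlp_data_individual_hum_scores.csv")
aggregated <- read.csv("emnlp_data.csv")

# Inspect the column layouts, which are not documented here.
str(individual)  # per-output ratings from 3 crowd-workers
str(aggregated)  # outputs, references, metric scores, median human ratings
```

The full statistical analysis from the paper is in `analysis_emnlp.R`.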

Citing the paper:

Jekaterina Novikova, Ondrej Dusek, Amanda Cercas-Curry and Verena Rieser (2017): Why We Need New Evaluation Metrics for NLG. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2017), Copenhagen, Denmark.