Skip to content

ValentinaPy/QADiscourse

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 

Repository files navigation

QADiscourse

Discourse Relations as QA Pairs

This repository contains code and data accompanying the EMNLP 2020 paper: "QADiscourse - Discourse Relations as QA Pairs: Representation, Crowdsourcing and Baselines"

QADiscourse Dataset

All data is presented in a .tsv format, with each row corresponding to one QA pair:

0. **qasrl_id** Same sentence IDs that are used by QASRL and QAMR.
1. **sentence** Tokenized sentence.
2. **worker_id** Anonymized worker ID.
3. **full_question** Full question where we joined the question start with the auxiliary or _ if no auxiliary was added and the question body. Not preprocessed.
4. **full_answer** Answer to the question.
5. **question_start** Question Prefix.
6. **question_aux** Auxiliary or _ if none.
7. **question_body** Question body.
8. **answer** Answer.
9. **untokenized sentence** Same sentence as in 1. but untokenized.
10. **target indices for untok sent** Target indices extracted for a specific sentence using heuristics, for the untokenized sentence.

Terms of Use

@inproceedings{pyatkin2020qadiscourse,
  title={QADiscourse-Discourse Relations as QA Pairs: Representation, Crowdsourcing and Baselines},
  author={Pyatkin, Valentina and Klein, Ayal and Tsarfaty, Reut and Dagan, Ido},
  booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  pages={2804--2819},
  year={2020}
}

About

Discourse Relations as QA Pairs

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published