Annotation methodology of QA resources #103

lintool · 2020-04-24T15:45:59Z

Hi there, thanks for sharing your QA resource!
https://github.com/deepset-ai/COVID-QA/tree/master/data/question-answering

I was wondering if you have a write-up of the annotation methodology? For example, how were the documents selected, how were the questions generated, guidelines for marking the extent of the spans, etc.

Thanks in advance!

Timoeller · 2020-04-25T10:23:22Z

Hey @lintool
thanks for looking into the annotations we open-sourced. We really liked your work on BERTserini and how Dirk used OSS frameworks for a CORD-19 semantic search. We are currently also working on better retrievers in our semantic search framework, Haystack.

About your question:

We have been using our own SQuAD-style annotation tool, where annotators read a document, formulate questions about its content, and highlight the corresponding answers. Here you can find an introductory video on the labeling tool and the annotation process.
Annotations are done on a volunteer basis by medical experts (MSc or higher), and we are especially grateful to Anthony Reina for onboarding new annotators and supervising the process.
The documents are a subset of CORD-19 papers that annotators deemed related to COVID. (Hopefully Tony can give more insight into the process?)
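
For anyone unfamiliar with the SQuAD layout, here is a minimal sketch of how such annotations can be read and sanity-checked. It assumes the files follow the standard SQuAD v1.1 JSON structure (`data`, then `paragraphs`, then `qas`, then `answers` with `text` and `answer_start`). The toy record and the helper name `iter_answer_spans` are made up for illustration and are not taken from the dataset or the annotation tool.

```python
import json

# Hypothetical toy record in SQuAD v1.1 layout (NOT real COVID-QA data).
SAMPLE = json.loads("""
{"data": [{"title": "toy-doc",
  "paragraphs": [{"context": "Coronaviruses are enveloped RNA viruses.",
    "qas": [{"id": "q1",
      "question": "What kind of viruses are coronaviruses?",
      "answers": [{"text": "enveloped RNA viruses", "answer_start": 18}]}]}]}]}
""")


def iter_answer_spans(squad):
    """Yield (id, question, answer text, aligned) for every annotated answer.

    `aligned` is True when the character offset `answer_start` really
    points at the answer text inside the paragraph context.
    """
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                for answer in qa["answers"]:
                    start = answer["answer_start"]
                    end = start + len(answer["text"])
                    aligned = context[start:end] == answer["text"]
                    yield qa["id"], qa["question"], answer["text"], aligned


rows = list(iter_answer_spans(SAMPLE))
for qa_id, question, text, aligned in rows:
    print(qa_id, question, repr(text), "aligned" if aligned else "MISALIGNED")
```

To run the same check on a real file, the inline sample could be replaced with `json.load(open(path))`, where `path` points to whichever annotation file you downloaded.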

Can we somehow assist you in using these labels?

lintool · 2020-04-25T11:15:53Z

Hi @Timoeller - Thanks for your response. We've also been working on building test collections, but via a slightly different approach: https://arxiv.org/abs/2004.11339

I was wondering if you'd be interested in coordinating efforts more closely? If so, let's connect directly over email.

tonyreina · 2020-04-25T20:37:43Z

Yes. We'd love to coordinate our efforts. Please reach out directly to either me (Tony) or Timo. Thanks so much.

lintool · 2020-04-26T10:39:10Z

What's your email? Or you can find mine on my website: https://cs.uwaterloo.ca/~jimmylin/index.html

Timoeller added the question label Apr 25, 2020

Timoeller self-assigned this Apr 25, 2020

aaronbriel mentioned this issue Oct 15, 2020

Document Retrieval for extractive QA with COVID-QA #108

Closed
