Annotation methodology of QA resources #103

Open
lintool opened this issue Apr 24, 2020 · 4 comments
Labels
question Further information is requested

Comments

@lintool

lintool commented Apr 24, 2020

Hi there, thanks for sharing your QA resource!
https://github.com/deepset-ai/COVID-QA/tree/master/data/question-answering

I was wondering if you have a write-up of the annotation methodology? For example, how were the documents selected, how were the questions generated, guidelines for marking the extent of the spans, etc.

Thanks in advance!

@Timoeller
Contributor

Hey @lintool,
thanks for looking into the annotations we open-sourced. We really liked your work on BERTserini and how Dirk used OSS frameworks for a CORD-19 semantic search. We are currently also working on better retrievers in our semantic search framework, Haystack.

About your question:

  • We have been using our own SQuAD-style annotation tool, in which annotators read a document, formulate questions about its content, and highlight the corresponding answers. There is an introductory video covering the labeling tool and the annotation process.
  • Annotations are done on a volunteer basis by medical experts (MSc or higher), and we are especially grateful to Anthony Reina for onboarding new annotators and supervising the process.
  • The documents are a subset of CORD-19 papers that annotators deemed related to COVID-19. (Hopefully Tony can give more insight into the process.)
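
For anyone who wants to work with the labels, here is a minimal sketch of reading SQuAD-style annotations in Python. It assumes the standard SQuAD JSON layout (`data` → `paragraphs` → `context` / `qas` → `answers` with `text` and `answer_start`); the sample record and the helper names `iter_annotations` and `find_misaligned` are made up for illustration and are not taken from the repository, so the actual files may carry additional fields.

```python
# Hypothetical SQuAD-style record, invented for illustration only.
SAMPLE = {
    "data": [
        {
            "title": "example-paper",
            "paragraphs": [
                {
                    "context": "Coronaviruses are enveloped RNA viruses.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "What kind of viruses are coronaviruses?",
                            "answers": [
                                {"text": "enveloped RNA viruses", "answer_start": 18}
                            ],
                        }
                    ],
                }
            ],
        }
    ]
}


def iter_annotations(squad):
    """Yield one dict per (question, answer) pair in a SQuAD-style structure."""
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                for answer in qa.get("answers", []):
                    yield {
                        "question": qa["question"],
                        "answer": answer["text"],
                        "start": answer["answer_start"],
                        "context": context,
                    }


def find_misaligned(squad):
    """Return annotations whose character offset does not match the answer text."""
    bad = []
    for row in iter_annotations(squad):
        start, text = row["start"], row["answer"]
        if row["context"][start:start + len(text)] != text:
            bad.append(row)
    return bad


if __name__ == "__main__":
    for row in iter_annotations(SAMPLE):
        print(row["question"], "->", row["answer"])
    print("misaligned spans:", len(find_misaligned(SAMPLE)))
```

To run this against a real file, load it with `json.load` and pass the resulting dictionary to the same functions; the span check is a quick way to confirm that the highlighted answers line up with their offsets.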

Can we somehow assist you in using these labels?

Timoeller added the "question" (Further information is requested) label on Apr 25, 2020
Timoeller self-assigned this on Apr 25, 2020
@lintool
Author

lintool commented Apr 25, 2020

Hi @Timoeller - Thanks for your response. We've also been working on building test collections, but via a slightly different approach: https://arxiv.org/abs/2004.11339

I was wondering if you'd be interested in coordinating efforts more closely. If so, shall we connect directly over email?

@tonyreina

Yes. We'd love to coordinate our efforts. Please reach out directly to either me (Tony) or Timo. Thanks so much.

@lintool
Author

lintool commented Apr 26, 2020

What's your email? Or you can find mine on my website: https://cs.uwaterloo.ca/~jimmylin/index.html
