PICO-data for Participants

This repo contains the raw annotations for the PICO dataset used in the paper:

Aggregating and Predicting Sequence Labels from Crowd Annotations. An Thanh Nguyen, Byron C. Wallace, Junyi Jessy Li, Ani Nenkova, and Matthew Lease. Association for Computational Linguistics (ACL), 2017.

An SDK and sample code are provided for retrieving the annotations.

Description

The dataset is in annotations/ and is split into 4 parts:

train/ contains 3549 randomly selected abstracts.
dev/ contains 500 randomly selected abstracts.
test/ contains 500 randomly selected abstracts.
acl17-test/ contains 191 abstracts with annotations by a medical student.

In each folder:

PICO-annos-crowdsourcing.json contains annotations from crowdsourced workers.
PICO-annos-crowdsourcing-agg.json contains aggregated results from the crowdsourced annotations. The aggregation methods are described in the paper cited above, Aggregating and Predicting Sequence Labels from Crowd Annotations.
PICO-annos-professional.json (acl17-test only) contains annotations from a medical student.
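
The folder and file layout above can be turned into a small loading sketch. This README does not describe the JSON schema, so the code below makes no assumption about field names: it only builds a file path and parses the file, trying a single JSON document first and falling back to one JSON object per line. The helper names annotation_path and load_records are hypothetical and are not part of the SDK in src/; the repo's own sample code in src/examples/ remains the reference loader.

```python
import io
import json
import os

# Split folders named in this README.
SPLITS = ("train", "dev", "test", "acl17-test")


def annotation_path(split, filename="PICO-annos-crowdsourcing.json", root="annotations"):
    """Build the path to one annotation file, e.g. annotations/train/<filename>.

    Hypothetical helper: assumes the files sit directly inside each split folder.
    """
    if split not in SPLITS:
        raise ValueError("unknown split: %r" % (split,))
    return os.path.join(root, split, filename)


def load_records(path):
    """Parse an annotation file without assuming its schema.

    Tries the whole file as one JSON document; if that fails, falls back
    to one JSON object per non-empty line (JSON Lines).
    """
    with io.open(path, encoding="utf-8") as handle:
        text = handle.read()
    try:
        return json.loads(text)
    except ValueError:
        # json raises ValueError (JSONDecodeError on Python 3) on extra data.
        return [json.loads(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    path = annotation_path("train")
    if os.path.exists(path):
        records = load_records(path)
        print("%s: loaded %d top-level entries" % (path, len(records)))
    else:
        print("not found: %s (run from the repository root)" % path)
```

The sketch uses only the standard library and runs on both Python 2 and Python 3. Inspecting the first parsed entry is a reasonable way to discover the actual field names before writing any further processing.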

Environment and dependencies:

Python 2
spaCy (for basic tokenization, etc.)
Sample code is in the src/examples/ folder. To run it:

cd src
python -m examples.load_annotation
