EBM-NLP

This is the part of the class project for CSC 585 - Algorithms in NLP - at University of Arizona.

This is a multi-level named entity recognition problem. The dataset is described in the paper https://arxiv.org/abs/1806.04185 which contains multi-level annotations of patients, interventions and outcomes for medical literature.

There are two tasks for the two levels of annotations in the dataset. The first task is identification of P (patients), I (interventions), and O(outcomes) span. The baseline model implementation for this task uses a bidirectional LSTM with CRF tagger on top https://github.com/bepnye/EBM-NLP/ .

The second task is token level detailed labeling of the PIO spans. This baseline hierarchical labeling is performed using stanford Named Entity Recognizer https://nlp.stanford.edu/software/CRF-NER.html

The models can be evaluated using docker. The embeddings and trained models are downloaded from box university account for evaluation.

Nested Named Entity Recognition

Our approach is based on the paper "A neural layered model for nested named entity recognition" https://aclweb.org/anthology/N18-1131

The paper detects entities using a flat NER layer described in the paper from Lample etal. "Neural architectures for named entity recognition" https://arxiv.org/pdf/1603.01360.pdf starting from inner entities. On detecting the inner entities, they merge/average the contextual embedding for tokens in the inner entities, and feed them to another flat NER layer for outer entity detection. They stack the layers on top of each other until no new entities are detected.

Our Approach

Our approach is a modification of the previous approach from "A neural layered model for nested named entity recognition" https://aclweb.org/anthology/N18-1131 where we start the entity detection from outer entitities and move to inner entities. This is motivated by the fact that outer spans are more accurately predicted by baseline models and there is a higher agreement between annotators when labelling outer spans (i.e. pio spans) compared to detailed spans.

Models Trained Using

OS: MacOS High Sierra v 10.13.6 2.9 GHz Intel Core i5 16 GB DDR3 RAM

Docker Environment

OS: Ubuntu 16.04 Python 3.5

Steps to run docker code

git clone https://github.com/devkotasabin/EBM-NLP.git
cd EBM-NLP
sudo docker build -t devkota-sabin-hw3 .
sudo docker run --name devkotasabin devkota-sabin-hw3

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
acl_scripts		acl_scripts
dev_test_results		dev_test_results
layered-bilstm-crf		layered-bilstm-crf
nlp_project		nlp_project
stanford-ner		stanford-ner
.dockerignore		.dockerignore
Dockerfile		Dockerfile
README.md		README.md
README.txt		README.txt
ebm_nlp_1_00.tar.gz		ebm_nlp_1_00.tar.gz
ebm_nlp_demo.py		ebm_nlp_demo.py
eval.sh		eval.sh
runall.sh		runall.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

acl_scripts

acl_scripts

dev_test_results

dev_test_results

layered-bilstm-crf

layered-bilstm-crf

nlp_project

nlp_project

stanford-ner

stanford-ner

.dockerignore

.dockerignore

Dockerfile

Dockerfile

README.md

README.md

README.txt

README.txt

ebm_nlp_1_00.tar.gz

ebm_nlp_1_00.tar.gz

ebm_nlp_demo.py

ebm_nlp_demo.py

eval.sh

eval.sh

runall.sh

runall.sh

Repository files navigation

EBM-NLP

Nested Named Entity Recognition

Our Approach

Models Trained Using

Docker Environment

Steps to run docker code

About

Releases

Packages

Languages

devkotasabin/EBM-NLP

Folders and files

Latest commit

History

Repository files navigation

EBM-NLP

Nested Named Entity Recognition

Our Approach

Models Trained Using

Docker Environment

Steps to run docker code

About

Resources

Stars

Watchers

Forks

Languages