Biomedical Interpretable Entity Representations

Diego Garcia-Olano, Yasumasa Onoe, Ioana Baldini, Joydeep Ghosh, Byron Wallace, Kush Varshney
Findings of ACL 2021

Paper: [ https://aclanthology.org/2021.findings-acl.311/ ]
ACL slides: [ pdf ]

@inproceedings{garcia-olano-etal-2021-biomedical,
    title = "Biomedical Interpretable Entity Representations",
    author = "Garcia-Olano, Diego  and
      Onoe, Yasumasa  and
      Baldini, Ioana  and
      Ghosh, Joydeep  and
      Wallace, Byron  and      
      Varshney, Kush",
    booktitle = "Findings of the 59th Annual Meeting of the Association for Computational Linguistics",
    year = "2021",
    publisher = "Association for Computational Linguistics",
}

To use the pre-trained models without re-training BIERs, see the Colab notebooks in the "Replicating downstream task results" section at the bottom.

Installing Dependencies

$ git clone https://github.com/diegoolano/biomedical_interpretable_entity_representations.git
$ cd biomedical_interpretable_entity_representations
$ virtualenv --python="$HOME/envs/py37/bin/python" biomed_env
$ source biomed_env/bin/activate
$ pip install -r requirements.txt

How to train BioMed IER models

See ier_model/train.sh

   Make sure to:
   - set goal to "medwiki",
   - set the training and dev sets,
   - set the paths in transformers_constants.py appropriately,
   - use a GPU with a lot of memory (e.g., a V100 with 32GB) or lower the batch size,
   - set the intervals at which you'd like to report training accuracy, dev-set evaluation accuracy, etc.,
   - set the log location.
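
Before launching a long run, it can help to confirm that the configured locations actually exist. The following is a minimal sketch and not part of the repository: `missing_paths` and the keys `train_data`, `dev_data`, and `log_dir` are hypothetical names, and the temporary directory only stands in for the values you would set in transformers_constants.py and train.sh.

```python
import os
import tempfile


def missing_paths(named_paths):
    """Return the names (sorted) whose configured path does not exist on disk."""
    return sorted(
        name for name, path in named_paths.items() if not os.path.exists(path)
    )


# Hypothetical settings for illustration only; substitute your own paths.
with tempfile.TemporaryDirectory() as tmp:
    example = {
        "train_data": os.path.join(tmp, "train.json"),
        "dev_data": os.path.join(tmp, "dev.json"),
        "log_dir": tmp,
    }
    # Create only the training file so that the dev file is reported as missing.
    open(example["train_data"], "w").close()
    result = missing_paths(example)
    print(result)
```

An empty list means every configured path was found; any name that is returned should be fixed before starting training.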

BIER training data and best models

BIER triples can be found [ here ]

Model files:

BIER-PubMedBERT: [ model ckpt ]
BIER-SciBERT: [ model ckpt ]
BIER-BioBERT: [ model ckpt ]

See the previous section for how to train BIER models on the training data.

See the Colabs below for how to load and use the models on downstream tasks.

Replicating downstream task results

See experiments/README.md for baselines

Clinical NED task using EHR dataset:
Entity Linking Classification on Cancer Genetics dataset:

Connecting PubMed entities to Wiki Categories through UMLS to generate training data

After generating (mention, context, categories) triples, we learn BIERs as follows:

BioMed IER architecture for learning biomed entity representations with interpretable components
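
To make the (mention, context, categories) triples concrete, here is a minimal sketch of how one triple could be turned into a training target. It assumes a multi-label setup in which each output dimension corresponds to one category; the `Triple` class, the helper functions, and the example sentences and categories are invented for illustration and are not taken from the released data or code.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Triple:
    mention: str                 # entity mention text
    context: str                 # sentence containing the mention
    categories: Tuple[str, ...]  # categories linked to the entity


def build_vocab(triples: List[Triple]) -> Dict[str, int]:
    """Map every category seen in the triples to a fixed dimension index."""
    cats = sorted({c for t in triples for c in t.categories})
    return {c: i for i, c in enumerate(cats)}


def multi_hot(triple: Triple, vocab: Dict[str, int]) -> List[float]:
    """Encode a triple's categories as a multi-hot target vector."""
    vec = [0.0] * len(vocab)
    for c in triple.categories:
        if c in vocab:
            vec[vocab[c]] = 1.0
    return vec


# Toy examples, invented for illustration.
triples = [
    Triple("aspirin", "The patient was given aspirin for chest pain.",
           ("Analgesics", "Antiplatelet drugs")),
    Triple("BRCA1", "Mutations in BRCA1 raise breast cancer risk.",
           ("Tumor suppressor genes",)),
]
vocab = build_vocab(triples)
targets = [multi_hot(t, vocab) for t in triples]
print(vocab)
print(targets)
```

Because every dimension is tied to a named category, a predicted vector can be read directly as "how strongly does this mention belong to each category", which is what makes the representation interpretable.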

After learning BIERs, we can test their efficacy in a zero-shot capacity on different biomedical tasks.
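
One simple way to use such representations without task-specific training is to compare them with a similarity measure. The sketch below assumes each entity is a vector of per-category scores and ranks candidates by cosine similarity; the vectors, category names, and function names are made up for illustration, and the experiments in this repository may use a different scoring scheme.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(mention_vec, candidates):
    """Return candidate names ordered from most to least similar to the mention."""
    return sorted(
        candidates,
        key=lambda name: cosine(mention_vec, candidates[name]),
        reverse=True,
    )


def top_categories(vec, category_names, k=2):
    """Return the k category names with the highest scores in vec."""
    order = sorted(range(len(vec)), key=lambda i: vec[i], reverse=True)
    return [category_names[i] for i in order[:k]]


# Toy data, invented for illustration.
category_names = ["Analgesics", "Tumor suppressor genes", "Bacteria"]
mention_vec = [0.9, 0.1, 0.05]
candidates = {
    "aspirin": [0.8, 0.05, 0.1],
    "TP53": [0.05, 0.9, 0.02],
}
ranking = rank_candidates(mention_vec, candidates)
print(ranking)
print(top_categories(mention_vec, category_names, k=1))
```

The highest-scoring categories double as an explanation of why a candidate was ranked first, since they show which dimensions the mention and the candidate share.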
