elleros/DSHealth2019_loinc_embeddings

LOINC Embeddings

This repository provides the Python code and the Word2Vec embeddings needed to reproduce the scatter plots in the KDD 2019 DSHealth Workshop paper "Evaluation of Embeddings of Laboratory Test Codes for Patients at a Cancer Center": arxiv.org/abs/1907.09600. The code can be used as a starting point for further in-depth exploration of the embeddings.

The embeddings were trained via Word2Vec skip-gram on EHR data from the City of Hope National Medical Center. See the paper for details on the training.

If you produce interesting visualizations of the embeddings, email me at lorenzo [dot] rossi [at] gmail.com (lrossi [at] coh.org).

Citation

If you use the material in your work, please cite our paper. BibTeX entry:

@inproceedings{larossi2019evaluation,
  title={Evaluation of Embeddings of Laboratory Test Codes for Patients at a Cancer Center},
  author={Rossi, Lorenzo A and Shawber, Chad and Munu, Janet and Zachariah, Finly},
  booktitle={KDD Workshop on Applied Data Science for Healthcare (DSHealth)},
  year={2019}
}

Contents

  • tsne_plot_kdd_dshealth2019.ipynb: Jupyter notebook to generate the t-SNE plot.
  • Data/: folder containing the Word2Vec embeddings for the LOINC codes, as well as other files used to produce the t-SNE plot.

Note that you need to download the official LOINC Table CSV file from loinc.org. The table provides a taxonomy of the LOINC codes, which determines the classes shown in different colors in the scatter plot.
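To illustrate the idea behind the plot, here is a minimal sketch of projecting embedding vectors to 2-D with t-SNE and grouping the points by class. It is not the notebook's code: the vectors are random stand-in data, the class labels are only illustrative, and the helper `project_embeddings` and its parameter values are assumptions. It relies on NumPy and scikit-learn.

```python
import numpy as np
from sklearn.manifold import TSNE


def project_embeddings(vectors, perplexity=5.0, seed=0):
    """Project (n_codes, dim) embedding vectors to 2-D with t-SNE."""
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=seed)
    return tsne.fit_transform(np.asarray(vectors, dtype=np.float32))


# Stand-in data: random vectors and illustrative labels, NOT the repository's embeddings.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(40, 16))
classes = ["CHEM" if i % 2 == 0 else "HEM/BC" for i in range(40)]

points = project_embeddings(vectors)

# Group the 2-D points by class so each group can be drawn in its own color.
by_class = {}
for point, label in zip(points, classes):
    by_class.setdefault(label, []).append(point)
```

Each entry of `by_class` can then be passed to a plotting library as a separate scatter series, one color per class. With the real embeddings, the perplexity would likely need to be larger than the small value used for this toy data.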

How to download the official LOINC Table CSV file

  • Create an account on LOINC.org (it's free) and log in
  • Click on 'Downloads' (menu at the top of the page)
  • Click on 'LOINC Table'
  • Click on 'LOINC Table File (CSV)'
  • Review and accept the Copyright and Terms of Use notice
  • Click on 'Download': a zip archive will be downloaded to your machine
  • Extract Loinc.csv from the archive
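The last step can also be scripted. The sketch below pulls Loinc.csv out of the downloaded archive and builds a code-to-class lookup. The helper names `extract_loinc_csv` and `load_code_classes` are hypothetical, and it assumes the taxonomy used for coloring comes from the `CLASS` column of the table, keyed by `LOINC_NUM`; check the notebook for the columns it actually reads.

```python
import csv
import zipfile
from pathlib import Path


def extract_loinc_csv(archive_path, dest_dir="."):
    """Copy Loinc.csv out of the downloaded zip archive into dest_dir."""
    with zipfile.ZipFile(archive_path) as archive:
        # Match on the base name so the file's folder inside the zip does not matter.
        members = [m for m in archive.namelist() if Path(m).name == "Loinc.csv"]
        if not members:
            raise FileNotFoundError(f"Loinc.csv not found in {archive_path}")
        target = Path(dest_dir) / "Loinc.csv"
        target.write_bytes(archive.read(members[0]))
    return target


def load_code_classes(csv_path):
    """Return a dict mapping each LOINC code (LOINC_NUM) to its CLASS value."""
    # Assumption: the CLASS column is the taxonomy of interest.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return {row["LOINC_NUM"]: row["CLASS"] for row in csv.DictReader(handle)}
```

For example, `load_code_classes(extract_loinc_csv("loinc_download.zip"))` would return the lookup, where `loinc_download.zip` is a placeholder for whatever name the downloaded archive has on your machine.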
