
Code base for the paper: Interpretable machine learning with tree-based shapley additive explanations: application to metabolomics datasets for binary classification

obifarin/shap-iml-metabolomics


shap-metabolomics

Code base for the study: "Interpretable machine learning with tree-based shapley additive explanations: application to metabolomics datasets for binary classification."

Link to Preprint

Requirements

1. Git repository

Clone a local copy of the Git repository:

git clone https://github.com/obifarin/shap-iml-metabolomics

(or use a git GUI client of your choice)

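2. Environment setup and Jupyter notebook access

Set up the Python environment from pls-da-shap.yml in the terminal (for example, with conda env create -f pls-da-shap.yml), then activate it and launch Jupyter to open the notebooks.

(or use the Anaconda GUI)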

3. Notebook contents

| Name | Description |
| --- | --- |
| 01_Sex_MTBLS404.ipynb | Discriminating biological sex via urine metabolomics. |
| 02_HighFatDiet_MTBLS547.ipynb | The impact of a high-fat diet on bile acids in the cecum. |
| 03_Adenocarcinoma_ST000369.ipynb | Detecting adenocarcinoma via serum metabolomics. |
| 04-pubmed-metabolomics.ipynb | Keyword occurrences by year for partial least squares regression, random forest, and gradient boosting in metabolomics publications on PubMed. |

Code base notes

  • The Anaconda environment for this work is defined in pls-da-shap.yml.
  • The important models are saved in the saved_models folder.
  • The PyChemometrics folder contains the library used for PLS-DA computation in this work. Some PLS-DA code will not run unless this folder is in the directory from which you run the code.
  • The data folder contains the raw data used in this study, as prepared by Mendez et al. in their paper.
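
Since some PLS-DA code depends on where you run it from, a quick check of the working directory before opening the notebooks may save a failed import. The snippet below is only a minimal sketch and is not part of the repository. The helper name missing_folders is hypothetical, and the folder names are taken from the notes above; the exact spelling and case of the data folder on disk is an assumption.

```python
from pathlib import Path

# Folder names taken from the notes above. The exact spelling and case on disk
# (especially "data") is an assumption; adjust to match your checkout.
EXPECTED_FOLDERS = ("PyChemometrics", "saved_models", "data")


def missing_folders(root=".", expected=EXPECTED_FOLDERS):
    """Return the expected folder names that are not directories under root."""
    root_path = Path(root)
    return [name for name in expected if not (root_path / name).is_dir()]


if __name__ == "__main__":
    missing = missing_folders()
    if missing:
        print("Missing folders in the working directory:", ", ".join(missing))
    else:
        print("All expected folders found.")
```

Run it from the directory in which you plan to start Jupyter. If PyChemometrics is reported as missing, change to the repository root before running the PLS-DA cells.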
