lujainibrahim/ecg-view-II-machine-learning

Explainable Prediction of Acute Myocardial Infarction using Machine Learning and Shapley Values

This repository is the official implementation of Explainable Prediction of Acute Myocardial Infarction using Machine Learning and Shapley Values published in IEEE Access in November 2020.

Requirements

pip3 install -r requirements.txt
  • To obtain the ECG-ViEW II dataset, please use this form. After receiving the unprocessed files, follow the data processing steps below.

Data Processing

To process the ECG-ViEW II dataset as done in the paper (with robust scaling and SMOTE), run this notebook.

This notebook produces two CSV files, train.csv and test.csv, which you can then use to train and evaluate models.
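To make the two preprocessing steps named above concrete, here is a minimal, dependency-free sketch of what robust scaling and SMOTE do. It is not the notebook's code: the function names, the sample values, and the choice of `k` are assumptions made for illustration. Resampling of this kind is normally applied to the training split only.

```python
import math
import random
import statistics


def robust_scale(column):
    """Centre a numeric column on its median and divide by its interquartile range."""
    q1, median, q3 = statistics.quantiles(column, n=4, method="inclusive")
    iqr = (q3 - q1) or 1.0  # guard against constant columns
    return [(value - median) / iqr for value in column]


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating towards near neighbours.

    Needs at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding the point itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical column with one outlier: the outlier barely affects median and IQR.
print(robust_scale([80, 90, 100, 110, 400]))  # -> [-1.0, -0.5, 0.0, 0.5, 15.0]

# Hypothetical minority-class rows (already scaled), oversampled with 6 new rows.
minority_rows = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(len(smote(minority_rows, n_new=6, k=2)))  # -> 6
```

Because every synthetic row lies on a segment between two real minority rows, the new samples stay inside the region the minority class already occupies.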

Training

  • To train the CNN model in the paper, run this notebook.
  • To train the RNN model in the paper, run this notebook.
  • To train the XGBoost model in the paper, run this notebook.

These notebooks will train the model and save it in a file that can be imported for evaluation later (described in the next section).

Evaluation

  • To evaluate the CNN on the processed ECG-ViEW II data, run this notebook.
  • To evaluate the RNN on the processed ECG-ViEW II data, run this notebook.
  • To evaluate the XGBoost model on the processed ECG-ViEW II data, run this notebook.

To reproduce the results in the paper, use the pretrained models. Additionally, to train and evaluate models without the age and sex features, please see these folders (CNN, RNN).

Pre-trained Models

You can download pretrained models here:

With age and sex:

  • CNN trained on ECG-ViEW II
  • RNN trained on ECG-ViEW II
  • XGBoost trained on ECG-ViEW II

Without age and sex:

  • CNN trained on ECG-ViEW II
  • RNN trained on ECG-ViEW II
  • XGBoost trained on ECG-ViEW II

Results

Our models achieve the following performance:

| Model | Accuracy | F1 Score | AUROC | Sensitivity | Specificity |
|---|---|---|---|---|---|
| CNN | 89.9 % | 89.0 % | 90.7 % | 88.1 % | 93.2 % |
| RNN | 84.6 % | 82.2 % | 82.9 % | 78.0 % | 87.8 % |
| XGBoost | 97.5 % | 97.1 % | 96.5 % | 93.5 % | 99.4 % |
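
For readers unfamiliar with the reported metrics, the sketch below computes them from binary labels using their standard definitions. It is an illustration only: the labels and scores are made up, the function names are hypothetical, and the evaluation notebooks may compute these quantities differently (for example, with a different decision threshold or averaging scheme). It assumes both classes are present and that at least one positive is predicted.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and F1 from 0/1 labels and 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)  # recall on the positive (AMI) class
    specificity = tn / (tn + fp)  # recall on the negative class
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def auroc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 3 positives, 5 negatives, predictions thresholded at 0.5.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
preds = [1 if s >= 0.5 else 0 for s in scores]

print(binary_metrics(labels, preds)["accuracy"])  # -> 0.75
print(auroc(labels, scores))
```

Sensitivity and specificity are reported separately because accuracy alone can look high on imbalanced data even when most positive cases are missed.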

Shapley Analysis

Shapley analysis of the XGBoost model shows that age, ACCI, and QRS duration are the most important variables in predicting the onset of AMI, while sex is relatively less important. The analysis is a promising technique for uncovering the mechanisms of the prediction model, leading to a higher degree of interpretability and transparency.

The local explanation summary (beeswarm) plot gives an overview of how each feature affects the prediction: each dot is the Shapley value of one feature for one sample.

The global feature importance plot shows the mean absolute Shapley value of each feature over the whole test set. Age (Birthyeargroup), ACCI, and QRS duration were the most important features for the prediction.
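
The two plots above rest on the same two quantities: a per-sample, per-feature Shapley value (local) and its mean absolute value over samples (global). The sketch below computes both exactly for a tiny hypothetical model by enumerating feature coalitions. It is not the repository's analysis code: `toy_model`, its coefficients, the baseline row, and the sample rows are all invented for illustration, and exact enumeration is exponential in the number of features, so it only suits toy cases.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model at x; absent features take their baseline value."""
    n = len(x)

    def value(coalition):
        # Features in the coalition use the real value, the rest use the baseline.
        return model([x[i] if i in coalition else baseline[i] for i in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi


def global_importance(model, rows, baseline):
    """Mean absolute Shapley value per feature over all rows."""
    per_row = [shapley_values(model, row, baseline) for row in rows]
    return [sum(abs(phi[i]) for phi in per_row) / len(per_row)
            for i in range(len(baseline))]


def toy_model(row):
    """Hypothetical risk score over (age, ACCI, QRS duration); not the trained model."""
    age, acci, qrs = row
    return 0.03 * age + 0.5 * acci + 0.01 * qrs + 0.002 * age * acci


baseline = [50.0, 2.0, 100.0]                       # hypothetical reference patient
rows = [[70.0, 5.0, 120.0], [40.0, 1.0, 90.0]]      # hypothetical test samples

local = shapley_values(toy_model, rows[0], baseline)
# Efficiency property: the values sum to model(x) - model(baseline).
print(sum(local), toy_model(rows[0]) - toy_model(baseline))
print(global_importance(toy_model, rows, baseline))
```

Each entry of `local` corresponds to one dot in a beeswarm plot, and each entry of the `global_importance` result corresponds to one bar in a global importance plot.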

Contributing

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
