
Heart disease is a major global health concern that affects millions of people around the world. Early detection and accurate prediction of heart disease can help to prevent the progression of the disease and save lives. In this project, we aim to develop a predictive model for heart disease using various machine learning algorithms.


shrijayan/Heart-Disease-Prediction


Heart Disease Prediction

This project predicts the risk of heart disease using a real dataset that we obtained from a medical university in Chennai.

DATASET

The dataset we used contains more than 150 patient records together with data on the patients' daily routines, so our model is trained on real-world data. The dataset contains many features, such as age, dietary habits, and whether or not the patient consumes alcohol.

Dataset : https://archive.ics.uci.edu/ml/datasets/Heart+Disease

Data Preprocessing:

We preprocessed our dataset using scikit-learn (sklearn):

The sklearn.preprocessing package provides several common utility functions and transformer classes to change raw feature vectors into a representation that is more suitable for the downstream estimators. In general, learning algorithms benefit from standardization of the data set.
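As a rough illustration of the standardization mentioned above, here is a minimal pure-Python sketch. The README does not say which sklearn.preprocessing transformer was used, so this assumes plain zero-mean, unit-variance scaling, which is the computation that scikit-learn's StandardScaler performs. The function name `standardize` and the toy records are hypothetical.

```python
from statistics import fmean, pstdev


def standardize(train_rows, test_rows):
    """Scale each feature column to zero mean and unit variance.

    The mean and standard deviation are computed on the training rows
    only and then reused for the test rows, so that no information
    from the test set leaks into the preprocessing step.
    """
    columns = list(zip(*train_rows))
    means = [fmean(col) for col in columns]
    # Population standard deviation; use 1.0 for constant columns
    # to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]

    def scale(rows):
        return [
            [(value - m) / s for value, m, s in zip(row, means, stds)]
            for row in rows
        ]

    return scale(train_rows), scale(test_rows)


# Hypothetical toy records (not taken from the project's dataset):
# [age, resting blood pressure]
train = [[63, 145], [37, 130], [41, 130], [56, 120]]
test = [[57, 140]]

train_scaled, test_scaled = standardize(train, test)
```

Computing the statistics on the training rows alone is a deliberate choice here: the testing set is meant to stand in for unseen patients, so it should not influence how the features are scaled.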

Models Used:

  1. SVM
  2. Naive Bayes
  3. Logistic Regression
  4. Decision Tree
  5. Random Forest
  6. LightGBM
  7. XGBoost
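A comparison across several models like the ones listed above could be organized roughly as in the following sketch. It is not the project's actual code: it assumes the reported scores are plain accuracy (the fraction of correct predictions), and it accepts any object that exposes `fit` and `predict`, which is the interface offered by scikit-learn classifiers and by the scikit-learn style wrappers in LightGBM and XGBoost. The names `accuracy`, `compare_models` and `MajorityClassBaseline` are hypothetical, and the baseline class only exists so the sketch runs without extra dependencies.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def compare_models(models, X_train, y_train, X_test, y_test):
    """Fit each model and return {name: (train_accuracy, test_accuracy)}."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        results[name] = (
            accuracy(y_train, model.predict(X_train)),
            accuracy(y_test, model.predict(X_test)),
        )
    return results


class MajorityClassBaseline:
    """Hypothetical stand-in model: always predicts the most common training label."""

    def fit(self, X, y):
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


# Hypothetical toy data (not taken from the project's dataset).
X_train = [[63], [37], [41], [56], [57]]
y_train = [1, 0, 0, 1, 1]
X_test = [[44], [52]]
y_test = [0, 1]

results = compare_models(
    {"Baseline": MajorityClassBaseline()},
    X_train, y_train, X_test, y_test,
)
```

To compare real classifiers, the dictionary passed to `compare_models` would hold one estimator instance per model name in place of the baseline.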

Model Prediction Results:

SVM: Support Vector Machine

    ~Training Set Prediction : 0.6694214876033058
    
    ~Testing Set Prediction : 0.5737704918032787

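Naive Bayes: A probabilistic classifier that applies Bayes' theorem under the assumption that the features are conditionally independent given the class.

    ~Training Set Prediction : 0.8677685950413223

    ~Testing Set Prediction : 0.7868852459016393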

Logistic Regression: Logistic regression estimates the probability of disease based on a given set of independent variables.

    ~Training Set Prediction : 0.8636363636363636
    
    ~Testing Set Prediction : 0.8032786885245902

Decision Tree: A decision tree is a non-parametric supervised learning algorithm, which is utilized for both classification and regression tasks.

    ~Training Set Prediction : 1.0

    ~Testing Set Prediction : 0.7704918032786885

Random Forest: An ensemble of decision trees, each trained on a bootstrap sample of the data, whose predictions are combined by majority vote.

    ~Training Set Prediction : 1.0
    
    ~Testing Set Prediction : 0.7704918032786885

LightGBM: A Gradient Boosting Decision Tree (GBDT) implementation that adds two techniques: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB).

    ~Training Set Prediction : 0.9958677685950413
    
    ~Testing Set Prediction : 0.7704918032786885

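XGBoost: A gradient boosting library that adds decision trees one boosting iteration at a time, each new tree correcting the errors of the trees before it. When run on a GPU, it builds the decision tree for a given boosting iteration one level at a time, processing the entire dataset concurrently on the GPU.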

    ~Training Set Prediction : 0.987603305785124
    
    ~Testing Set Prediction : 0.7540983606557377      

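CONCLUDED MODEL FROM PREDICTION

From the prediction results, Logistic Regression is the most suitable model: it has the highest testing set score (about 0.80) and the smallest gap between its training and testing scores. Decision Tree, Random Forest, LightGBM and XGBoost score close to 1.0 on the training set but no higher than about 0.77 on the testing set, which suggests overfitting.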
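The comparison behind this conclusion can be reproduced from the scores reported above. The short sketch below is only an illustration, with hypothetical variable names: it ranks the models by testing set score and computes the gap between training and testing scores, where a large gap is a common sign of overfitting.

```python
# (training score, testing score) for each model, as reported above.
scores = {
    "SVM": (0.6694214876033058, 0.5737704918032787),
    "Naive Bayes": (0.8677685950413223, 0.7868852459016393),
    "Logistic Regression": (0.8636363636363636, 0.8032786885245902),
    "Decision Tree": (1.0, 0.7704918032786885),
    "Random Forest": (1.0, 0.7704918032786885),
    "LightGBM": (0.9958677685950413, 0.7704918032786885),
    "XGBoost": (0.987603305785124, 0.7540983606557377),
}

# Difference between training and testing score for each model.
gaps = {name: train - test for name, (train, test) in scores.items()}

# Model with the highest testing score, and model with the smallest gap.
best_test = max(scores, key=lambda name: scores[name][1])
smallest_gap = min(gaps, key=gaps.get)

# Print the models from highest to lowest testing score.
ranked = sorted(scores.items(), key=lambda item: item[1][1], reverse=True)
for name, (train, test) in ranked:
    print(f"{name:<20} train={train:.3f} test={test:.3f} gap={gaps[name]:.3f}")
```

Both `best_test` and `smallest_gap` come out as Logistic Regression for these numbers.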
