Skip to content

Developed a machine learning model using the Cleveland Heart Disease dataset to accurately predict heart disease presence in individuals based on 14 medical attributes. Conducted comprehensive data exploration, visualization, model selection, training, hyperparameter tuning, and evaluation. Identified crucial features to aid diagnosis and treatment

Notifications You must be signed in to change notification settings

Pratham1234github/HeartSense

Repository files navigation

(HeartSense) Heart Disease Prediction

image

Table of Contents

Introduction

Welcome to the Heart Disease Prediction project! This project aims to predict whether an individual has heart disease based on their medical attributes. Early detection and accurate diagnosis are crucial for effective treatment and management of heart disease. We will utilize machine learning techniques and data analysis to build a predictive model.

image

Problem

The problem we aim to solve is the classification of individuals into two groups: those with heart disease and those without it, using demographic, clinical, and laboratory measurements.

Data

We are using the Cleveland Heart Disease dataset, which contains 303 instances and 14 attributes. This dataset will be used for training and testing our predictive model.

Project Steps

  1. Data Exploration and Preparation: We will start by exploring the dataset, analyzing attribute distributions, checking for missing values, and performing data cleaning and feature selection.

  2. Data Visualization: Visualizing the data will help us gain insights and identify patterns. We'll use various visualization techniques, including scatter plots, histograms, and correlation matrices.

  3. Model Selection and Training: We will evaluate multiple machine learning algorithms, such as logistic regression, decision trees, random forests, and support vector machines, to determine the best-performing model. We'll train and evaluate the model using various metrics.

  4. Hyperparameter Tuning: Fine-tuning the model's hyperparameters will enhance its performance and prevent overfitting.

  5. Model Evaluation and Interpretation: We will evaluate the final model on the test set and interpret its predictions to identify the most important features contributing to the prediction of heart disease.

About the Dataset

The original dataset is from the UCI Machine Learning Repository. It contains the following attributes:

Data Dictionary

  • age: Age of the individual.
  • sex: Gender of the individual (1 = male, 0 = female).
  • cp: Chest-pain type (0 = typical angina, 1 = atypical angina, 2 = non-anginal pain, 3 = asymptotic).
  • trestbps: Resting Blood Pressure (in mmHg).
  • chol: Serum Cholesterol (in mg/dL).
  • fbs: Fasting Blood Sugar (1 if > 120mg/dL, 0 otherwise).
  • restecg: Resting Electrocardiographic Results (0 = normal, 1 = ST-T wave abnormality, 2 = left ventricular hypertrophy).
  • thalach: Max Heart Rate Achieved.
  • exang: Exercise Induced Angina (1 = yes, 0 = no).
  • oldpeak: ST Depression Induced by Exercise Relative to Rest.
  • slope: Slope of the Peak Exercise ST Segment (0 = upsloping, 1 = flat, 2 = downsloping).
  • ca: Number of Major Vessels (0-3) Colored by Fluoroscopy.
  • thal: Thalassemia (1,3 = normal, 6 = fixed defect, 7 = reversible defect).
  • target: Presence of Heart Disease (1 = yes, 0 = no).

Getting Started

If you want to run this project locally or contribute to it, follow these steps:

  1. Clone the repository:

    git clone https://github.com/your-username/your-repo-name.git
  2. Install the required dependencies:

    pip install -r requirements.txt
  3. Get the dataset from the UCI Machine Learning Repository and place it in the appropriate directory.

  4. Follow the project steps outlined in the code to build and evaluate the heart disease prediction model.

Usage

Provide instructions on how to use your project, including code examples for making predictions using the trained model.

# Example code for making predictions
import pandas as pd
from your_module import load_and_preprocess_data, predict_heart_disease

# Load and preprocess new data
new_data = pd.read_csv('new_data.csv')
processed_data = load_and_preprocess_data(new_data)

# Make predictions
predictions = predict_heart_disease(processed_data)

# Display the predictions
print(predictions)

Contributing

We welcome contributions! If you'd like to contribute to this project.

Acknowledgments

We would like to thank the UCI Machine Learning Repository for providing the Heart Disease dataset, and all the contributors to open-source machine learning libraries that made this project possible.

About

Developed a machine learning model using the Cleveland Heart Disease dataset to accurately predict heart disease presence in individuals based on 14 medical attributes. Conducted comprehensive data exploration, visualization, model selection, training, hyperparameter tuning, and evaluation. Identified crucial features to aid diagnosis and treatment

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published