Build the best model for predicting heart attacks in patients using machine learning. Three types of models are used: Logistic Regression, Support Vector Machine, and Decision Tree. Their accuracy and F1-scores are compared to determine the best model.
- Machine Learning
- Logistic Regression
- Support Vector Machines
- Decision Tree
- Python
- scikit-learn
- matplotlib
- pandas
- numpy
- seaborn
- Check whether the data is ready to use by looking for
- Missing values
- Outliers
- Perform feature selection using Pearson correlation, that is, by comparing each variable's correlation with the target variable, `output`
- Visualize the correlation between variables with a heat map
- Calculate the mean correlation and keep the features whose correlation is greater than the mean
- Extract these features into data X and the target variable as data Y
- Split the data into training data and testing data with a ratio of 8:2
- Use **Logistic Regression, Decision Tree, and Support Vector Machine** models
- Choose the best parameters for each model through hyperparameter tuning
- Print reports from the Logistic Regression model
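
The steps above (data checks, correlation-based feature selection, the 8:2 split, and hyperparameter tuning) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the synthetic data, the column names `feat_a`, `feat_b`, and `noise`, the 1.5 × IQR outlier rule, the use of absolute correlation values, and the `C` grid are all assumptions. Only the target column name `output` comes from the description above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split


def check_data(df):
    """Count missing values and outliers per column (assumed 1.5 * IQR rule)."""
    q1, q3 = df.quantile(0.25), df.quantile(0.75)
    iqr = q3 - q1
    outliers = ((df < q1 - 1.5 * iqr) | (df > q3 + 1.5 * iqr)).sum()
    return pd.DataFrame({"missing": df.isna().sum(), "outliers": outliers})


def select_features(df, target="output"):
    """Keep features whose |Pearson r| with the target exceeds the mean |r|."""
    corr = df.corr(method="pearson")[target].drop(target).abs()
    return corr[corr > corr.mean()].index.tolist()


# Synthetic stand-in for the heart attack dataset (hypothetical columns).
rng = np.random.default_rng(0)
n = 300
signal = rng.normal(size=n)
df = pd.DataFrame({
    "feat_a": signal + rng.normal(scale=0.5, size=n),
    "feat_b": -signal + rng.normal(scale=0.5, size=n),
    "noise": rng.normal(size=n),
    "output": (signal > 0).astype(int),
})

report = check_data(df)
selected = select_features(df)

# Extract the selected features as X and the target as y, then split 8:2.
X, y = df[selected], df["output"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Tune the regularization strength C with a small grid search.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10]},
    cv=5,
    scoring="f1",
)
search.fit(X_train, y_train)

print(report)
print("Selected features:", selected)
print("Best parameters:", search.best_params_)
print(classification_report(y_test, search.predict(X_test)))
```

The same `GridSearchCV` pattern would apply to `DecisionTreeClassifier` and `SVC` with their own parameter grids. The correlation heat map could be drawn with `seaborn.heatmap(df.corr(), annot=True)`.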
Conclusions are drawn by comparing the train and test scores obtained for each model:
- The Logistic Regression model has the highest test score, but it differs significantly from the train score, which suggests possible overfitting.
- The Decision Tree model produces a train score of 1, which suggests overfitting.
- The SVM model has a fairly high test score that is close to its train score, so the likelihood of overfitting is low.

So, the best model for this dataset is the Support Vector Machine.
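
The train-versus-test comparison behind these conclusions can be sketched as below. This is an illustration under assumptions: it uses synthetic data from `make_classification` rather than the heart attack dataset and default model settings rather than the tuned ones, so the numbers it prints will not match the project's results.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for the real dataset.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    results[name] = {
        "train": train_acc,
        "test": test_acc,
        # A large positive gap between train and test hints at overfitting.
        "gap": train_acc - test_acc,
        "f1": f1_score(y_test, model.predict(X_test)),
    }
    print(f"{name}: train={train_acc:.3f} test={test_acc:.3f} "
          f"gap={train_acc - test_acc:.3f} f1={results[name]['f1']:.3f}")
```

An unpruned decision tree typically reaches a train score of 1 because it keeps splitting until every training sample is classified correctly, which is why a perfect train score alone is read here as a warning sign rather than a strength.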