The Kaggle dataset used in this project is available at: https://www.kaggle.com/uciml/pima-indians-diabetes-database
The project aims to predict whether a person will develop diabetes from several diagnostic measurements. It uses three machine learning classification algorithms:
- Logistic regression
- Decision tree
- Random forest
The project applies several techniques common in machine learning work:
- Exploratory data analysis (EDA)
- Visualizations
- Feature engineering
- Splitting the dataset into training and test sets
- Standardization
- Principal component analysis (PCA)
- Pipelines
- Cross-validation
- Hyperparameter tuning
- Model assessment
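
As a rough illustration of how several of the techniques listed above can fit together, here is a minimal sketch using scikit-learn. It is not the project's actual code. The data is a synthetic stand-in generated with `make_classification` (sized like the Pima dataset, 768 rows and 8 features), and the parameter grids, the number of PCA components, and the names `candidates` and `results` are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in data, NOT the real Kaggle dataset.
# Replace with the real features and target once the CSV is loaded.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, n_redundant=2, random_state=0
)

# Hold out a stratified test set for the final model assessment.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The three classifiers, each with a small illustrative hyperparameter grid.
candidates = {
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"clf__C": [0.1, 1.0, 10.0]},
    ),
    "decision_tree": (
        DecisionTreeClassifier(random_state=0),
        {"clf__max_depth": [3, 5, None]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"clf__n_estimators": [50, 100]},
    ),
}

results = {}
for name, (clf, grid) in candidates.items():
    # Pipeline: standardize, reduce dimensionality with PCA, then classify.
    pipe = Pipeline(
        [
            ("scale", StandardScaler()),
            ("pca", PCA(n_components=5)),  # 5 components is an arbitrary choice
            ("clf", clf),
        ]
    )
    # Hyperparameter tuning with 5-fold cross-validation on the training set.
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    # Assess the refitted best model on the held-out test set.
    results[name] = search.score(X_test, y_test)

for name, acc in results.items():
    print(f"{name}: test accuracy = {acc:.3f}")
```

Putting the scaler and PCA inside the pipeline means they are refit on each cross-validation training fold, so no information from the validation folds or the test set leaks into preprocessing.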