BahramJannesar/PimaIndiansDiabetsMachineLearning

Machine Learning on the Pima Diabetes dataset

Diabetes is a chronic condition in which the body either does not produce enough insulin or becomes resistant to it. Insulin is a hormone that helps cells absorb glucose from the blood. Diabetes affects many people worldwide and is normally divided into Type 1 and Type 2, which have different characteristics. This article analyzes the PIMA Indian Diabetes dataset and builds a model to predict whether a particular observation is at risk of developing diabetes, given the independent factors. It covers the methods followed to create a suitable model, including exploratory data analysis (EDA) along with the model itself.

Who are the Pima Indians

The Pima (/ˈpiːmə/), or Akimel O'odham (also spelled Akimel Oʼotham, "River People"), are a group of Native Americans living in an area consisting of what is now central and southern Arizona. (Source: Wikipedia)

Dataset

The dataset can be found on the Kaggle website. This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases and can be used to predict whether a patient has diabetes based on certain diagnostic factors.

Dataset columns

  • Pregnancies
  • Glucose
  • BloodPressure
  • SkinThickness
  • Insulin
  • BMI
  • DiabetesPedigreeFunction
  • Age
  • Outcome
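
As a minimal sketch of how these columns are typically used, `Outcome` can be treated as the label and the remaining columns as the independent factors. The function name `load_features_and_target` is hypothetical, and the rows below are made-up placeholders covering only a few of the columns, not records from the dataset.

```python
import csv
import io

TARGET = "Outcome"  # label column from the list above


def load_features_and_target(csv_file, target=TARGET):
    """Read a CSV and split each row into numeric features and the target label."""
    reader = csv.DictReader(csv_file)
    features, labels = [], []
    for row in reader:
        labels.append(int(row[target]))
        # Every column except the target is treated as an independent factor.
        features.append(
            {name: float(value) for name, value in row.items() if name != target}
        )
    return features, labels


# Placeholder rows for illustration only; these are NOT real records.
sample = io.StringIO(
    "Glucose,BMI,Age,Outcome\n"
    "120,30.5,45,1\n"
    "90,22.0,25,0\n"
)

features, labels = load_features_and_target(sample)
print(features[0])
print(labels)
```

With the real file, the `io.StringIO` object would be replaced by an opened CSV file downloaded from Kaggle.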

Clustering with the KMeans Algorithm

According to the correlation diagram, the best features for plotting and for showing how the algorithm works are Age, BMI, and Glucose.
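
To illustrate the idea, here is a minimal, dependency-free sketch of the k-means procedure (Lloyd's algorithm) applied to (Age, BMI, Glucose) triples. It is not the code used in this repository, which may rely on a library implementation such as scikit-learn's `KMeans`. The function name `kmeans`, the choice of two clusters, and the sample points are assumptions made for illustration; the points are invented placeholders, not dataset records.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as starting centroids
    labels = None
    for _ in range(n_iter):
        # Assignment step: index of the closest centroid (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Made-up (Age, BMI, Glucose) placeholders, NOT real records from the dataset.
points = [
    (25, 22.0, 85), (30, 24.5, 90), (28, 23.0, 95),
    (55, 35.0, 160), (60, 33.5, 170), (50, 36.0, 155),
]

labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

The features are left in their raw units here, so Glucose contributes most to the distance. Standardizing each feature before clustering is a common choice when the scales differ this much.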

Before clustering:

After clustering:

License

Released under the MIT License. See the full license for details.