The "US Medical Insurance Costs" project explores and analyzes a dataset containing medical insurance costs for patients in the United States. The project was completed as part of the Codecademy Data Science Career Path.

haissaoui/Portfolio-Project-US-Medical-Insurance-Costs

Portfolio-Project-US-Medical-Insurance-Costs

This is a portfolio project for the Codecademy Data Science Career Path. The project aims to analyze a dataset of US medical insurance costs.

The project can be viewed on Kaggle here.

Tools and Techniques

This project was completed using Python and the following libraries:

  • Pandas
  • Matplotlib

Sections:

Importing and exploring the insurance dataset

The first section of the project imports the dataset, inspects the data type of each column, and computes basic summary statistics for each column. Because the working directory differs between Kaggle and GitHub, the code wraps the file import in a try-except block so that it runs in either environment.
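As a rough illustration of that loading pattern, here is a minimal sketch. The two paths and the helper names `load_insurance` and `explore` are hypothetical and not taken from the project; substitute whatever locations the notebook actually uses.

```python
import pandas as pd

# Hypothetical locations (assumptions): the project's real paths are not stated here.
KAGGLE_PATH = "/kaggle/input/insurance/insurance.csv"
LOCAL_PATH = "insurance.csv"


def load_insurance(paths=(KAGGLE_PATH, LOCAL_PATH)):
    """Return a DataFrame from the first candidate path that exists."""
    last_error = None
    for path in paths:
        try:
            return pd.read_csv(path)
        except FileNotFoundError as err:
            # This environment does not have the file here; try the next path.
            last_error = err
    raise FileNotFoundError(f"No dataset found in {paths}") from last_error


def explore(df):
    """Return the column data types and basic statistics for every column."""
    return df.dtypes, df.describe(include="all")
```

Calling `explore(load_insurance())` would then give the data types and summary statistics in one step, regardless of which environment supplied the file.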

Analysis of Insurance Dataset

The second section of the project analyzes the dataset using various visualizations to explore the relationships between the variables. This includes histograms, scatterplots, and bar charts.
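The three chart types mentioned above could be produced along these lines. This is only a sketch: the column names `age`, `charges`, and `region` are assumptions about the dataset, and `plot_overview` is a hypothetical helper, not the project's own code.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_overview(df):
    """Draw a histogram, a scatterplot, and a bar chart side by side."""
    fig, axes = plt.subplots(1, 3, figsize=(15, 4))

    # Histogram: how insurance charges are distributed.
    axes[0].hist(df["charges"], bins=20)
    axes[0].set_title("Distribution of charges")

    # Scatterplot: relationship between age and charges.
    axes[1].scatter(df["age"], df["charges"], alpha=0.5)
    axes[1].set_xlabel("age")
    axes[1].set_ylabel("charges")

    # Bar chart: mean charges per region.
    mean_by_region = df.groupby("region")["charges"].mean()
    axes[2].bar(list(mean_by_region.index), mean_by_region.values)
    axes[2].set_title("Mean charges by region")

    fig.tight_layout()
    return fig
```

In a notebook, the returned figure renders inline; elsewhere it can be written to disk with `fig.savefig("overview.png")`.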

Studying Correlations

The third section of the project studies the correlation between variables using the Pearson correlation coefficient, with the aim of finding which pairs of variables are most strongly correlated.
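One way to compute Pearson correlations with Pandas is sketched below. The helpers `pearson_matrix` and `strongest_pair` are hypothetical names, and restricting the calculation to numeric columns is an assumption about how categorical columns are handled.

```python
import pandas as pd


def pearson_matrix(df):
    """Return the Pearson correlation matrix of the numeric columns only."""
    return df.select_dtypes("number").corr(method="pearson")


def strongest_pair(corr):
    """Return the pair of distinct columns with the largest absolute correlation."""
    pairs = corr.abs().stack()
    first = pairs.index.get_level_values(0)
    second = pairs.index.get_level_values(1)
    # Drop the diagonal, where every column correlates perfectly with itself.
    pairs = pairs[first != second]
    return pairs.idxmax()
```

`strongest_pair(pearson_matrix(df))` returns a tuple of two column names. Because it ranks by absolute value, a strong negative correlation counts as much as a strong positive one.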

Potential Areas of Bias in the dataset

The final section of the project looks at potential areas of bias in the dataset. This includes looking at the distribution of individuals in different regions and the impact of smoking on insurance costs.
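Both checks can be expressed as short Pandas aggregations. The sketch below assumes columns named `region`, `smoker`, and `charges`; the function names are hypothetical.

```python
import pandas as pd


def region_shares(df):
    """Return the fraction of individuals in each region."""
    return df["region"].value_counts(normalize=True)


def mean_charges_by_smoker(df):
    """Return the mean insurance charge for each smoking status."""
    return df.groupby("smoker")["charges"].mean()
```

Markedly uneven region shares would suggest that some regions are over- or under-represented, and a large gap between the two group means would indicate how strongly smoking status is associated with cost in this sample.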

Feel free to check out the project on Kaggle and provide any feedback or suggestions!
