deepaksithu/Investigate_a_Dataset_Project

Investigate a Dataset Project

This is a project for the Udacity Data Analyst Nanodegree. It explores a Kaggle dataset of doctor's appointments in Brazil. The project examines which factors affect the chance that a patient keeps or misses an appointment, focusing on three independent variables: patient age, patient disability status, and whether the patient received a text message reminder for the appointment.

Prerequisites

This code depends on the following libraries:
1. datetime
2. numpy
3. pandas
4. matplotlib.pyplot
5. seaborn

In addition to these, the Jupyter Notebook assumes that the Kaggle data has been downloaded, extracted, and saved as 'noshowappointments-kagglev2-may-2016.csv'.
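As a rough illustration of the setup described above, the snippet below sketches one way to load the file with pandas. The helper name `load_appointments` is hypothetical, and the column names `ScheduledDay` and `AppointmentDay` are assumptions about the Kaggle file rather than something this README states; the loop skips them if they are absent.

```python
import pandas as pd

# File name taken from the Prerequisites section of this README.
CSV_PATH = "noshowappointments-kagglev2-may-2016.csv"


def load_appointments(source=CSV_PATH):
    """Read the appointments CSV from a path or a file-like object."""
    df = pd.read_csv(source)
    # Assumed column names: convert the two timestamp columns when present.
    for col in ("ScheduledDay", "AppointmentDay"):
        if col in df.columns:
            df[col] = pd.to_datetime(df[col])
    return df
```

Calling `load_appointments()` with no arguments reads the file from the current working directory, so the notebook and the CSV are expected to sit side by side.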

Project Structure

Introduction
Data Wrangling

  • General Data Overview
  • Data Cleaning

Data Analysis

  • General
  • Patient Age
  • Disability Status
  • Text Reminders

Conclusions
Further Research
Limitations
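
The Data Analysis steps in the outline above each compare missed appointments across one variable. A minimal sketch of that comparison follows; it is not the notebook's actual code. The function name `no_show_rate` is hypothetical, and it assumes the outcome column is named `No-show` with the value `"Yes"` marking a missed appointment, and that the grouping column (for example `SMS_received`) exists in the data.

```python
import pandas as pd


def no_show_rate(df, by, target="No-show"):
    """Return the fraction of missed appointments within each group of `by`."""
    # Assumption: "Yes" in the target column means the patient did not show up.
    missed = df[target].eq("Yes")
    # The mean of a boolean series is the share of True values per group.
    return missed.groupby(df[by]).mean()
```

For example, `no_show_rate(df, "SMS_received")` would give one rate for patients who received a text reminder and one for those who did not. A difference between the two rates describes the data but does not by itself show that reminders cause the difference.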

To-do

  • improve visualizations
  • clean up and review
