DishaMukherjee/Investigate-a-Dataset

Project II: Investigate a Dataset

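Table of Contents

  • Installation
  • Project Motivation
  • Project Overview
  • Project Outcome
  • Licensing, Authors, Acknowledgements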

Installation

You need to be able to work in a Jupyter Notebook on your computer. The project uses the packages (libraries) listed below. pandas, Matplotlib, and NumPy can be installed via conda or pip; csv is part of the Python standard library and needs no separate installation.

  • pandas
  • Matplotlib
  • NumPy
  • csv
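
Before opening the notebook, it can help to confirm that each package is visible to the Python interpreter. The snippet below is a minimal sketch of such a check; the helper name `check_packages` is hypothetical and not part of this project.

```python
import importlib.util

# Import names of the packages listed above; csv ships with Python itself.
REQUIRED = ["pandas", "matplotlib", "numpy", "csv"]


def check_packages(names):
    """Return a dict mapping each import name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_packages(REQUIRED)
for name, found in status.items():
    print(f"{name}: {'ok' if found else 'missing'}")
```

Any package reported as missing can then be installed with conda or pip before running the notebook.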

Project Motivation

In this project, we go through the data analysis process and see how everything fits together. I also used the Python libraries NumPy, pandas, and Matplotlib, which make writing data analysis code in Python a lot easier!

Project Overview

In this project, we analyze a dataset and then communicate our findings about it. We use the Python libraries NumPy, pandas, and Matplotlib to make the analysis easier.

Project Outcome

After completing the project, I have learned to:

  • Know all the steps involved in a typical data analysis process
  • Be comfortable posing questions that can be answered with a given dataset and then answering those questions
  • Know how to investigate problems in a dataset and wrangle the data into a format I can use
  • Have practice communicating the results of my analysis
  • Be able to use vectorized operations in NumPy and pandas to speed up my data analysis code
  • Be familiar with pandas Series and DataFrame objects, which make it more convenient to access the data
  • Last but not least, know how to use Matplotlib and Seaborn to produce plots showing findings
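
Several of the skills above can be illustrated with a short sketch. It assumes a tiny made-up table: the column names `title`, `budget`, and `revenue` and all values are hypothetical and are not taken from the project's dataset. It shows one wrangling step followed by a vectorized calculation on a DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the real project dataset is not reproduced here.
df = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "budget": [10.0, 0.0, 25.0, 40.0],
    "revenue": [30.0, 5.0, np.nan, 20.0],
})

# Wrangling: treat a zero budget as missing, then drop incomplete rows.
df["budget"] = df["budget"].replace(0.0, np.nan)
clean = df.dropna(subset=["budget", "revenue"]).copy()

# Vectorized operation: compute profit for every row at once, no Python loop.
clean["profit"] = clean["revenue"] - clean["budget"]

# A single column is a Series; summary statistics are one method call away.
mean_profit = clean["profit"].mean()
print(clean)
print("Mean profit:", mean_profit)
```

From here, a call such as `clean["profit"].plot(kind="hist")` would draw a histogram through Matplotlib, which is one way to present a finding visually.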

Licensing, Authors, Acknowledgements

Credit goes to Kaggle for the data. Licensing for the data and other descriptive information can be found on the Udacity webpage.