joelvarma/Exploratory-Data-Analysis

Exploratory Data Analysis

EDA is simply the practice of exploring data, by whatever means are available, to draw sound insights and gain a better understanding of it. Data science involves a myriad of EDA techniques; in this post we will discuss the most commonly used ones.

Libraries we will be using for EDA

Pandas

Pandas is a great library for data science. It provides high-level abstractions for loading, cleaning, and analysing data. See its documentation: https://pandas.pydata.org/pandas-docs/stable/
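As a rough illustration of what those abstractions look like in practice, here is a minimal sketch of a few common pandas EDA calls. The DataFrame and its column names (`species`, `length`, `width`) are made up for this example and are not part of this repo.

```python
import pandas as pd

# Toy data, invented purely for illustration (not from this repo).
df = pd.DataFrame({
    "species": ["a", "a", "b", "b", "b"],
    "length": [1.0, 2.0, 3.0, 4.0, 5.0],
    "width": [2.0, 4.0, 6.0, 8.0, 10.0],
})

# Univariate: summary statistics for each numeric column
summary = df.describe()

# Univariate: frequency of each category
counts = df["species"].value_counts()

# Bivariate: Pearson correlation between the numeric columns
corr = df[["length", "width"]].corr()

# Multivariate: a numeric column summarised per category
group_means = df.groupby("species")["length"].mean()

print(summary)
print(counts)
print(corr)
print(group_means)
```

On real data you would typically start from `pd.read_csv(...)` instead of building the DataFrame by hand.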

Seaborn and Matplotlib

Seaborn is another important package for visualizing data. It is built on top of Matplotlib, a Python plotting library whose interface is modeled on MATLAB's, and it provides one-line Python functions for statistical plots that take more work to produce with Matplotlib alone. Seaborn has great visualization tools, such as violin plots, for making better inferences from the data.

Why Violinplot

Violin plots show the 25th, 50th (median), and 75th percentiles of the data, plus an estimate of its probability density function (Khan Academy has an awesome explanation of what that is!).

Simple Violinplot:

[Image: simple violin plot]
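A simple violin plot can be sketched in a few lines of Seaborn. This is only an illustration: the data is randomly generated, and the column names `group` and `value` and the output file name `violinplot.png` are assumptions made for this example.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data, generated only for this example (not from this repo).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": ["A"] * 100 + ["B"] * 100,
    "value": np.concatenate([rng.normal(0, 1, 100), rng.normal(2, 0.5, 100)]),
})

# The 25th, 50th and 75th percentiles per group, i.e. the values
# that the box inside each violin marks.
quartiles = df.groupby("group")["value"].quantile([0.25, 0.5, 0.75]).unstack()
print(quartiles)

# inner="box" draws a small box plot inside each violin;
# the outline of the violin is a kernel density estimate.
ax = sns.violinplot(data=df, x="group", y="value", inner="box")
ax.set_title("Simple violin plot")
ax.figure.savefig("violinplot.png")
```

In an interactive session, `plt.show()` can be used instead of saving the figure to a file.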

Numpy

NumPy (Numerical Python) is a library for fast numerical computation on multi-dimensional data structures such as arrays and matrices.
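To give a feel for this, the sketch below runs a few matrix-style computations that often come up during EDA. The array `X` is a made-up example with one row per observation and one column per feature.

```python
import numpy as np

# Made-up data: 4 observations (rows) of 2 features (columns).
X = np.array([
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 6.0],
    [4.0, 8.0],
])

# Column-wise mean and standard deviation
col_means = X.mean(axis=0)
col_stds = X.std(axis=0)

# Standardise every column at once; broadcasting applies the
# per-column mean and std to each row without an explicit loop.
Z = (X - col_means) / col_stds

# Sample covariance matrix of the features (columns are variables)
cov = np.cov(X, rowvar=False)

# 25th, 50th and 75th percentiles of the first feature
q25, q50, q75 = np.percentile(X[:, 0], [25, 50, 75])

print(col_means, col_stds)
print(Z)
print(cov)
print(q25, q50, q75)
```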

You're welcome to add any EDA techniques to this repo.

About

Exploratory Data Analysis - Univariate, Bivariate & Multivariate Analysis
