Yacine87/EDA_R_Packages
EDA_R_Packages

EDA (exploratory data analysis) is a must-do step in the data science workflow. Wrangling and transforming data is time-consuming, and it determines how successful the next steps are (preprocessing, modelling, communicating outputs & decision making). This repo shows how to perform EDA in R using the tidyverse ecosystem, and compares the main R packages that let you perform automated EDA and generate automated EDA reports in HTML or PDF, ready to be communicated.
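As a rough illustration of what manual EDA with the tidyverse can look like, here is a minimal sketch. It assumes the dplyr and ggplot2 packages are installed and uses the built-in `iris` dataset purely as a stand-in; the dataset and the name `species_summary` are illustrative choices, not taken from this repo.

```r
library(dplyr)
library(ggplot2)

# Built-in example data, used here only as a placeholder dataset
data(iris)

# Column types and a preview of the values
glimpse(iris)

# Count missing values in every column
iris %>%
  summarise(across(everything(), ~ sum(is.na(.x)))) %>%
  print()

# Per-group descriptive statistics for one numeric variable
species_summary <- iris %>%
  group_by(Species) %>%
  summarise(
    n = n(),
    mean_sepal_length = mean(Sepal.Length),
    sd_sepal_length = sd(Sepal.Length),
    .groups = "drop"
  )
print(species_summary)

# Distribution of the same variable, one overlaid histogram per group
p <- ggplot(iris, aes(x = Sepal.Length, fill = Species)) +
  geom_histogram(bins = 20, alpha = 0.6, position = "identity")
print(p)
```

Each step here is written by hand, which is the part the automated EDA packages aim to shorten.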

Scope of work

Here we compare the best-known R packages (to my knowledge) dedicated to EDA & automated EDA.

Here is a non-exhaustive list:

The tidyverse: the best-known and most influential ecosystem (collection of packages) in R, covering the whole DS workflow. # see https://www.tidyverse.org/

Note that the other packages in this list depend on tidyverse packages such as dplyr and ggplot2.

SmartEDA # see https://github.com/daya6489/SmartEDA

dlookr # see https://github.com/choonghyunryu/dlookr

DataExplorer # see https://cran.r-project.org/web/packages/DataExplorer/vignettes/dataexplorer-intro.html

Hmisc # see https://hbiostat.org/R/Hmisc/

exploreR # see https://cran.r-project.org/web/packages/exploreR/index.html

RtutoR # see https://cran.r-project.org/web/packages/RtutoR/index.html

summarytools # see https://cran.r-project.org/web/packages/summarytools/vignettes/Introduction.html
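To give a feel for how some of the packages listed above are called, here is a hedged sketch that runs one overview function from each of four of them on the built-in `airquality` dataset. The dataset and the names `df` and `n_missing` are illustrative assumptions, not part of this repo. Each call is guarded so that the script still runs when a package is not installed.

```r
# Built-in dataset that contains missing values; a placeholder for your own data
df <- airquality

# Base R baseline: missing values per column
n_missing <- colSums(is.na(df))
print(n_missing)

# DataExplorer: one-row overview of rows, columns and missingness
if (requireNamespace("DataExplorer", quietly = TRUE)) {
  print(DataExplorer::introduce(df))
  # A full HTML report can be generated with create_report();
  # it needs rmarkdown and pandoc, so it is left commented out here.
  # DataExplorer::create_report(df, output_file = "eda_report.html")
}

# summarytools: per-variable summary table
if (requireNamespace("summarytools", quietly = TRUE)) {
  print(summarytools::dfSummary(df))
}

# Hmisc: per-variable descriptive statistics
if (requireNamespace("Hmisc", quietly = TRUE)) {
  print(Hmisc::describe(df))
}

# SmartEDA: overall structure of the data (type = 1)
if (requireNamespace("SmartEDA", quietly = TRUE)) {
  print(SmartEDA::ExpData(data = df, type = 1))
}
```

Running the same dataset through each package side by side is one simple way to compare what each overview reports.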

Packages installation

On Windows, to install SmartEDA, dlookr, etc. successfully, you must first install Rtools version 4.0 from https://cran.r-project.org/bin/windows/Rtools/
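The installation itself can be sketched as follows. This is a minimal sketch under assumptions, not the repo's own script: the helper names `missing_packages` and `install_missing` are hypothetical, the package names come from the list above, and some of them may not be available on CRAN for every R version.

```r
# Packages named in the list above; CRAN availability may vary by R version
eda_packages <- c("tidyverse", "SmartEDA", "dlookr", "DataExplorer",
                  "Hmisc", "exploreR", "RtutoR", "summarytools")

# Hypothetical helper: return the subset of `pkgs` that is not installed yet
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install only what is missing, from the default repository
install_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}
```

Calling `install_missing(eda_packages)` would then install only the packages that are not already present.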
