
booleanhunter-tech-blog/data-cleaning-tutorial


Data Cleaning: How the pros do the dirty work

This repository hosts the code and files for the tutorial "Data Cleaning: How the pros do the dirty work".

In this tutorial, you'll dive straight into the trenches of transforming messy, real-world data into something pristine and usable.

Instructions to install:

The easiest way to get started with the tutorial is to click this Binder link:

Binder

To run the tutorial locally instead:

  • Install Python 3.x and pip.
  • Create a virtual environment: python3 -m venv data-cleaning-tutorial-env
  • Activate the virtual environment: source data-cleaning-tutorial-env/bin/activate
  • Install the dependencies: pip install -r requirements.txt
  • Start JupyterLab from the project root: jupyter-lab
  • Open index.ipynb, follow the instructions, and run the code!
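
After a local install, a quick sanity check can confirm that the virtual environment is active and the dependencies are importable before opening the notebook. The sketch below is not part of the repository. The helper names and the package list are assumptions for illustration: the list is inferred from the tools this README mentions, and requirements.txt remains the authoritative source.

```python
import importlib.util
import sys

# Assumed package list: this README names numpy and pandas and launches
# jupyter-lab. The authoritative list is the repository's requirements.txt.
ASSUMED_PACKAGES = ["numpy", "pandas", "jupyterlab"]


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
    missing = missing_packages(ASSUMED_PACKAGES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All assumed packages are importable.")
```

Run it with the environment activated. If any package is reported as not installed, re-running pip install -r requirements.txt inside the activated environment is the likely fix.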

About

A tutorial on performing data cleaning using Python, NumPy, and pandas
