
loaiabdalslamdahy/Data-Analysis-Course

 
 


Quick Start

The workshop code is available as Jupyter notebooks. You can run the notebooks in the cloud (no installation required) by clicking the "launch binder" button:

[Binder](https://mybinder.org/v2/gh/loaiabdalslamdahy/Data-Analysis-Course/master?filepath=notebooks)

Why

For people who struggle to get started with data analysis in Python.

Description

This hands-on, in-person workshop is based on the Data Analysis with Python course by IBM Cognitive Class.

Learn how to analyze data using multi-dimensional arrays in NumPy and manipulate DataFrames in pandas in a Jupyter-based environment.
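As a rough illustration of the two libraries named above, the sketch below builds a small NumPy array and wraps it in a pandas DataFrame. The numbers and the column names (`a`, `b`, `total`) are invented for this example and do not come from the workshop notebooks.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements: rows are samples, columns are features.
data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# NumPy: column-wise mean, then subtract it from every row via broadcasting.
col_means = data.mean(axis=0)
centered = data - col_means

# pandas: label the columns, derive a new column, and filter rows.
df = pd.DataFrame(data, columns=["a", "b"])
df["total"] = df["a"] + df["b"]
large = df[df["total"] > 5]

print(col_means)
print(large)
```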

The workshop will cover core topics:

  • Understanding the Domain
  • Understanding the Dataset
  • Python packages for data science
  • Importing and Exporting Data in Python
  • Basic Insights from Datasets
  • Identify and Handle Missing Values
  • Data Formatting
  • Data Normalization
  • Binning
  • Indicator Variables
  • Descriptive Statistics
  • Basics of Grouping
  • ANOVA
  • Correlation
  • Simple and Multiple Linear Regression
  • Model Evaluation Using Visualization
  • Polynomial Regression and Pipelines
  • R-squared and MSE for In-Sample Evaluation
  • Prediction and Decision Making
  • Model Evaluation
  • Over-fitting, Under-fitting and Model Selection
  • Ridge Regression
  • Grid Search
  • Model Refinement
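To give a feel for the data-wrangling topics in the list above (missing values, normalization, binning, indicator variables), here is a minimal pandas sketch. The toy table, its column names (`price`, `fuel`), and the bin labels are assumptions made for illustration; the workshop notebooks may use different data and techniques.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; not taken from the workshop dataset.
df = pd.DataFrame({
    "price": [10000.0, np.nan, 20000.0, 30000.0],
    "fuel": ["gas", "diesel", "gas", "diesel"],
})

# Identify and handle missing values: count NaNs, then fill with the column mean.
n_missing = int(df["price"].isna().sum())
df["price"] = df["price"].fillna(df["price"].mean())

# Data normalization: min-max scaling to the range [0, 1].
price_min, price_max = df["price"].min(), df["price"].max()
df["price_scaled"] = (df["price"] - price_min) / (price_max - price_min)

# Binning: three equal-width bins with illustrative labels.
df["price_bin"] = pd.cut(df["price"], bins=3, labels=["low", "medium", "high"])

# Indicator variables: one column per fuel category.
dummies = pd.get_dummies(df["fuel"], prefix="fuel")
df = pd.concat([df, dummies], axis=1)

print(df)
```

Mean imputation and min-max scaling are only one option each; the same steps could use the median or z-score scaling instead.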

Prerequisite

Pre-workshop

You will need a laptop with internet access.

1: Installation

Install Miniconda or the (larger) Anaconda distribution.

Install Python 3.7 using Miniconda

OR install Python 3.7 using Anaconda

2: Setup

2.1: Download workshop code & materials

Clone the repository

git clone git@github.com:aymanibrahim/dapy.git

OR Download the repository as a .zip file

2.2: Change directory to dapy

Change the current directory to the dapy directory

cd dapy

2.3: Install Python with required packages

Install Python 3.7 with the required packages into an environment named dapy, as specified in the environment.yml file.

conda env create -f environment.yml

When conda asks if you want to proceed, type "y" and press Enter.

3: Activate environment

Switch from the default environment (base) to the dapy environment.

conda activate dapy
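If you want to confirm from inside Python that the activation took effect, a small check like the one below can help. It is a sketch that assumes conda exports the `CONDA_DEFAULT_ENV` environment variable on activation; the helper names `active_conda_env` and `is_expected_env` are hypothetical and not part of the workshop code.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if unset."""
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV")


def is_expected_env(expected="dapy", environ=None):
    """Return True when the active environment name matches `expected`."""
    return active_conda_env(environ) == expected


if __name__ == "__main__":
    print("Active conda environment:", active_conda_env())
    print("Interpreter prefix:", sys.prefix)
    print("Running in dapy:", is_expected_env("dapy"))
```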

4: Install & Enable ipywidgets extensions

Enable the ipywidgets Jupyter Notebook extension

jupyter contrib nbextension install --user
jupyter nbextension enable --py widgetsnbextension
jupyter nbextension enable python-markdown/main

# Notebooks w/ extensions that auto-run code must be "trusted" to work the first time
jupyter trust ./notebooks/05_Model_Evaluation.ipynb

Install the ipywidgets JupyterLab extension

jupyter labextension install @jupyter-widgets/jupyterlab-manager

Enable widgetsnbextension

jupyter nbextension enable --py widgetsnbextension --sys-prefix

5: Check installation

Use the check_environment.py script to make sure everything was installed correctly. Open a terminal and change its directory (cd) so that your working directory is the workshop directory dapy you cloned or downloaded. Then enter the following:

python check_environment.py

If everything is OK, you will get the following message:

Your workshop environment is set up
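The contents of check_environment.py are not shown here, so the following is only a sketch of how such a check could work: it looks up each required package with the import system and verifies the Python version. The package list in `REQUIRED` is an assumption for illustration; the authoritative list is whatever environment.yml specifies.

```python
import importlib.util
import sys

# Assumed package list for illustration; the real one lives in environment.yml.
REQUIRED = ["numpy", "pandas", "matplotlib"]


def missing_packages(names):
    """Return the top-level package names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_at_least(major, minor):
    """Return True when the running interpreter is at least major.minor."""
    return sys.version_info >= (major, minor)


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing or not python_is_at_least(3, 7):
        print("Problems found. Missing packages:", missing)
    else:
        print("Your workshop environment is set up")
```

`importlib.util.find_spec` only locates a package without importing it, so the check stays fast even for large libraries.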

6: Start JupyterLab

Start JupyterLab using:

jupyter lab

JupyterLab will open automatically in your browser.

If it does not, you can open JupyterLab by entering the notebook server’s URL, which is printed in the terminal, into the browser.

7: Stop JupyterLab

Press CTRL + C in the terminal to stop JupyterLab.

8: Deactivate environment

Switch from the current environment (dapy) back to the previous environment.

conda deactivate

Workshop Instructor

Ayman Ibrahim, PMP

References

Contributing

Thanks for your interest in contributing! There are many ways to contribute to this project. Get started here.

License

Workshop Code

License: MIT

Workshop Materials

Creative Commons License

Data Analysis with Python Workshop by Ayman Ibrahim and Loaii abdalslam is licensed under a Creative Commons Attribution 4.0 International License. Based on a work at IBM Cognitive Class Data Analysis with Python by Joseph Santarcangelo, PhD. and Mahdi Noorian, PhD.
