kernelmethod/Python-Data-Science-Workshop

Python Data Science Workshop

Jupyter Notebooks and code for the Spring 2019 Python Data Science Workshop at CU Boulder. Sponsored by the Laboratory for Interdisciplinary Statistical Analysis (LISA).

Courses

This repository contains Jupyter Notebooks for the following short courses:

Python Data Wrangling

This course teaches some of the building blocks for handling data in Python. We show how to parse and manipulate data using NumPy and pandas, and how to create interactive visualizations with Plotly. We then analyze a real dataset using SciPy, demonstrating how to perform basic statistical tasks in Python.

Notebook: 1. Introduction to Data Science in Python.ipynb


Intro to Machine Learning in Python with Scikit-learn

This course introduces the Scikit-learn library for doing machine learning in Python. Students will start by learning about support vector machines, and gradually explore how Scikit-learn allows you to build a full machine learning pipeline, from feature extraction all the way through to prediction.

Notebook: 2. Intro to Machine Learning in Python with Scikit-learn.ipynb


Building Neural Networks with Keras

Neural networks are becoming an increasingly important tool in machine learning. In this short course, we demonstrate how to rapidly prototype an artificial neural network (ANN) in Python using the Keras library. We briefly introduce ANNs, including important variations like convolutional networks. Using Keras, students will build their own networks for some basic machine learning problems.

Notebook: 3. Building Neural Networks With Keras.ipynb

Workshop pre-requisites

For the first workshop, you will need some basic knowledge of Python. It would also help to have at least a little familiarity with Jupyter Notebook (e.g. you should know how to run code cells).

For the second and third workshops, you will also need to understand NumPy (which is taught in the first workshop). Knowing Matplotlib is a plus but not required.

Required software

In order to participate in this workshop, you have to be able to run a Jupyter Notebook in your web browser. Unless you are using mybinder, you will also need to install some libraries. Below I have provided a few different methods for running these notebooks and installing the required libraries:

  1. Using mybinder
  2. Using Anaconda
  3. Using pip
  4. With Docker
  5. Through CU Boulder's JupyterHub server
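Once you have set up a local environment with one of these methods, a quick check like the sketch below can confirm that the main libraries are importable. The module list is an assumption based on the libraries named in the course descriptions; requirements.txt and spec-file.txt remain the authoritative lists, and the helper name missing_modules is only illustrative.

```python
import importlib.util

# Import names assumed from the libraries mentioned in the course descriptions;
# the authoritative lists live in requirements.txt and spec-file.txt.
ASSUMED_MODULES = ["numpy", "pandas", "scipy", "plotly", "sklearn", "keras"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All assumed modules were found.")
```

Run it with the same Python interpreter (or Jupyter kernel) that you plan to use for the notebooks, since each environment has its own set of installed packages.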

mybinder

mybinder is an online service that will create an environment for you to run the Jupyter Notebook, and then allow you to run that notebook in your browser. Note: mybinder may take a little while to load.

To use mybinder, click the link below that will direct you to the workshop (1, 2, or 3) that you want.


Anaconda

First, if you haven't already, install Anaconda on your computer. Then, from your terminal, run

conda install --file spec-file.txt

while cd'd into this directory. Alternatively, you can create a standalone Anaconda environment with all of the dependencies you need by running

conda create --name workshop --file spec-file.txt

This creates an Anaconda environment called workshop, which you can then activate by running conda activate workshop.


Using pip

Install Jupyter Notebook by running pip3 install jupyter at the terminal, and then run

pip3 install -r requirements.txt

Alternatively, if you're a fan of Python virtual environments, you can create a virtual environment for these notebooks (for example, with python3 -m venv) and install the requirements from requirements.txt there. Then, with the environment activated, you can register it as a kernel for Jupyter by running the following snippet (credit: tschundler on Stack Overflow):

pip install ipykernel
python -m ipykernel install --user --name workshop-kernel

With Docker

If you have Docker installed on your computer, you can pull a container image with Jupyter, this workshop's notebooks, and all of their dependencies:

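docker pull wshand/python-data-science-workshop

Execute docker run -it -p 8888:8888 wshand/python-data-science-workshop to start the notebook. The -p 8888:8888 flag belongs to docker run (docker pull does not accept it) and publishes the container's port 8888 on your machine. The printout to your terminal will provide you with a token with which you can log into the Jupyter server at 127.0.0.1:8888.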
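If you would rather pull the token out of the terminal output programmatically, the following is a minimal sketch. It assumes the Jupyter server prints a login URL containing a ?token=... query parameter; the sample line and the helper name extract_token are hypothetical, and the exact log format may differ between Jupyter versions.

```python
import re
from urllib.parse import urlparse, parse_qs


def extract_token(log_text):
    """Return the first token= query value found in a URL in log_text, or None."""
    for url in re.findall(r"https?://\S+", log_text):
        token_values = parse_qs(urlparse(url).query).get("token")
        if token_values:
            return token_values[0]
    return None


# Hypothetical log line, made up for illustration only.
sample_log = "    or http://127.0.0.1:8888/?token=0123abcd"
print(extract_token(sample_log))
```

In most cases it is simpler to copy the full URL from the terminal into your browser; the sketch is only useful if you are scripting the startup.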


CU Boulder JupyterHub server

Start by setting up a CU research computing account. Once that's complete, follow the research computing center's instructions for using JupyterHub to run a Jupyter Notebook on CU's network.
