bradleyboehmke/uc-bana-6043

UC BANA 6043 Statistical Computing

By Brad Boehmke 🚀

Welcome to Statistical Computing with Python! This course provides an intensive, hands-on introduction to statistical computing and data science with the Python programming language. You will gain foundational skills in managing data structures, wrangling data, computing and visualizing statistical relationships, working in the various environments suited to statistical analysis, and building machine learning models. Because the course only has time to introduce these foundational skills, it places much of its emphasis on giving you a mental model of Python's data science ecosystem, so you know how, when, and where to continue advancing your statistical computing capabilities.

Learning Objectives

Upon successfully completing this course, you will:

  • Have a mental model of the Python data science ecosystem: libraries, capabilities, vocabulary, and widely available Python resources.
  • Have the ability to use Python within both interactive (Jupyter, REPL) and non-interactive (scripts) environments.
  • Be able to perform core data wrangling activities: importing data, reshaping data, transforming data, and exporting data.
  • Be able to compute descriptive statistics and visualize key patterns and relationships in your data.
  • Be exposed to modeling via scikit-learn and be able to discuss the fundamentals of building models in Python.
  • Have the resources and understanding to continue advancing your statistical computing capabilities.
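
As a rough taste of the data wrangling objective, the sketch below walks through importing, reshaping, transforming, summarizing, and exporting a tiny dataset with pandas. It is not taken from the course material: the quiz data, the column names, and the assumption that each quiz is scored out of 10 are all made up for illustration.

```python
import io

import pandas as pd

# Hypothetical data: quiz scores in "wide" form, one column per quiz.
raw = io.StringIO(
    "student,quiz_1,quiz_2\n"
    "ana,8,9\n"
    "ben,6,7\n"
)

# Import: read CSV text into a DataFrame.
wide = pd.read_csv(raw)

# Reshape: one row per (student, quiz) pair.
long = wide.melt(id_vars="student", var_name="quiz", value_name="score")

# Transform: add a percentage column, assuming each quiz is out of 10.
long = long.assign(pct=long["score"] / 10 * 100)

# Summarize: mean score per student.
means = long.groupby("student")["score"].mean()

# Export: with no path argument, to_csv returns the CSV text as a string.
csv_text = long.to_csv(index=False)
```

In a real project the import and export steps would usually point at files on disk; the in-memory text is used here only to keep the example self-contained.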

Schedule

Module 1: Starting with the Basics
  • Introduction to JupyterLab and the notebook environment
  • Python fundamentals

Module 2: Python Data Science Ecosystem & DataFrames
  • Modules, packages, and a preview of Python's data science ecosystem
  • Importing data and working with DataFrames

Module 3: Data Wrangling Part 1
  • Subsetting and manipulating data
  • Computing summary statistics at different levels

Module 4: Data Wrangling Part 2
  • Tidying and joining data
  • Handling text data

Module 5: Data Visualization
  • Higher and lower level plotting APIs
  • Interactive visualizations

Module 6: Creating Efficient Code in Python
  • Control statements & iteration
  • Writing functions

Module 7: Intro to Machine Learning with Scikit-Learn
  • Basics of the Scikit-learn API
  • Feature engineering and model evaluation/selection
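
For a sense of what the final module's "basics of the Scikit-learn API" might look like in practice, here is a minimal sketch of the usual split, fit, and score pattern. The data is synthetic (an exact line, y = 2x + 1), and the choice of `LinearRegression` is an assumption made for illustration rather than something specified by the course.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical data: a single feature x and a target that is exactly 2*x + 1.
X = np.arange(20, dtype=float).reshape(-1, 1)
y = 2 * X.ravel() + 1

# Hold out a quarter of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Estimators share the same interface: construct, fit, then predict or score.
model = LinearRegression()
model.fit(X_train, y_train)

# For regressors, score() returns R^2 on the data passed in.
r2 = model.score(X_test, y_test)
```

Most scikit-learn estimators expose the same `fit` and `predict` (or `score`) methods, so a different model can usually be swapped in by changing only the line that constructs it.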

Getting Started

The primary course material is provided via this Jupyter Book resource 📕.