Applies Principal Component Analysis (PCA) for dimensionality reduction using Python, SQL, and Google BigQuery (GBQ).

Xin-Bu/Dram_shop_PCA

The Dram Shop Principal Component Analysis (PCA)

The-Dram-Shop

Image source

Introduction

This is a project from my Applied Data Analytics course, a core course in the Master of Science in Business Analytics (MSBA) program at the University of Montana. The purpose of this project is to better understand the purchasing behavior of the Dram Shop's customers by applying PCA for dimensionality reduction.

The Dram Shop is a local craft brewing company in Missoula, MT, with two locations: the Dram Shop Downtown and the Dram Shop Central. It offers more than three dozen regional craft beers and wines and is the first of its kind in the state.

The procedures

  • Write a SQL query in Google BigQuery (GBQ) that returns the beverage values with the top 1,000 gross sales totals.
  • Write a second SQL query, built on the previous one, that returns three columns: customer ID, beverage, and total sales.
  • Use the pandas pivot function to reshape the data into a 'wide' format, with customer ID as the rows and beverages as the items (columns).
  • Use the PCA class from sklearn.decomposition to perform a PCA on this dataset.
  • Plot the amount of variation explained by each of the first 20 components.
  • For each of the first 6 components, print the 15 items with the largest loadings in absolute value.
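
The Python steps above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the data is synthetic, the column names `customer_id`, `beverage`, and `total_sales` are assumed stand-ins for the GBQ query output, the helper `top_loadings` is hypothetical, and filling missing customer/beverage pairs with 0 is an assumption about how unpurchased items are treated.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Synthetic stand-in for the GBQ query result in long format.
# Column names and values are assumptions, not the project's real data.
rng = np.random.default_rng(0)
n_customers, n_beverages = 200, 30
long_df = pd.DataFrame({
    "customer_id": np.repeat(np.arange(n_customers), n_beverages),
    "beverage": np.tile([f"beverage_{i:02d}" for i in range(n_beverages)], n_customers),
    "total_sales": rng.gamma(shape=2.0, scale=5.0, size=n_customers * n_beverages).round(2),
})
# Drop some rows so that not every customer has bought every beverage.
long_df = long_df.sample(frac=0.6, random_state=0)

# Pivot to wide format: one row per customer, one column per beverage.
# Assumption: a missing customer/beverage pair means zero sales.
wide = long_df.pivot(index="customer_id", columns="beverage", values="total_sales").fillna(0.0)

# Fit PCA and keep the first 20 components.
n_components = 20
pca = PCA(n_components=n_components)
pca.fit(wide)

# Share of total variance explained by each component.
explained = pd.Series(
    pca.explained_variance_ratio_,
    index=range(1, n_components + 1),
    name="explained_variance_ratio",
)
print(explained.round(4))


def top_loadings(fitted_pca, columns, component, k=15):
    """Return the k loadings of one component with the largest absolute value."""
    loadings = pd.Series(fitted_pca.components_[component], index=columns)
    order = loadings.abs().sort_values(ascending=False).index[:k]
    return loadings.loc[order]


# Print the 15 largest loadings (by absolute value) for the first 6 components.
for i in range(6):
    print(f"\nComponent {i + 1}")
    print(top_loadings(pca, wide.columns, i).round(3))
```

If matplotlib is installed, `explained.plot(kind="bar")` is one way to draw the explained-variance plot from the `explained` series.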

Data source

The Dram Shop

Reference

The Missoula Underground
