Skip to content

An intro workshop for getting MITx data from Google BigQuery, and performing basic data analysis with Pandas and visualization with Plotly

Notifications You must be signed in to change notification settings

royanin/dll_workshop_2019

Repository files navigation

DLL Workshop April 2019

This jupyter notebook has examples of a number of Pandas commands that I use often to process/prepare the MITx datasets for further analysis.

For visualization, I use Plotly, and you'll see some basic examples of them here.

The de-identified sample data is in the dropbox folder, if you have access to it.

The pre_workshop_instructions.pdf has instructions on how to get data from Google BigQuery, and setting up python environment in your computer, and links to other related repos and resources.

Downloading data from BigQuery

  1. You need a service account file (xxx.json) that you put in the same directory as the get_bq_data.ipynb notebook.
  2. Run the notebook. Adjust the directory parameters to tune where you want to download stuff.
  3. For large-scale downloading of files from GBQ to GCS, you may want to use the gsutil command-line feature (details inside the notebook).

About

An intro workshop for getting MITx data from Google BigQuery, and performing basic data analysis with Pandas and visualization with Plotly

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published