This repository contains materials for a 55-minute workshop at the 2019 DLF Forum in Tampa, FL.
This is meant to be a fun and beginner-friendly introduction to a few useful Python tools, in the context of exploring and manipulating tabular metadata for digital collections. We will focus on a few basic functions of Python's pandas data analysis library for exploring, filtering, reshaping, and merging datasets.
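As a rough preview of the kinds of pandas operations mentioned above, here is a minimal sketch of exploring, filtering, and reshaping a small table. The records, column names, and values are hypothetical and made up for illustration; they are not taken from the workshop datasets, and the workshop notebook may approach these steps differently.

```python
import pandas as pd

# Hypothetical metadata records; column names and values are illustrative only.
records = pd.DataFrame(
    {
        "identifier": ["img001", "img002", "img003", "img004"],
        "title": ["Harbor view", "Main street", "Harbor at dusk", "City hall"],
        "date": ["1921", "1935", None, "1940"],
        "subject": ["harbors; boats", "streets", "harbors", "buildings; government"],
    }
)

# Explore: table dimensions and the number of missing values in each column.
n_rows, n_cols = records.shape
missing_per_column = records.isna().sum()

# Filter: keep only the rows whose subject field mentions "harbors".
harbor_records = records[records["subject"].str.contains("harbors")]

# Reshape: split multi-valued subject fields so each value gets its own row.
subjects_long = (
    records.assign(subject=records["subject"].str.split("; "))
    .explode("subject")
    .reset_index(drop=True)
)

print(n_rows, n_cols)
print(missing_per_column)
print(harbor_records[["identifier", "title"]])
print(subjects_long[["identifier", "subject"]])
```

Splitting on `"; "` assumes that multi-valued fields use that delimiter; real metadata exports often vary, so check the delimiter in your own files first.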
Notebook file location in this repository: notebook_exercises/py_workshop_notebook.ipynb.
The notebook contains all exercises and code for the workshop.
Binder location for notebook from GitHub:
Note that Binder can be slow: the notebook may take 2-8 minutes to load after you open this URL.
Alternatively, if you have the Jupyter Notebook software installed on your own computer, you can also download the repository and run the notebook locally.
We will walk through three brief examples with digital collections metadata files. After each example there will be an exercise you can try out on your own.
activity | time |
---|---|
Intro/how to use Jupyter Notebook | 10 mins |
Example 1: Explore a dataset | 12 mins |
Example 2: Compare a group of metadata files | 12 mins |
Example 3: Merge info from separate files | 12 mins |
Wrap-up: review resources, options for installing/running Python | 5 mins |
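To give a sense of what Examples 2 and 3 in the schedule above involve, here is a minimal sketch of comparing the columns in a group of metadata files and then merging in information from a separate file. The file contents, column names, and the `source_file` label are hypothetical; in-memory strings stand in for CSV files on disk so the snippet runs on its own. This is one possible approach, not necessarily the one used in the workshop notebook.

```python
import io

import pandas as pd

# Hypothetical stand-ins for CSV files on disk (contents are illustrative only).
collection_a = io.StringIO(
    "identifier,title,date\na1,Harbor view,1921\na2,Main street,1935\n"
)
collection_b = io.StringIO("identifier,title,creator\nb1,City hall,Unknown\n")
rights_file = io.StringIO("identifier,rights\na1,Public domain\nb1,In copyright\n")

# Read every file as text so identifiers and dates are not converted to numbers.
frames = {
    "collection_a": pd.read_csv(collection_a, dtype=str),
    "collection_b": pd.read_csv(collection_b, dtype=str),
}

# Compare: which columns does each file have, and which are shared by all?
columns_by_file = {name: set(df.columns) for name, df in frames.items()}
shared_columns = set.intersection(*columns_by_file.values())

# Combine the files into one table, recording where each row came from.
# Columns missing from a given file are filled with NaN.
combined = pd.concat(
    [df.assign(source_file=name) for name, df in frames.items()],
    ignore_index=True,
)

# Merge: add rights statements from a separate file, matching on identifier.
# A left merge keeps every record even when no rights statement is found.
rights = pd.read_csv(rights_file, dtype=str)
merged = combined.merge(rights, on="identifier", how="left")

print(sorted(shared_columns))
print(merged)
```

A left merge is used here so that records without a match in the second file stay in the result with an empty `rights` value, which makes gaps easy to spot.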
- nbviewer.jupyter.org provides a viewer that renders notebook code and markdown as a static HTML page.
- The static version of the notebook shows all code executed, so you can read the code and see the outputs.
View this Jupyter Notebook in nbviewer: py_workshop_notebook
See nbviewer's FAQ page for more information about nbviewer.jupyter.org.
This repository also contains static versions of the notebook:
- notebook_markdown/py_workshop.md : best option for viewing rendered notebook within repository
- notebook_exercises/py_workshop.ipynb : notebook rendering is sometimes unreliable in GitLab and often fails in GitHub, but this file may also work for direct viewing