ipython notebooks for processing bunraku collection data @cul 🇯🇵 🎎

mnyrop/bunraku-ipy

bunraku-ipy

Jupyter notebooks, etc., for processing data from the Barbara Curtis Adachi Bunraku (Japanese Puppet Theater) Collection.

pipeline(s):

online collection data / bunraku-online.ipynb

Start: CakePHP site powered by a relational MySQL database
1. Dump the MySQL database to CSVs
2. Import the CSVs into IPython as pandas DataFrames
3. Merge relational data (from the CSV join tables) onto the DataFrames by type
4. Export the DataFrames as JSON records (and as CSVs, for archival purposes only)
5. Drop null key:value pairs from the JSON (jq, run from bash)
6. Convert the null-free JSON to YAML (PyYAML, run from bash)
7. Generate Jekyll collections (and pages) from the YAML using the Yaml-Splitter plugin
End: static Jekyll site powered by YAML data, with a JSON index for static search
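
As a rough illustration of steps 2 through 5, here is a minimal sketch in Python. It is not taken from bunraku-online.ipynb: the table names (`plays`, `authors`, `authors_plays`), the column names, and the helpers `merge_related` and `drop_nulls` are all hypothetical, and the null-dropping step is done in plain Python here rather than with jq as in the actual pipeline.

```python
import json

import pandas as pd

# Hypothetical tables standing in for the CSVs from the MySQL dump.
# In practice each one would be loaded with pd.read_csv("<table>.csv").
plays = pd.DataFrame({"id": [1, 2, 3], "title": ["Play A", "Play B", "Play C"]})
authors = pd.DataFrame({"id": [10, 11], "name": ["Author X", "Author Y"]})
# Hypothetical join table linking plays to authors (many-to-many).
authors_plays = pd.DataFrame({"play_id": [1, 1, 2], "author_id": [10, 11, 10]})


def merge_related(main, related, join, main_key, related_key, value, column):
    """Attach a list of related values to each row of `main` via a join table."""
    # Resolve the foreign key in the join table to the related rows.
    linked = join.merge(related, left_on=related_key, right_on="id")
    # Collect the related values into one list per main-table id.
    grouped = linked.groupby(main_key)[value].apply(list).rename(column)
    # Left join: rows with no related values get NaN in the new column.
    return main.join(grouped, on="id")


def drop_nulls(value):
    """Recursively remove key:value pairs whose value is null (None)."""
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_nulls(v) for v in value]
    return value


# Step 3: merge the relational data onto the main DataFrame.
plays_merged = merge_related(
    plays, authors, authors_plays, "play_id", "author_id", "name", "authors"
)

# Step 4: export as JSON records (one object per row; NaN becomes null).
records = json.loads(plays_merged.to_json(orient="records"))

# Step 5: drop the null key:value pairs.
cleaned = drop_nulls(records)
```

In this sketch "Play C" has no entry in the join table, so its `authors` key is null after the export and disappears after `drop_nulls`.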

total collection data / bunraku-full.ipynb

The data accessible on the original PHP site (and on the new Jekyll site) represents only about 60% of the information stored in the MySQL database. To preserve the rest for future use, I used a separate IPython notebook/pipeline that outputs CSVs and JSON without dropping images/media marked 'offline'.
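
The difference between the two pipelines can be sketched as a single filter that is either applied or skipped. This is only an illustration: the `media` table, the `status` column, and the `select_media` helper are hypothetical names, not the ones used in the notebooks.

```python
import pandas as pd

# Hypothetical media table; the real column names may differ.
media = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "filename": ["a.jpg", "b.jpg", "c.jpg"],
        "status": ["online", "offline", "online"],
    }
)


def select_media(df, keep_offline=False):
    """Return media rows, dropping those marked 'offline' unless asked to keep them."""
    if keep_offline:
        return df.copy()
    return df[df["status"] != "offline"].copy()


online_only = select_media(media)  # what the online pipeline would keep
full = select_media(media, keep_offline=True)  # what the full pipeline would keep
```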

stats:

There is also a Jupyter notebook (bunraku-stats.ipynb) for generating matplotlib graphs and D3-specific/refactored JSON for data visualization.
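
One possible shape for the D3-oriented JSON is a flat list of name/value objects, which many D3 chart examples consume. The sketch below assumes a hypothetical `year` column and is not taken from bunraku-stats.ipynb.

```python
import json

import pandas as pd

# Hypothetical records; the real notebook works on the collection data.
plays = pd.DataFrame(
    {"title": ["Play A", "Play B", "Play C"], "year": [1970, 1970, 1975]}
)

# Count records per year, sorted by year.
counts = plays["year"].value_counts().sort_index()

# Reshape into a list of {"name", "value"} objects.
d3_data = [{"name": str(year), "value": int(n)} for year, n in counts.items()]
d3_json = json.dumps(d3_data)
```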