collections-as-data

Data analysis of NLS collections as data, including collections of metadata and digitised text. Visit the National Library of Scotland's Data Foundry > Tools > Jupyter Notebooks to learn more about the contents of each Notebook and the data it explores.

To explore the collections listed below in live Jupyter Notebooks, open them here in Binder (please note this may take several minutes to load).

To copy and work with the Jupyter Notebooks on your own machine, see Setup.

Overview

About

This repository explores datasets from the National Library of Scotland's Data Foundry using Jupyter Notebooks. The Notebooks provide introductions to NLS collection data suitable for those with or without programming experience. For an introduction to using Jupyter Notebooks, please refer to Tim Sherratt's GLAM Workbench.

References

While creating the Jupyter Notebooks in this repository, the author referenced material from the sources below. The author sought to use the material for learning only, applying the code it teaches to NLS collections and using it to inform the explanations of the code, without copying the code or the code explanations. If you feel the citations below (and in each Notebook) do not provide adequate attribution for your work, please contact digital.scholarship@nls.uk.

  • Alex, Beatrice and Llewellyn, Clare. (2020) Library Carpentry: Text & Data Mining. Centre for Data, Culture & Society, University of Edinburgh. http://librarycarpentry.org/lc-tdm/.
  • Bird, Steven and Klein, Ewan and Loper, Edward. (2019) Natural Language Processing with Python – Analyzing Text with the Natural Language Toolkit. O'Reilly Media. 978-0-596-51649-9. https://www.nltk.org/book/.
  • Python Software Foundation. (2020) xml.etree.ElementTree — The ElementTree XML API. Python 3.8.6 Documentation, The Python Standard Library, Structured Markup Processing Tools. https://docs.python.org/3/library/xml.etree.elementtree.html.

Datasets

Collection 1: Britain and UK Handbooks

Explore this collection in a static or interactive Jupyter Notebook here.

Collection 2: A Medical History of British India

Explore this collection in a static or interactive Jupyter Notebook here.

Collection 3: Edinburgh Ladies' Debating Society

Explore this collection in a static or interactive Jupyter Notebook here.

Collection 4: Lewis Grassic Gibbon First Editions

Explore this collection in a static or interactive Jupyter Notebook here.

Collection 5: The National Bibliography of Scotland (version 1)

Explore this collection in a static or interactive Jupyter Notebook here.

  • Owner: National Library of Scotland
  • Creator: National Library of Scotland
  • Website: Visit the NLS Data Foundry
  • Date created: 30/08/2019
  • Date updated: 06/09/2019
  • Licence: Creative Commons Attribution 4.0 International (CC-BY 4.0)

Setup

To clone the Git repository and run the Jupyter Notebooks on your own machine, follow the steps below according to your operating system.

For Mac and Linux:

# Install the virtualenv package if it is not yet installed (optional on Python 3.3+,
# whose standard library includes the venv module used in the next step)
python3 -m pip install --user virtualenv

# Create a virtual environment (you can change 'env' to a name of your choice)
python3 -m venv env

# Activate the environment
source env/bin/activate

# Clone the repository
git clone https://github.com/NLS-Digital-Scholarship/collections-as-data.git

# Enter the repository (git clone has already initialized it, so git init is not needed)
cd collections-as-data

# Install the required packages
python3 -m pip install -r requirements.txt

# When you're finished, deactivate the environment (simply activate it once
# more and enter the repository when you wish to resume your work)
deactivate

For Windows:

# Install the virtualenv package if it is not yet installed (optional on Python 3.3+,
# whose standard library includes the venv module used in the next step)
py -m pip install --user virtualenv

# Create a virtual environment (you can change 'env' to a name of your choice)
py -m venv env

# Activate the environment
.\env\Scripts\activate

# Clone the repository
git clone https://github.com/NLS-Digital-Scholarship/collections-as-data.git

# Enter the repository (git clone has already initialized it, so git init is not needed)
cd collections-as-data

# Install the required packages
py -m pip install -r requirements.txt

# When you're finished, deactivate the environment (simply activate it once
# more and enter the repository when you wish to resume your work)
deactivate

For additional guidance on Python virtual environments with pip, please visit Installing packages using pip and virtual environments.
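
Before installing packages or opening a Notebook, it can help to confirm that the environment created in the steps above is actually active. The following is a minimal sketch, not part of this repository: the function name `in_virtual_env` is hypothetical, and the check relies only on the standard library's `sys` module.

```python
import sys


def in_virtual_env() -> bool:
    """Return True when the running interpreter belongs to a virtual environment.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    # Older virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtual_env())
```

Saved to a file and run with `python3` (or `py` on Windows) after the activation step, this should report `True`; if it reports `False`, the activation command probably did not take effect in the current shell.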

Suggested Citations

If you find any of these Notebooks useful in your work, we would appreciate your citation of them. Suggested citations for each Notebook are: