This repository has been archived by the owner on Jan 4, 2022. It is now read-only.

ckanext-snl

Harvester for the Swiss National Library (SNL)

Installation

Use pip to install this plugin. This example installs it in /home/www-data:

source /home/www-data/pyenv/bin/activate
pip install -e git+https://github.com/ogdch/ckanext-snl.git#egg=ckanext-snl --src /home/www-data
cd /home/www-data/ckanext-snl
pip install -r pip-requirements.txt
python setup.py develop

Make sure to add snl and snl_harvester to ckan.plugins in your config file.
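
As a rough sketch, a small shell check can confirm that both plugin names are present before starting the harvester. The function name check_snl_plugins is hypothetical and not part of this plugin; it also assumes that ckan.plugins is written on a single line (for example `ckan.plugins = stats snl snl_harvester`), which may not match your config file.

```shell
# Hypothetical helper (not shipped with ckanext-snl): succeed only if both
# plugins appear on the ckan.plugins line of the given config file.
# Assumption: ckan.plugins is a single line of whitespace-separated names.
check_snl_plugins() {
    config_file="$1"
    # Take the first ckan.plugins line, keep the value after "=",
    # and normalise tabs/newlines to single spaces.
    plugins=$(grep -E '^[[:space:]]*ckan\.plugins[[:space:]]*=' "$config_file" \
        | head -n 1 | cut -d= -f2- | tr -s '[:space:]' ' ')
    for name in snl snl_harvester; do
        case " $plugins " in
            *" $name "*) ;;   # whole-word match, so "snl" does not match "snl_harvester"
            *) echo "missing plugin: $name" >&2; return 1 ;;
        esac
    done
    echo "ok"
}
```

It could be called as `check_snl_plugins development.ini`; a non-zero exit status means at least one of the two names is missing.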

For development

  • Install the pre-commit.sh script as a pre-commit hook in your local repository:

    ln -s ../../pre-commit.sh .git/hooks/pre-commit

Run harvester

source /home/www-data/pyenv/bin/activate
paster --plugin=ckanext-snl snl_harvester gather_consumer -c development.ini &
paster --plugin=ckanext-snl snl_harvester fetch_consumer -c development.ini &
paster --plugin=ckanext-snl snl_harvester run -c development.ini

Only harvest files via OAI-PMH:

source /home/www-data/pyenv/bin/activate
cd /home/www-data/pyenv/src/ckan

# Export the OAI entries for the specified set.
# This command harvests the whole set and uploads the resulting records.xml to S3.
paster --plugin=ckanext-snl snl export e-diss -c production.ini

# Resume the export of the OAI entries for the specified set.
# This command resumes harvesting the "sb" set, starting at record 106500 and stopping at record 1000000.
# If you specify an upper limit, the files are not uploaded to S3 and are only kept locally.
paster --plugin=ckanext-snl snl resume sb 106500 1000000 -c production.ini