CLARIAH/tool-discovery

Project Status: Active -- The project has reached a stable, usable state and is being actively developed.

CLARIAH Tool Discovery

This repository contains everything related to tool discovery and software metadata in CLARIAH.

  • Dockerfile: Builds the Docker container for the CLARIAH Tool Discovery pipeline, including both the harvester and the server and API powering the CLARIAH Tool Store.
  • source-registry/: The tool source registry, which contains the source repository locations and service endpoints for all CLARIAH tools. It is open for contributions.
  • etc/, static/: supporting files for the deployment at
  • legacy/cmdi/: Contains legacy CMDI metadata as gathered in WP3 task MD4T at Utrecht University

Service

The tool discovery service consists of a harvester that runs at regular intervals (every night) and a tool store. It is deployed at https://tools.clariah.nl (production; may not be available yet) and https://tools.dev.clariah.nl (development).

All harvested data is also available as individual files via https://tools.dev.clariah.nl/files/

Links

Usage

For CLARIAH (local development):

docker build -t clariah-tool-discovery .
docker run -itd -p 8080:80 --env-file=local-dev.env --name=cm-srv -v codemeta_volume:/tool-store-data --restart=unless-stopped clariah-tool-discovery 

We recommend also passing an extra --env GITHUB_TOKEN=.......... or you will likely hit GitHub's API rate limit during harvesting. Similarly, you can pass a ZENODO_ACCESS_TOKEN.
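As a minimal sketch of the recommendation above, the helper below prints the local-development docker run command and adds a token flag only for tokens that are set in the host environment. The function name tool_discovery_run_cmd is hypothetical and not part of this repository. The sketch relies on docker's behaviour of copying a variable from the host when --env is given a name without a value.

```shell
# Sketch: print the docker run command from above, adding token flags only
# for tokens that are set in the host environment.
# tool_discovery_run_cmd is a hypothetical helper name, not part of this repository.
tool_discovery_run_cmd() {
    set -- docker run -itd -p 8080:80 --env-file=local-dev.env --name=cm-srv \
        -v codemeta_volume:/tool-store-data --restart=unless-stopped
    # "--env NAME" without "=value" tells docker to copy NAME from the host
    # environment, so the token itself never appears on the command line.
    if [ -n "${GITHUB_TOKEN:-}" ]; then
        set -- "$@" --env GITHUB_TOKEN
    fi
    if [ -n "${ZENODO_ACCESS_TOKEN:-}" ]; then
        set -- "$@" --env ZENODO_ACCESS_TOKEN
    fi
    # The image name goes last, after all options.
    set -- "$@" clariah-tool-discovery
    printf '%s\n' "$*"
}

# Print the command for review; run it with: eval "$(tool_discovery_run_cmd)"
tool_discovery_run_cmd
```

Printing the command first makes it easy to check which tokens will be forwarded before the container is started.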

More generic:

docker build -t codemeta-server-tool --build-arg nginx_pass=some_password .
docker run -itd -p 80:80 --env-file=my-env.env --name=cm-srv -v codemeta_volume:/tool-store-data --restart=unless-stopped codemeta-server-tool 

To harvest from local YAML source files (rather than a remote git repository), add -v $PWD/source-registry/:/usr/src/source-registry/source-registry/ to docker run and set LOCAL_SOURCE_REGISTRY=true in my-env.env.
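The two steps above could be scripted roughly as follows. This is a sketch under assumptions: enable_local_registry is a hypothetical helper, and it assumes my-env.env is a plain KEY=value file as accepted by docker's --env-file option. The docker run command is left as a comment so the snippet only edits the env file.

```shell
# Sketch: make sure the env file contains exactly one LOCAL_SOURCE_REGISTRY=true line.
# enable_local_registry is a hypothetical helper name, not part of this repository.
enable_local_registry() {
    env_file=$1
    touch "$env_file"
    # Drop any previous setting first so that repeated runs stay idempotent.
    # grep -v exits non-zero when no lines remain, hence the "|| true".
    grep -v '^LOCAL_SOURCE_REGISTRY=' "$env_file" > "$env_file.tmp" || true
    mv "$env_file.tmp" "$env_file"
    echo 'LOCAL_SOURCE_REGISTRY=true' >> "$env_file"
}

enable_local_registry my-env.env

# Then start the container with the local registry mounted:
# docker run -itd -p 80:80 --env-file=my-env.env --name=cm-srv \
#   -v codemeta_volume:/tool-store-data \
#   -v "$PWD/source-registry/:/usr/src/source-registry/source-registry/" \
#   --restart=unless-stopped codemeta-server-tool
```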

Event-based collection, i.e. allowing clients to push codemeta files, can be enabled by setting --env-arg UPLOADER=true. You can then POST your codemeta.json file with curl -u <nginx-user> -X POST -H "Content-Type: application/json" -d @codemeta.json <url>/rest/
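A small wrapper around the upload could look like the sketch below. The function name upload_codemeta and the CURL override variable are hypothetical, added only for illustration and for dry runs. It assumes the uploader accepts basic authentication with the nginx user, as in the command above. Note that -d @file sends the contents of the file, whereas -d followed directly by a file name would send the name itself as the request body.

```shell
# Sketch: POST a codemeta.json file to the uploader endpoint described above.
# upload_codemeta and the CURL override are hypothetical, for illustration only.
upload_codemeta() {
    file=$1   # path to the codemeta.json file
    url=$2    # base URL of the deployment, without a trailing slash
    user=$3   # nginx user; curl prompts for the password
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    # "-d @file" sends the file's contents rather than the file name;
    # --fail makes curl exit non-zero on an HTTP error status.
    ${CURL:-curl} --fail -u "$user" -X POST \
        -H "Content-Type: application/json" \
        -d @"$file" "$url/rest/"
}

# Dry run: with CURL=echo the arguments are printed instead of sent.
# CURL=echo upload_codemeta codemeta.json "<url>" "<nginx-user>"
```

Checking that the file is readable before calling curl gives a clearer error than a failed request would.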

For a private git repository, add -e GIT_USER='youruser' -e GIT_PASSWORD='yourtoken' to docker run.

To clean up, remove the volume codemeta_volume.

Integration: API usage instructions

If you want to query the Tool Store from other software, please read this document for instructions on how to use our SPARQL endpoint.