Use Case: Describe/include software containers #39
Comments
Example descriptions generated by extract-dockerfile. From a Dockerfile we describe a ContainerRecipe:
{
"@context": "http://www.schema.org",
"@type": "ContainerRecipe",
"name": "vsoch/salad",
"description": "A Dockerfile build recipe",
"containerImage": "gliderlabs/alpine:3.4",
"labels": [
[
"MAINTAINER toasterlint \"henry@toasterlint.com"
]
],
"environment": [
"RPCPORT=4000"
],
"entrypoint": [
"/entrypoint"
]
} (see openschemas/specifications#10)

From a Docker image we describe a ContainerImage:
{
"environment": [
"SRC_DIR=/go/src/github.com/vsoch/salad/"
],
"entrypoint": [
"/code/salad"
],
"description": "A Dockerfile build recipe",
"name": "vanessa/sregistry",
"ContainerImage": "iron/go:dev",
"operatingSystem": "linux",
"softwareVersion": "sha256:8d1e7f244db9e7cb85d5867bb3230f756460900e5801ff2303e44a79369640f4",
"identifier": [
"vanessa/sregistry:latest"
],
"url": "https://hub.docker.com/r/vanessa/sregistry",
"alternateName": "Singularity Registry",
"softwareHelp": "https://singularityhub.github.io/sregistry",
"citation": "http://joss.theoj.org/papers/050362b7e7691d2a5d0ebed8251bc01e",
"license": "https://github.com/singularityhub/sregistry/blob/master/LICENSE",
"keywords": "container, containers, singularity, singularity registry",
"softwareRequirements": [
"Pip > xmlsec==1.3.3"
],
"@context": "http://www.schema.org",
"@type": "ImageDefinition"
}
Above extract-dockerfile has actually extracted the (however this type is called ContainerImage rather than
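To make the first description above concrete, here is a minimal sketch of how a ContainerRecipe-style record could be derived from Dockerfile text. It is not the extract-dockerfile implementation: the function name `describe_dockerfile` is hypothetical, only `FROM`, `ENV` and `ENTRYPOINT` are handled, and line continuations are ignored.

```python
import json
import shlex


def describe_dockerfile(text, name):
    """Build a ContainerRecipe-style dict from Dockerfile text (hypothetical helper)."""
    recipe = {
        "@context": "http://www.schema.org",
        "@type": "ContainerRecipe",
        "name": name,
        "description": "A Dockerfile build recipe",
        "environment": [],
        "entrypoint": [],
    }
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue  # instruction without arguments
        instruction, rest = parts[0].upper(), parts[1].strip()
        if instruction == "FROM":
            # Drop flags such as --platform=... and any "AS stage" suffix.
            tokens = [t for t in rest.split() if not t.startswith("--")]
            if tokens:
                recipe["containerImage"] = tokens[0]
        elif instruction == "ENV":
            if "=" in rest.split()[0]:
                # "ENV KEY=value [KEY2=value2 ...]"
                recipe["environment"].extend(shlex.split(rest))
            else:
                # Legacy "ENV KEY value" form
                key, _, value = rest.partition(" ")
                recipe["environment"].append(f"{key}={value.strip()}")
        elif instruction == "ENTRYPOINT":
            if rest.startswith("["):
                recipe["entrypoint"] = json.loads(rest)  # exec (JSON array) form
            else:
                recipe["entrypoint"] = shlex.split(rest)  # shell form
    return recipe


example = 'FROM gliderlabs/alpine:3.4\nENV RPCPORT=4000\nENTRYPOINT ["/entrypoint"]\n'
print(json.dumps(describe_dockerfile(example, "vsoch/salad"), indent=2))
```

The example input is reconstructed from the values in the description above, not taken from the actual vsoch/salad Dockerfile.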
See the discussion in openbases/extract-dockerfile#6. There was some debate over the name; my preference is for what is represented in https://openschemas.github.io/specifications/ because (as you correctly bring up) an ImageDefinition could refer to other kinds of images, whereas ContainerImage is clearer.
This is interesting! Would this need to be related to CWL as well (which defines how to invoke the image, as opposed to the definition of the image itself)? In Dockerpedia they have done a thorough extraction of images, although it's not aligned with schema. Maybe we can use their service for extraction too. An example: https://dockerpedia.inf.utfsm.cl/resource/SoftwareImage/dockerpedia-pegasus_workflow_images_latest
I don't think it would be wise to "hard code" (so to speak) any particular workflow manager or description (e.g., cwl, snakemake, nextflow) directly into the specification. On the other hand, if there is an appropriate field to describe this same entity, it would be logical to include (e.g., if I find that it's snakemake, I should look for a Snakefile somewhere...) For CWL, is there a definitive specification for interaction? For example, for a scif container, you can be absolutely sure how to discover applications inside (singularity run container.sif apps) and then how to run / inspect / shell / otherwise interact with an application you just found (e.g., |
CWL has a field for pulling from a docker container. Maybe that could be the hook.
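As a rough illustration of that hook: in CWL the field in question is `dockerPull` under a `DockerRequirement`, which may appear in `requirements` or `hints`, either as a list of entries with a `class` key or as a map keyed by class name. The sketch below assumes the CWL document has already been parsed into a Python dict; the helper name `find_docker_pull` and the example tool are hypothetical.

```python
def find_docker_pull(cwl_doc):
    """Return the dockerPull value of a parsed CWL document, or None if absent."""
    for section in ("requirements", "hints"):
        entries = cwl_doc.get(section) or []
        if isinstance(entries, dict):
            # Map form: {"DockerRequirement": {"dockerPull": "..."}}
            entries = [{**(body or {}), "class": cls} for cls, body in entries.items()]
        for entry in entries:
            if entry.get("class") == "DockerRequirement" and "dockerPull" in entry:
                return entry["dockerPull"]
    return None


# Hypothetical tool description, reusing an image identifier from this thread.
tool = {
    "cwlVersion": "v1.0",
    "class": "CommandLineTool",
    "baseCommand": "salad",
    "hints": {"DockerRequirement": {"dockerPull": "vanessa/sregistry:latest"}},
    "inputs": [],
    "outputs": [],
}
print(find_docker_pull(tool))
```

The returned identifier could then be matched against the `identifier` of a container description, linking the workflow step to the image metadata.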
Yes, understood! To be more clear, there are many different tools that describe in a structured way how a container (or an app inside it) is supposed to be invoked. Actually, those two things are different: CWL could describe an app in a container (and it would have to be provided via the entrypoint so the user could run it to find it), while SCIF describes how to invoke the container itself (of which CWL could be one or more entrypoints). But from how you describe it, that there is a field for pulling the container, this sounds like it would need to be stored outside of the container, which is another point to discuss. SCIF is a specification that describes standard interaction with a container, and it is installed inside the container, along with the SCIF filesystem and other metadata files that are defined for each app.
This is a necessary use case for Whole Tale. A few questions:
Having a repo2docker configuration is an interesting and useful idea, but I think it would be done in addition to a container recipe. repo2docker in and of itself doesn't translate to reproducibility; it just means that (assuming a version of repo2docker is available) you could build a container from it. You can think of it as an extra layer that essentially creates a Dockerfile (that could then be built). It also assumes a user "jovyan", which makes things a bit challenging when converting to Singularity (e.g., for use on HPC) because of the cardinal rule "the user inside the container is the user outside the container."

Re-reading what @stain mentioned, it sounds like he wants the full container, in which case Docker wouldn't be as feasible, since it means layers that need to be assembled and that require the Docker daemon. A Singularity (sif) binary would be more reasonable, albeit large, and would still require Singularity to run. It's really the case that any level of recipe without the container runs the risk of not being buildable, so providing the container somewhere is probably needed. In the case of Singularity, the recipe file is kept inside the container as well. In the case of Docker, the recipe (and other metadata) would serve as an external way to peep inside without invoking the container.

I'm not super familiar with RO-crates, but reading the description:
it does sound like a wrapper (with metadata) to a container is wanted? The container, considered as some kind of data, could also fit into the specification, and as @stain showed, metadata could be extracted for the jsonld.
Indeed, you can generate with I also agree the container recipe is worth saving (or referencing together with a fingerprint), as the base image of the recipe could contain a bug, and you would want to be able to re-create it.
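A minimal sketch of such a fingerprint, assuming a SHA-256 digest over the recipe file's bytes is acceptable (the comment does not say which algorithm was meant). The `sha256:` prefix mirrors the `softwareVersion` value in the image description earlier in this thread; the function name `fingerprint` is hypothetical.

```python
import hashlib


def fingerprint(path, chunk_size=65536):
    """Return a 'sha256:<hex>' digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()
```

Storing this value next to a reference to the recipe lets a reader check that the Dockerfile or Singularity recipe they fetch later is byte-for-byte the one that was used. It does not pin the base image, which would need its own digest.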
As an open science researcher, I want to provide Docker/Singularity container images so that others can reliably reproduce my results or reuse the same software.
This implies that the container images and their recipes (e.g. Dockerfile) should be included in the RO-Crate and typed as such, so users know they can be executed. It is also desirable to use tooling to expand the description with a list of dependencies installed in the container; this will help provide light-weight software citations.
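One possible shape for this, sketched as a Python function that builds the JSON-LD. Everything here is an assumption for illustration: the function name `crate_metadata`, the file names, and the use of `ContainerRecipe` and `ContainerImage` as types (they come from the openschemas discussion in the comments, not from RO-Crate). The context URL and metadata descriptor follow the later RO-Crate 1.1 layout, which this issue predates.

```python
import json


def crate_metadata(recipe_path, image_path, dependencies):
    """Build an RO-Crate-style graph typing a recipe and a container image (sketch)."""
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "about": {"@id": "./"},
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            },
            {
                "@id": "./",
                "@type": "Dataset",
                "hasPart": [{"@id": recipe_path}, {"@id": image_path}],
            },
            # Assumed typing: the recipe is a File that is also a ContainerRecipe.
            {"@id": recipe_path, "@type": ["File", "ContainerRecipe"]},
            # Assumed typing: the image is a File that is also a ContainerImage;
            # dependencies found by tooling go into softwareRequirements.
            {
                "@id": image_path,
                "@type": ["File", "ContainerImage"],
                "isBasedOn": {"@id": recipe_path},
                "softwareRequirements": list(dependencies),
            },
        ],
    }


# Hypothetical file names; the dependency string is taken from the example above.
crate = crate_metadata("Dockerfile", "image.sif", ["Pip > xmlsec==1.3.3"])
print(json.dumps(crate, indent=2))
```

Linking the image to its recipe with `isBasedOn` is one choice among several; the point is only that both files are listed in the crate and carry a type that tells a consumer they are container artefacts.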
Related efforts to align with: