
Add stardist 0.8.5 recipe to containers #551

Open · wants to merge 2 commits into base: master

Conversation

@FloWuenne (Contributor)

Submitting a Container

If you're requesting a new container, please check the procedure described here.

Check BioContainers' Dockerfile specifications

Checklist

  1. Misc
  • My tool doesn't exist in BioConda
  • The image can be built
  2. Metadata
  • LABEL base_image
  • LABEL version
  • LABEL software.version
  • LABEL about.summary
  • LABEL about.home
  • LABEL about.license
  • MAINTAINER
  3. Extra (optional)
  • I have written tests in test-cmds.txt
  • LABEL extra.identifier
  • LABEL about.documentation
  • LABEL about.license_file
  • LABEL about.tags

Check BioContainers' Dockerfile metadata

@biocontainers-bot (Collaborator)

No biotools label defined. Please check whether the tool is already defined in biotools (https://bio.tools) and add the extra.identifiers.biotools label if it exists. If it is not defined, you can ignore this comment.

@biocontainers-bot (Collaborator)

No test-cmds.txt (test file) present, skipping tests


Dockerfile context under review:

    ARG DEBIAN_FRONTEND="noninteractive"
    ARG STARDIST_VERSION="0.8.5"
    ARG NVIDIA_DRIVER_VERSION=470
@mboudet (Contributor)

I feel like this will be an issue? According to the stardist GitHub, the NVIDIA_DRIVER_VERSION should be adjusted to the runtime environment (depending on the GPU you have, I guess).

Not sure how the container will behave if the runtime env does not match.

@FloWuenne (Contributor, Author)

Thanks for the review @mboudet !

How would you suggest handling this issue?

@mboudet (Contributor)

Not quite sure, to be honest. It would be possible to install the correct lib at runtime, I guess, but it would be quite dirty and would probably not work with Singularity.

I'm also not sure how it works with the conda package. Would it be possible to directly install the package in the Dockerfile (like here), or does it require external libraries?

@FloWuenne (Contributor, Author)

I'm not sure I understand what you mean about directly installing the package in the Dockerfile. Which package are you referring to? I don't see anything related to NVIDIA drivers in the recipe you linked, unfortunately 🤷🏼

@mboudet (Contributor)

Well, you use the 'NVIDIA_DRIVER_VERSION' variable to install a specific version of libnvidia-compute in the next step (line 24). According to the stardist GitHub, the libnvidia-compute version should match the GPU driver version.

@FloWuenne (Contributor, Author)

So, as is custom these days, I asked ChatGPT for help. Its suggestion was to specify --build-arg when building the Docker container, to adjust the NVIDIA_DRIVER_VERSION build argument to the version available:
docker build --build-arg NVIDIA_DRIVER_VERSION=<desired_version> -t <your_image_name> .

I am not sure there is a good way to automatically detect this version number. I also don't really know whether we could build a container for cpu completely without NVIDIA drivers...
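For what it's worth, one possible (untested) way to fill in the build argument automatically on a build host: query the host driver with nvidia-smi and keep only the major version. The helper name and image tag below are placeholders, not part of the recipe:

```shell
# Sketch only: extract the major NVIDIA driver version so it can be passed
# to `docker build --build-arg NVIDIA_DRIVER_VERSION=...`.
driver_major() {
  # Takes a full driver version string such as "470.182.03" and prints "470".
  printf '%s\n' "${1%%.*}"
}

# Example use on a build host with nvidia-smi available (not run here):
#   docker build \
#     --build-arg NVIDIA_DRIVER_VERSION="$(driver_major "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)")" \
#     -t stardist .
driver_major "470.182.03"   # prints 470
```

This only sidesteps the manual lookup; it still requires each user to build the image themselves, which is the drawback discussed below.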

@mboudet (Contributor)

Yeah, this would work, but it means the user would need to build the image themselves (so there's not much point in having the image on BioContainers). Also, building an image usually requires root/sudo privileges, which are not usually available on computing clusters.

My question regarding the conda package stands, though. I'm not sure whether a mamba/conda install in the container would work. This might need input from the software's developers.

@FloWuenne (Contributor, Author)

I will ask in their repo if they can advise regarding the mamba/conda install. What would be the advantage of a mamba/conda install over the pip install that we are currently doing?

Totally agree on the container build comment. One of the main goals here is to be able to build a Singularity container from the Docker container, and if we can't do that, you are right, it's pretty useless to put the Docker container on BioContainers 🙃

@FloWuenne (Contributor, Author)

I asked them in stardist/stardist#259

@uschmidt83

Hi, StarDist author here. I'm not really familiar with Bioconda+BioContainers, hence I'm wondering what the StarDist container is supposed to be used for in the Bioconda+BioContainers ecosystem.

The context for the Dockerfile in our repo (which the Dockerfile here is based on) was to offer an easy way to get a GPU-accelerated version of StarDist running, without the hassle of installing CUDA etc. Using StarDist is meant to be primarily done via interactive Jupyter notebooks (hence the TensorFlow jupyter-variant base image).

StarDist itself doesn't require GPU-acceleration, but can take advantage of it for improved performance (by enabling GPU-support for TensorFlow via CUDA, and also by installing gputools to take advantage of OpenCL).

@FloWuenne (Contributor, Author)

Thanks for that insight @uschmidt83 !

The main purpose of having a StarDist container on BioContainers (or any hosted Docker hub, for that matter) from our side is that we want to use it in nf-core-based Nextflow pipelines. These pipelines require Docker containers for reproducible execution. Since stardist is part of conda-forge, a Bioconda package (which would automatically build a BioContainer) can't be created, so a BioContainers recipe was the next logical step from my side. The alternative would be CI logic on your repo that pushes to a publicly available Docker Hub.

To start, I used your GitHub repo's Dockerfile and adjusted it to the BioContainers requirements. To be fair, I am not that familiar with GPU specifications in container settings and didn't realize that the NVIDIA driver specification could be an issue for cross-platform and Singularity compatibility until @mboudet mentioned it.

Would you be able and willing to make suggestions to this recipe to remove the GPU acceleration and requirements? Again, the main driver of this is to have a Docker container hosted in the public domain to use in nf-core pipelines. If you prefer to have CI on your repo push to your Docker Hub and we can use that, then we can also close this PR, whichever option you like better 😊!

Thanks for your help and insights!

@uschmidt83

> The main purpose of having a stardist container in biocontainers (or any hosted hub for docker for that matter) from our side is that we want to implement it in nf-core based nextflow pipelines. These pipelines require Docker containers for reproducible execution.

If I understand this correctly, these Nextflow pipelines are not meant to run interactively. Instead, each "step" of the pipeline consists of software that can either be installed via conda or run as a container.

If that assumption is true, wouldn't the StarDist container then need a defined entrypoint to run the software with defined inputs and outputs?

> Since stardist is part of conda-forge, a bioconda package (which automatically builds biocontainers) obviously can't be created and so a biocontainers recipe was the next logical step from my side.

I don't understand this part, and also don't know the relationship between conda-forge and bioconda.

> For starting I used your github repo Dockerfile and adjusted it for the biocontainer requirements. To be fair, I am not that familiar with GPU specifications in container settings and didn't realize that the NVIDIA driver specifications could be issues for cross-platform and singularity compatibility until @mboudet mentioned it.

> Would you be able and willing to make suggestions to this recipe to remove GPU acceleration and requirements?

Removing the GPU-related things from the Dockerfile is actually easy. Again, I'm more concerned with how the Docker container is supposed to be used in the context of Nextflow.
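For illustration only, here is a minimal CPU-only sketch, not the actual recipe under review. It assumes stardist installs cleanly from PyPI against the tensorflow-cpu wheel; the base image, versions, and label values are placeholders that follow the metadata checklist at the top of this PR:

```dockerfile
# Hedged sketch of a CPU-only StarDist image (placeholder values throughout).
FROM python:3.10-slim

LABEL base_image="python:3.10-slim" \
      version="1" \
      software.version="0.8.5" \
      about.summary="Object detection with star-convex shapes" \
      about.home="https://github.com/stardist/stardist" \
      about.license="BSD-3-Clause"
LABEL maintainer="placeholder@example.com"

# tensorflow-cpu avoids pulling the CUDA-enabled TensorFlow build,
# so no NVIDIA driver or libnvidia-compute pinning is needed.
RUN pip install --no-cache-dir tensorflow-cpu "stardist==0.8.5"
```

Such an image would run inference on CPU only, which matches the point above that StarDist doesn't require GPU acceleration.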

> Again, the main drive of this is to have a docker container hosted in the public domain to use in nf-core pipelines. If you prefer to have a CI on your repo push to your dockerhub and we can use that, then we can also close this PR, whatever option you like better 😊!

I don't have enough information to know which option would be better for us. It looks like the container definition needs to be updated each time we make a new StarDist release. How does that process work in BioContainers?

Best,
Uwe

@FloWuenne (Contributor, Author)

Thank you for the details again @uschmidt83! Sorry if my answer was more confusing than anything.
To address your questions:

  1. Your first assumption about Nextflow is absolutely right. Nextflow aims to provide reproducible pipelines by running specific tasks within conda environments or Docker/Singularity (or other container environments). Usually, tools are run via CLI interfaces from within the containers, and thus the container itself doesn't need an entrypoint. Within Nextflow pipelines, these CLI interfaces can be provided by scripts shipped with the Nextflow code, rather than necessarily being part of the tool/container. That being said, having a CLI interface for inference using StarDist wouldn't be a bad idea in my opinion. migueLib already has some code for doing exactly this. It could maybe be an option to open a PR and add a CLI interface...

  2. The whole part about Bioconda vs. conda-forge was basically just me saying that Bioconda builds Docker containers automatically on quay.io/BioContainers (and conda-forge doesn't). But since stardist is already on conda-forge, one can't create a Bioconda recipe (because the namespace is shared between the two channels).

  3. Regarding updating the container, the easiest option from a maintenance standpoint, I guess, would be CI on the stardist repo that automatically builds the container on each new release. I believe BioContainers would automatically make a PR to bump the version (via dependabot) every time there is a new PyPI release. Ultimately, again, I would leave this up to you and could help with the CI setup if required.

Please let me know what you think of these ideas and whether any details are still unclear!

@mboudet (Contributor)

mboudet commented Dec 5, 2023

> I believe biocontainers would automatically make a PR to bump the version (via dependabot) everytime there would be a new pypi release.

I don't believe there is a dependabot setup on the BioContainers repository. It seems like all PRs are manual.
(There is a bioconda-bot setup on the Bioconda repository, which makes PRs based on releases on GitHub.)

So it might be easier to set up CI on the stardist side to push to Docker Hub / GitHub Container Registry (and update the conda package) after a release, if that's needed.

@FloWuenne (Contributor, Author)

Thanks @mboudet! I just saw a PR with dependabot and thought it would also check version bumps of the main packages within containers...

In that case, CI directly on the stardist repo that builds a non-GPU container and pushes to Docker Hub might be the easiest, @uschmidt83.

@uschmidt83

Thanks for the clarification, @FloWuenne and @mboudet.

> That being said, having a CLI interface for inference using stardist wouldn't be a bad idea in my opinion. migueLib already has some code for doing exactly this. Could maybe be an option to open a PR and add a CLI interface...

We do have CLIs for 2D and 3D model inference, but not for model training. Please take a look and let me know if you think this is sufficient.

> So it might be easier to setup a CI on stardist side to push on docker/github container hub (and update the conda package) after a release (if that's needed)

I'm not sure about this; I have to look into it and understand the pros and cons.

@FloWuenne (Contributor, Author)

Thanks for the links for the inference CLIs @uschmidt83 !

The inference scripts look like exactly what would be needed to run inference from within a container. So I think the easiest would be to update the Dockerfile in your main stardist repository to one that doesn't use the GPU (if that's something you want to support; the existing Dockerfile could be kept as a DOCKERFILE_GPU or similar).

Once the main Dockerfile in the repo is updated, here is an example of a GitHub CI that would build the container and push it to Docker Hub on release and every night (you would have to set the Docker Hub name and secret in the repository). Obviously, for stardist one could set the CI to trigger only on release or on merge to main (depending on your development structure!).
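As a rough illustration of the kind of workflow meant here, a hedged GitHub Actions sketch; the action versions, secret names, schedule, and image name are all assumptions, not the linked example:

```yaml
# Sketch only: build and push on each release and nightly (placeholder values).
name: docker
on:
  release:
    types: [published]
  schedule:
    - cron: "0 2 * * *"   # nightly build
jobs:
  push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - uses: docker/build-push-action@v5
        with:
          push: true
          tags: ${{ secrets.DOCKERHUB_USERNAME }}/stardist:latest
```

A release-triggered job could additionally tag the image with the release version instead of only `latest`.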

Let us know what your preference is! We can also still go ahead and change this recipe here to work without GPU and keep it alongside your main repository. That way, you don't have to change the Dockerfile or deal with setting up the CI... Ultimately, I would leave the decision to you as the package creator!

@FloWuenne (Contributor, Author)

@uschmidt83 Just a quick ping to ask whether you want to go through with the BioContainer or rather go with Docker CI in your repo? If you want to go with the CI action, then I will close this PR.
