
How to make your software tool available through PhenoMeNal

jbradbury edited this page Jul 26, 2018 · 31 revisions

Quick-start guide for developers

If you are familiar with Docker and the other technologies used in our project, follow these steps:

  • Create a repository named container-<your-tool> inside the phnmnl GitHub organization and add a Dockerfile for your tool to it.
  • Once your Docker image builds locally, add it as a job in Jenkins, using the naming pattern container-<your-tool>. You should have authentication details for this already.
  • Create a data-related test for your container in Jenkins. The job in Jenkins should be named test-container-<your-tool>, following this guide.
  • Write a Galaxy tool wrapper. An example and explanation on how to write a wrapper can be found on the Galaxy Wiki.
    • In addition, PhenoMeNal requires entries for the tool in config/job_conf.xml and in config/tool_conf.xml, as explained here.
  • Test your Galaxy tool wrapper plus container on a local deployment of Kubernetes using minikube. If you already have minikube and helm running, you can skip to these directions for deploying locally against your development directory.
  • Once everything works on your own local Kubernetes, make a pull request on the galaxy-k8s-runner git repo for adding it to the main instance.
    • This PR should include the Galaxy XML wrapper itself (with any accompanying files needed for its operation or documentation) and the changes needed in config/job_conf.xml and config/tool_conf.xml.
  • If you also have a Galaxy fancy workflow around your tool(s):
  • As we approach release dates, due to changing functionality or other reasons, you will need to make a release of your container. The individual tool/container release process is explained here.
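The naming patterns in the steps above can be summarised in a small shell sketch. The function name phnmnl_names and the output keys are hypothetical; only the patterns container-<your-tool> and test-container-<your-tool> come from this guide.

```shell
#!/bin/sh
# Hypothetical helper: print the names this guide prescribes for a tool.
phnmnl_names() {
    tool="$1"
    # GitHub repository inside the phnmnl organization
    printf 'repo=container-%s\n' "$tool"
    # Jenkins job that builds the container
    printf 'build_job=container-%s\n' "$tool"
    # Jenkins job that runs the data-related test
    printf 'test_job=test-container-%s\n' "$tool"
}

# Example call with "ipo" as the tool name
phnmnl_names ipo
```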

How to make your software tool available through PhenoMeNal

PhenoMeNal is a comprehensive and standardised e-infrastructure that will support the data processing and analysis pipelines for molecular phenotype data generated by metabolomics applications. In order to provide services enabling computation and analysis to improve the understanding of the causes and mechanisms underlying health, healthy ageing and diseases, PhenoMeNal uses existing open source data standards and tools to make them available on local, national and international cloud infrastructures.

PhenoMeNal encourages bioinformaticians and software developers to prepare and submit their applications and tools so that they become available through our e-infrastructure. The primary place of development and contribution is the PhenoMeNal GitHub repository, where we collect most of the technical work that goes into the e-infrastructure. We gladly assist with preparing applications for PhenoMeNal.

Within PhenoMeNal, software is packaged into containers. We use a technology called Docker for packaging applications together with all of their dependencies. Docker is available for several platforms, including Linux, FreeBSD, Windows and Apple macOS. Installation instructions can be found at Docker. The Docker container images can be deployed easily throughout the e-infrastructure, and anybody can pull them from the PhenoMeNal Docker Registry (only available through the docker pull command). For example, to obtain our Mass IPO container for local use on a machine with Docker installed, you would execute:

docker pull container-registry.phenomenal-h2020.eu/phnmnl/ipo

which downloads the container for you.

Further reading on Docker can be found here.

Installing Docker is easy. Inexperienced users just need to follow the instructions of the graphical installer supplied by Docker. While recent Linux distributions come with Docker packages, some older Debian-based distributions require manual steps to install Docker properly. On Ubuntu Trusty (14.04), Docker can be installed with the following commands:

sudo apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
# sudo does not apply to a shell redirection, so write the file through tee
echo "deb https://apt.dockerproject.org/repo ubuntu-trusty main" | sudo tee /etc/apt/sources.list.d/docker.list
sudo apt-get remove docker.io
sudo apt-get update
sudo apt-get install docker-engine

Please note that the kernel needs some additional starting parameters to run docker containers smoothly:

# Append through tee, because sudo does not apply to the >> redirection
echo 'GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"' | sudo tee -a /etc/default/grub
sudo update-grub
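
Kernel parameters set through GRUB only take effect after a reboot. The following sketch checks whether the running kernel was started with both parameters; the function name has_docker_kernel_params is hypothetical, and the check assumes a Linux system that exposes /proc/cmdline.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given kernel command line
# contains both parameters as separate words.
has_docker_kernel_params() {
    case " $1 " in
        *" cgroup_enable=memory "*) ;;
        *) return 1 ;;
    esac
    case " $1 " in
        *" swapaccount=1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# /proc/cmdline holds the command line of the running kernel (Linux only)
if has_docker_kernel_params "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "kernel parameters are active"
else
    echo "kernel parameters are not active (reboot after update-grub?)"
fi
```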

Within PhenoMeNal we also use docker-compose, a Docker component that automates the management of Docker containers. It is installed automatically with Docker Toolbox; on any distribution it can also be installed using pip (the Python package manager):

sudo apt-get install python-pip
sudo pip install docker-compose
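
As a rough illustration of what docker-compose works with, the sketch below writes a minimal compose file that refers to the IPO image mentioned earlier. The service name ipo and the choice of compose file format version 2 are assumptions made for this example, and whether the IPO container does anything useful when started without arguments is not covered here.

```shell
#!/bin/sh
# Write a minimal compose file into the current directory.
# The service name "ipo" is hypothetical; the image name is the one used
# with "docker pull" earlier in this guide.
cat > docker-compose.yml <<'EOF'
version: "2"
services:
  ipo:
    image: container-registry.phenomenal-h2020.eu/phnmnl/ipo
EOF

# To start the service afterwards, run: docker-compose up
```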

Contributing to PhenoMeNal

PhenoMeNal aims to isolate bioinformatics tools as micro services through the use of Docker containers. If you haven't used Docker containers in the past, the official Docker guide is an excellent starting point.

Building Docker containers requires so-called Dockerfiles. They contain a list of commands needed for installing the application with all of its dependencies. One great feature of Docker is that new images can build upon other pre-built Docker images. Thus, usually only a few rules are needed in a Dockerfile to get a given application to build and configure successfully. An example Dockerfile could look like this:

FROM ubuntu:14.04
MAINTAINER Payam Emami, payams@his.email.domain
RUN apt-get update && apt-get install --yes openms
ENTRYPOINT ["FeatureLinkerLabeled"]

Dockerfiles are plain-text files that contain simple rules describing which commands are needed to build a container image. The 'FROM' tag specifies the pre-built container image the Dockerfile is based on. Everything written after 'RUN' is executed as a command while the image is being built. Finally, the 'ENTRYPOINT' tag specifies which command is started when the container runs.

A very simple Docker container that provides an R script can be realised with just 4 lines of code. In our demo, we want to run a script called myscript.R that calculates pi and is located in the same directory as the Dockerfile. We use the official r-base Docker container image, which saves us from building R ourselves.

Dockerfile:

FROM r-base:latest
ADD myscript.R /usr/local/bin/myscript.R
RUN chmod a+x /usr/local/bin/myscript.R

ENTRYPOINT ["myscript.R"]

myscript.R:

#!/usr/bin/env Rscript
x <- 0.5
pi <- 2 * (asin(sqrt(1 - x^2)) + abs(asin(x)))
print(pi)

Please refer to the Dockerfile reference for the additional instructions that can be used. The Dockerfiles in the official PhenoMeNal GitHub repository can also serve as examples. You may also want to keep an eye on the metadata needed to describe the software, to make it citable and discoverable (also outside PhenoMeNal); see codemeta.github.io for further details.

If the R or Python script is more complex and depends on other files, a better approach is to make it an R package (it doesn’t need to be on CRAN, only accessible at a public Git repo) or a pip-installable Python package (it doesn’t need to be published to the Python package index, again only accessible at a public Git repo). This avoids embedded 'source(my_other_script.R)' calls, which break when the working directory is not the one set by the container (the working directory is easily overridden, and some use cases do so). A quick workaround is to use absolute paths for those sources, so that they don’t depend on the working directory.

After having written the Dockerfile, you just need to build the docker container image:

docker build -t simple .

After having built the container image successfully, you can run the container:

docker run --name=simple-run -it simple
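
The whole demo can be reproduced with a short shell sketch that writes both files into a scratch directory and then builds and runs the image. The scratch directory created with mktemp is an assumption of this example, and the run command uses --rm instead of -it so that the stopped container is removed and the name simple-run can be reused; the script needs no interactive input.

```shell
#!/bin/sh
# Recreate the demo in a temporary scratch directory.
demo_dir="$(mktemp -d)"

# The Dockerfile from the example above.
cat > "$demo_dir/Dockerfile" <<'EOF'
FROM r-base:latest
ADD myscript.R /usr/local/bin/myscript.R
RUN chmod a+x /usr/local/bin/myscript.R

ENTRYPOINT ["myscript.R"]
EOF

# The R script that computes pi.
cat > "$demo_dir/myscript.R" <<'EOF'
#!/usr/bin/env Rscript
x <- 0.5
pi <- 2 * (asin(sqrt(1 - x^2)) + abs(asin(x)))
print(pi)
EOF

# Build and run only when the docker command is installed.
if command -v docker >/dev/null 2>&1; then
    docker build -t simple "$demo_dir" &&
        docker run --rm --name=simple-run simple
else
    echo "docker not found; demo files written to $demo_dir"
fi
```

When the build succeeds, the container prints the value computed by the script and exits.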

Here is another simple example, taken from one of the workshops we held within PhenoMeNal: https://github.com/pierrickrogermele/uppsala-pierrick-log2trans

Once you have gained some experience with writing Dockerfiles – the main way to define a Docker image – you should consider reading up on the best practices to improve your Dockerfiles and, consequently, your Docker images. The official Docker Best Practices is an excellent resource to read.

In order to make your ‘containerised’ application available in PhenoMeNal, you should add your Dockerfile with the required accompanying files (as few as possible; try to fetch files from a GitHub repository instead) to our PhenoMeNal GitHub repository. If you cannot do this yourself, ask someone with the necessary permissions to fork your repository into the PhenoMeNal GitHub organization and register it with the PhenoMeNal Jenkins Continuous Integration Server. As a result, your Docker container image should become available through our Docker registry “automagically”.
