
HPC Container Maker Recipes

Recipes specify the steps to construct a container image in a way that is independent of the container implementation. For example, the same HPCCM recipe may be used as the basis for both Docker and Singularity container images.

An HPCCM recipe is Python code. A recipe uses HPCCM building blocks and primitives, as well as other Python code, to specify the content of a container image. Since an HPCCM recipe is Python code, it is possible to create dynamic recipes that depend on validated user input. A single HPCCM recipe can generate multiple container images.

This simple HPCCM recipe uses the baseimage HPCCM primitive to specify the container base image and the gnu HPCCM building block to install the GNU compiler suite.

Stage0 += baseimage(image='centos:7')
Stage0 += gnu()

Note: Stage0 refers to the first stage of a multi-stage build. Multi-stage builds are a technique that can significantly reduce the size of container images. This section will not use multi-stage builds, so the Stage0 prefix can be considered boilerplate. See the section on multi-stage recipes for more information.

The hpccm command line tool processes recipes and generates the corresponding Dockerfile or Singularity definition file.

$ hpccm --recipe simple.py
FROM centos:7

# GNU compiler
RUN yum install -y \
        gcc \
        gcc-c++ \
        gcc-gfortran && \
    rm -rf /var/cache/yum/*

$ hpccm --recipe simple.py --format singularity
BootStrap: docker
From: centos:7
%post
    . /.singularity.d/env/10-docker*.sh

# GNU compiler
%post
    yum install -y \
        gcc \
        gcc-c++ \
        gcc-gfortran
    rm -rf /var/cache/yum/*

The HPCCM output is the container specification, so save the output to a file. By convention, the container specification files are named Dockerfile or Singularity.def for Docker and Singularity, respectively. To generate a container image, use your preferred container image builder.

Using Docker:

$ hpccm --recipe simple.py --format docker > Dockerfile
$ sudo docker build -t simple -f Dockerfile .

Using Singularity:

$ hpccm --recipe simple.py --format singularity > Singularity.def
$ sudo singularity build simple.sif Singularity.def

Building Blocks

A key feature of HPCCM is its set of building blocks, high-level abstractions of key HPC software components. Building blocks are roughly equivalent to environment modules, except that building blocks are configurable and composable.

HPCCM building blocks are Linux distribution aware: the output of a building block reflects the Linux distribution of the base image. Ubuntu and Red Hat derived distributions (e.g., CentOS) are supported. For example, if the base image is derived from Ubuntu, the apt package manager is used to install any required packages, whereas if the base image is derived from CentOS, the yum package manager is used instead. Detection of the base image Linux distribution is automatic and normally requires no action by the user.
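
The idea of distribution-aware output can be pictured with a small stand-alone sketch. It does not import HPCCM and is not how HPCCM detects the distribution; the helper names `guess_package_manager` and `install_command`, and the substring rule they use, are assumptions made purely for illustration.

```python
def guess_package_manager(image):
    """Guess the package manager from a base image name (illustrative rule only)."""
    name = image.lower()
    if 'ubuntu' in name:
        return 'apt'
    if 'centos' in name or 'rhel' in name:
        return 'yum'
    raise ValueError('cannot infer distribution from {!r}'.format(image))


def install_command(image, packages):
    """Return a shell command that installs the packages for the given image."""
    pkgs = ' '.join(sorted(packages))
    if guess_package_manager(image) == 'apt':
        return 'apt-get update -y && apt-get install -y ' + pkgs
    return 'yum install -y ' + pkgs


# The same request yields different commands for different base images
print(install_command('centos:7', ['gcc', 'gcc-c++']))
print(install_command('ubuntu:16.04', ['gcc', 'g++']))
```

A real building block also has to handle package names that differ between distributions, which is one reason to prefer the building blocks over hand-written shell commands.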

Most building blocks also have configuration options to enable customization. For instance, the openmpi building block has options to specify the version, the installation path, the compiler toolchain to use, whether to enable CUDA and InfiniBand support, and so on. Reasonable defaults are set so configuration is usually optional.

Some building blocks may require a license to use. In those cases, HPCCM expects the user to provide a valid license, and the license information can be specified via a building block configuration option. By using multi-stage recipes, licensed software can be used to build an application without needing to redistribute the licensed software or the license itself.

Primitives

While the container specification file syntax may differ depending on the container runtime, the same types of operations are performed, e.g., executing shell commands, copying files into the container image, and setting the environment. HPCCM primitives are wrappers around these basic operations that translate each operation into the corresponding container-specific syntax. All the building blocks are implemented on top of primitives to simplify supporting multiple container specification output formats. Where a building block is available, it should be used instead of the equivalent primitives.

Some key primitive operations and their Dockerfile and Singularity definition file equivalents are shown in the following table. Please refer to the primitives documentation for the complete list of primitives and their configuration options.

| Primitive | Docker | Singularity |
| --- | --- | --- |
| `baseimage(image='image:tag')` | `FROM image:tag` | `BootStrap: docker`<br>`From: image:tag` |
| `copy(src='foo', dest='bar')` | `COPY foo bar` | `%files`<br>`foo bar` |
| `copy(src=['a', 'b', 'c'], dest='z/')` | `COPY a b c z/` | `%files`<br>`a z/`<br>`b z/`<br>`c z/` |
| `shell(commands=['a'])` | `RUN a` | `%post`<br>`a` |
| `shell(commands=['a', 'b', 'c'])` | `RUN a && b && c` | `%post`<br>`a`<br>`b`<br>`c` |
| `environment(variables={'FOO': 'BAR'})` | `ENV FOO=BAR` | `%environment`<br>`export FOO=BAR`<br>`%post`<br>`export FOO=BAR` |
| `environment(variables={'A': 'B', 'C': 'D'}, _export=False)` | `ENV A=B C=D` | `%environment`<br>`export A=B`<br>`export C=D` |
| `workdir(directory='/path/to')` | `WORKDIR /path/to` | `%post`<br>`mkdir -p /path/to`<br>`cd /path/to` |
| `label(metadata={'FOO': 'BAR'})` | `LABEL FOO=BAR` | `%labels`<br>`foo bar` |

Primitives also hide many of the differences between the Docker and Singularity container image build processes so that behavior is consistent regardless of the container specification output format. For example, the Dockerfile ENV instruction sets environment variables immediately, i.e., the value of an environment variable can be used in any subsequent instruction, while the Singularity %environment block sets environment variables only when the container is running. Therefore, the environment primitive generates an additional Singularity %post block by default so that the variables are also set during the image build. This behavior can be disabled with the _export=False configuration option, as in the table above.
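
The translation that primitives perform can be modeled with a few lines of plain Python. This is a toy model of the shell and environment rows of the table above, not HPCCM's implementation; the function names `render_shell` and `render_environment` and the `fmt` and `export` parameters are hypothetical.

```python
def render_shell(commands, fmt):
    """Render a list of shell commands for 'docker' or 'singularity'."""
    if fmt == 'docker':
        # A single RUN instruction, commands chained with &&
        return 'RUN ' + ' && '.join(commands)
    # One command per line inside a %post block
    return '\n'.join(['%post'] + ['    ' + c for c in commands])


def render_environment(variables, fmt, export=True):
    """Render environment variables for 'docker' or 'singularity'."""
    pairs = sorted(variables.items())
    if fmt == 'docker':
        return 'ENV ' + ' '.join('{}={}'.format(k, v) for k, v in pairs)
    exports = ['    export {}={}'.format(k, v) for k, v in pairs]
    lines = ['%environment'] + exports
    if export:
        # Repeat the exports in %post so the variables are also set
        # while the image is being built, mimicking Dockerfile ENV
        lines += ['%post'] + exports
    return '\n'.join(lines)


print(render_shell(['a', 'b', 'c'], 'docker'))
print(render_environment({'FOO': 'BAR'}, 'singularity'))
```

Keeping the format decision inside the rendering function is what lets the calling code stay identical for both output formats.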

Templates

Some operations are very common and invoked by multiple building blocks, such as cloning a git repository or executing the configure / make / make install workflow. HPCCM templates abstract these basic operations for consistency and to avoid code duplication.

Templates are primarily intended to be used by building blocks and thus are not exported by default for use in recipes. However, templates can be manually imported and used in recipes, e.g., the MILC recipe.
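
As a rough illustration of what such a template factors out, the sketch below builds the command list for a configure / make / make install workflow. It is a stand-alone approximation, not the HPCCM template API; the function name `configure_make_install`, its parameters, and the example configure option are all assumptions.

```python
def configure_make_install(prefix, opts=(), parallel=4):
    """Return the shell commands for a configure / make / make install build."""
    configure = ' '.join(['./configure', '--prefix=' + prefix] + list(opts))
    return [configure, 'make -j{}'.format(parallel), 'make install']


# Several hypothetical building blocks could share this one helper
commands = configure_make_install('/opt/openmpi', opts=['--disable-static'])
for command in commands:
    print(command)
```

Because the helper returns a plain list of commands, the result could be handed to whatever mechanism emits shell steps for the chosen output format.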

User Arguments

Using Python to express container specifications is one of the key features of HPCCM. HPCCM recipes can process user input to generate multiple container specification permutations from the same source code. Because of the flexibility of Python, HPCCM user arguments are a much more powerful flavor of the Dockerfile ARG instruction.

The hpccm command line tool has the --userarg option. Values specified using this option are inserted into a Python dictionary named USERARG that can be accessed inside a recipe.

ompi_version = USERARG.get('ompi', '3.1.2')
Stage0 += openmpi(infiniband=False, version=ompi_version)

$ hpccm --recipe userargs.py --userarg ompi=3.0.0
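
To make the mechanism concrete, the following stand-alone sketch shows one way key=value pairs from a command line could end up in a dictionary, and how a recipe might validate a value before using it. It is not the hpccm tool's actual argument handling; `parse_userargs` and the version pattern are assumptions for illustration.

```python
import argparse
import re


def parse_userargs(argv):
    """Collect KEY=VALUE pairs given after --userarg into a dictionary."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--userarg', nargs='*', default=[], metavar='KEY=VALUE')
    args = parser.parse_args(argv)
    userarg = {}
    for item in args.userarg:
        key, sep, value = item.partition('=')
        if not sep or not key:
            raise ValueError('expected KEY=VALUE, got {!r}'.format(item))
        userarg[key] = value
    return userarg


USERARG = parse_userargs(['--userarg', 'ompi=3.0.0'])

# Fall back to a default, then validate before the value is used
ompi_version = USERARG.get('ompi', '3.1.2')
if not re.fullmatch(r'\d+\.\d+\.\d+', ompi_version):
    raise ValueError('invalid Open MPI version: {!r}'.format(ompi_version))
print(ompi_version)
```

Inside a real recipe only the last few lines are needed, since the USERARG dictionary is already provided.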

Multi-stage Recipes

Multi-stage builds are a technique that can significantly reduce the size of container images. Multi-stage recipes can also be used with licensed software to build an application without needing to redistribute the licensed development software or source code.

A recipe consists of one or more stages, although many recipes will only contain a single stage. The Stage0 and Stage1 variables are automatically created for use in HPCCM recipes.

Most building blocks provide a runtime method to install the corresponding runtime version of a component in another stage. The Stage class also provides a runtime method that calls the runtime method of every building block. Building block settings defined in the first stage are automatically reflected in the second stage using the runtime method.

Stage0 += baseimage(image='nvidia/cuda:9.0-devel-centos7', _as='devel')
Stage0 += openmpi(infiniband=False, prefix='/opt/openmpi')

Stage1 += baseimage(image='nvidia/cuda:9.0-base-centos7')
Stage1 += Stage0.runtime(_from='devel')

$ hpccm --recipe multi-stage.py
FROM nvidia/cuda:9.0-devel-centos7 AS devel

...

FROM nvidia/cuda:9.0-base-centos7

# OpenMPI
RUN yum install -y \
        hwloc \
        openssh-clients && \
    rm -rf /var/cache/yum/*
COPY --from=devel /opt/openmpi /opt/openmpi
ENV LD_LIBRARY_PATH=/opt/openmpi/lib:$LD_LIBRARY_PATH \
    PATH=/opt/openmpi/bin:$PATH

Singularity version 3.2 and later supports multi-stage Singularity definition files. However, the multi-stage definition file syntax is incompatible with earlier versions of Singularity. Use the HPCCM --singularity-version <version> command line option to specify the Singularity definition file version to generate. A version of 3.2 or later will generate a multi-stage definition file that will only build with Singularity version 3.2 or later. A version less than 3.2 will generate a portable definition file that works with any version of Singularity, but will not support multi-stage builds.

$ hpccm --recipe multi-stage.py --format singularity --singularity-version 3.2
# NOTE: this definition file depends on features only available in
# Singularity 3.2 and later.
BootStrap: docker
From: nvidia/cuda:9.0-devel-centos7
Stage: devel

...

BootStrap: docker
From: nvidia/cuda:9.0-base-centos7

# OpenMPI
%post
    yum install -y \
        hwloc \
        openssh-clients
    rm -rf /var/cache/yum/*
%files from devel
    /opt/openmpi /opt/openmpi
%environment
    export LD_LIBRARY_PATH=/opt/openmpi/lib:$LD_LIBRARY_PATH
    export PATH=/opt/openmpi/bin:$PATH
%post
    export LD_LIBRARY_PATH=/opt/openmpi/lib:$LD_LIBRARY_PATH
    export PATH=/opt/openmpi/bin:$PATH
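
The 3.2 threshold described above is a numeric comparison on version components, not a string comparison (as strings, '3.10' sorts before '3.2'). The sketch below shows one way to make that decision; it is an illustration only, not HPCCM's code, and the function names are hypothetical.

```python
def supports_multistage(version):
    """Return True if a Singularity version string is 3.2 or later."""
    parts = tuple(int(p) for p in version.split('.')[:2])
    # Pad a bare major version such as '3' to (3, 0)
    parts = parts + (0,) * (2 - len(parts))
    return parts >= (3, 2)


def definition_style(version):
    """Name the kind of definition file to generate for a version."""
    return 'multi-stage' if supports_multistage(version) else 'portable'


for version in ('2.6', '3.2', '3.10'):
    print(version, definition_style(version))
```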

If Singularity version 3.2 or later is not an option, Docker images can be easily converted to Singularity images so older versions of Singularity can also (indirectly) take advantage of multi-stage builds.

$ sudo docker run -t --rm --privileged -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/output singularityware/docker2singularity <docker-tag>

Scripts Using the HPCCM Module

HPCCM recipes automatically handle some common tasks, such as creating stages, processing user arguments, and selecting the container specification output format. For those unfamiliar with Python, recipes therefore provide what is effectively a higher-level interface.

It is also possible to write a "native" Python script and import HPCCM as a module. This provides more flexibility, but the user is responsible for managing input and output. A script using HPCCM as a module could implement more sophisticated user input handling than hpccm --userarg, write to a file instead of standard output, or combine HPCCM with other Python modules.

Building blocks and primitives implement the Python __str__ method, so a building block or primitive can simply be used in a string context, e.g., passed to print(). There are additional APIs that are useful for this use case, e.g., to set the container specification output format.

#!/usr/bin/env python

from __future__ import print_function

import hpccm

# Set to 'docker' to generate a Dockerfile or set to 'singularity' to
# generate a Singularity definition file
hpccm.config.set_container_format('docker')

print(hpccm.primitives.baseimage(image='centos:7'))
compiler = hpccm.building_blocks.gnu()
print(compiler)

A basic example of using HPCCM as a library is provided.
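
The string-context behavior, and the option of writing to a file rather than standard output, can be seen in isolation with a toy class. The sketch below does not import HPCCM; the `Workdir` class and its `fmt` parameter are hypothetical stand-ins that mimic the workdir row of the primitives table.

```python
import os
import tempfile


class Workdir(object):
    """Toy primitive whose string form depends on the output format."""

    def __init__(self, directory, fmt='docker'):
        self.directory = directory
        self.fmt = fmt

    def __str__(self):
        if self.fmt == 'docker':
            return 'WORKDIR ' + self.directory
        return '%post\n    mkdir -p {0}\n    cd {0}'.format(self.directory)


# str() (or print()) triggers __str__, so a specification is just the
# string forms of its items joined together
items = [Workdir('/path/to')]
spec = '\n\n'.join(str(item) for item in items)

# Write the result to a file instead of standard output
path = os.path.join(tempfile.mkdtemp(), 'Dockerfile')
with open(path, 'w') as handle:
    handle.write(spec + '\n')
```

A script built on the real module would follow the same pattern, with the output format chosen once up front rather than per object.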