Execution Backend

Sam Minot edited this page Nov 12, 2020 · 1 revision

When running workflows (like geneshot) using Nextflow, you have multiple options for which execution backend to use. The execution backend determines how the commands defined in the workflow are actually carried out.

When run without any additional options, Nextflow tries to execute everything on your computer, looking for all of the required dependencies in your local operating system. This is not ideal for geneshot for two reasons:

  • We want to be able to use the correct version of each software package
  • We want to be able to use high-performance computing resources (HPC or cloud computing)

Instead, we recommend one of three different modes for running geneshot: locally with Docker, on an HPC with Singularity, or on a cloud platform using Docker.

To run on your local computer while using Docker to satisfy the dependencies, add the following flag to your nextflow run command: -with-docker ubuntu:20.04. See this documentation for more details.
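As a rough illustration of where the flag goes, the sketch below assembles a local Docker command and prints it for review. The value of PIPELINE is a placeholder (an assumption, not taken from this page); substitute the geneshot repository or local path you actually use, and add whatever workflow parameters your analysis requires.

```shell
# PIPELINE is a hypothetical placeholder: point it at your copy of geneshot.
PIPELINE="path/to/geneshot"

# -with-docker tells Nextflow to run each process inside a Docker container.
CMD="nextflow run $PIPELINE -with-docker ubuntu:20.04"

# Print the command so it can be reviewed before it is run.
echo "$CMD"
```

Once Docker and Nextflow are installed, the printed command can be pasted into a terminal and extended with your own parameters.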

To run on an HPC while using Singularity to satisfy the dependencies, configure your Nextflow environment (by constructing a personalized nextflow.config file) for your HPC scheduler using these instructions, and then add the additional settings for using Singularity.
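A minimal sketch of such a configuration file is shown here, written from the shell. It assumes a SLURM scheduler, which is only an example; replace 'slurm' with the executor that matches your cluster. Note that the command overwrites any nextflow.config already present in the current directory.

```shell
# Write a minimal nextflow.config for an HPC run with Singularity.
# 'slurm' is an assumption: use the executor for your own scheduler.
cat > nextflow.config <<'EOF'
// Submit each process as a job to the cluster scheduler
process.executor = 'slurm'

// Run each process inside a Singularity container
singularity.enabled = true

// Let Nextflow mount the host paths that the tasks need
singularity.autoMounts = true
EOF
```

Nextflow reads a nextflow.config file from the launch directory automatically, so no extra flag is needed to pick up these settings.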

Finally, to run on a cloud provider, you must configure Nextflow to use your account (either for AWS Batch or Google Cloud), and then reference that configuration file when you run geneshot. Nextflow will then automatically use the Docker containers that geneshot specifies for each step of the analysis.
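For AWS Batch, the configuration and its use might look roughly like the sketch below. Every value is a placeholder chosen for illustration: the file name aws.config, the queue name, the region, the S3 bucket, and the PIPELINE path are all assumptions that must be replaced with details from your own account.

```shell
# Write a hypothetical AWS Batch configuration; all values are placeholders.
cat > aws.config <<'EOF'
// Submit each process to AWS Batch
process.executor = 'awsbatch'

// Name of an existing AWS Batch job queue (placeholder)
process.queue = 'my-batch-queue'

// Region where the queue lives (placeholder)
aws.region = 'us-east-1'

// S3 location for intermediate files (placeholder)
workDir = 's3://my-bucket/work'
EOF

# PIPELINE is a hypothetical placeholder: point it at your copy of geneshot.
PIPELINE="path/to/geneshot"

# -c adds the settings in aws.config to the run.
CMD="nextflow run $PIPELINE -c aws.config"
echo "$CMD"
```

No -with-docker flag appears here because, as noted above, the containers specified by geneshot are used automatically on the cloud backend.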

Nextflow provides a powerful set of tools for running computation on whatever resources you have available. The only requirement for running geneshot is that you use the containers that geneshot specifies, with either Docker or Singularity. This approach provides the greatest flexibility while ensuring that the analytical results are consistent across computing platforms.