Skip to content
This repository has been archived by the owner on Nov 17, 2021. It is now read-only.

dnanexus-archive/viral-ngs

 
 

Repository files navigation

dnanexus/viral-ngs

The pipelines are available for use on DNAnexus. See the wiki for instructions on how to run them. The following information is for developers interested in peeking under the hood or modifying them.

Here you'll find the source code for applets implementing individual pipeline stages, and python scripts in the root directory to build the applets and instantiate DNAnexus workflows using them. You'll need the DNAnexus SDK installed and set up to run these scripts.

Continuous integration

Build Status

Travis CI tests is automatically triggered upon changes to the repo. The .travis.yml uses a secure environment variable to encode a DNAnexus auth token providing access to the bi-viral-ngs CI project.

Travis builds the applets & workflows and then executes the workflows on small test datasets (by executing build_workflows.py --run-tests). Supporting materials for these tests are stored in the bi-viral-ngs CI project, which is public on DNAnexus.

Resources tarball

To minimize wheel reinvention, most of the applets directly use tools and wrapper scripts maintained in the existing Broad codebase packaged in an ACI exported from Docker Hub.

A tarball containing the ACI is built by the viral-ngs-builder applet in the DNAnexus execution environment. The build_resources_tarball.py helper script runs this applet and deposits the resources tarball in the bi-viral-ngs CI:/resources_tarball folder. The file ID of the built tarball can be provided to the workflow builder scripts (or provided directly as defaults in the applets' dxapp.json file, see below).

To incorporate a new image from Docker Hub:

  1. ./build_resources_tarball.py [:TAG|@DIGEST] to launch viral-ngs-builder in the bi-viral-ngs CI DNAnexus project.

  2. Upon successful completion of the viral-ngs-builder job, a new resource is generated and stored in the bi-viral-ngs CI:/resources_tarball folder. Take note of its file ID.

  3. On the dnanexus branch (or wip branches from it), find the resources input in viral-ngs-human-depletion/dxapp.json, viral-ngs-fasta-fetcher/dxapp.json, viral-ngs-demux-wrapper/dxapp.json and viral-ngs-taxonomic-profiling/dxapp.json and change their default to the new tarball's file ID. (All the other workflow stages in the assembly workflow take the cue from default setting in viral-ngs-human-depletion.) find . -name dxapp.json | xargs -i sed -i s/OLD_FILE_ID/NEW_FILE_ID/ {}

  4. Ensure that updating the resource tarball did not break things by checking the Travis CI results of the updated branch.

Software licensing issues

The workflows use two semi-proprietary software packages: GATK and Novoalign. Published workflow versions require the user to upload GATK tarball and Novoalign license and provide them as workflow inputs.

Because the bi-viral-ngs CI project is public, the Travis workflow tests use copies of these staged in a separate, private project, which the auth token is also empowered to use.

Packages

No packages published

Languages

  • Python 50.1%
  • Shell 48.8%
  • R 1.1%