eanbit-rt/Workflows_and_package_management

Workflows and Package Management

Reproducibility and package management techniques: workflow languages (CWL and Snakemake) and package management with Conda. This course introduces approaches to package management and shows how to create reproducible workflows, also called pipelines.

Competencies

This session seeks to impart the following competencies:

  1. Knowledge and skills: Bioinformatics tools and their usage.
  2. Knowledge and skills: Command-line and scripting-based computing skills appropriate to the discipline.

Learning Outcomes

By the end of this session, and the projects that follow, the learner should be able to:

  1. Select the best workflow and package managers based on the task at hand
  2. Implement a genomic pipeline in at least one workflow manager
  3. Set up a reproducible analysis environment

Outline

  • Introduce the high-level concept of workflows and high-throughput data analysis
  • Hands-on activities for setting up the packages
  • Introduce package management and how we can use conda to increase reproducibility with workflows
  • Introduce the theory of workflows, with emphasis on one language (for example, Snakemake)
  • Hands-on activities of developing workflows
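
To make the workflow concept in the outline concrete, here is a minimal Python sketch of the idea behind file-based workflow managers: each step declares its input and output files, and a step runs only when its output is missing or older than its inputs. This is a toy illustration, not Snakemake's API. The names `Rule`, `is_stale`, `build`, and `demo`, along with the file names, are hypothetical and chosen only for this example.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """One workflow step: input files, one output file, and an action."""
    name: str
    inputs: List[str]
    output: str
    action: Callable[[List[str], str], None]


def is_stale(rule: Rule) -> bool:
    """True if the output is missing or older than any of its inputs."""
    if not os.path.exists(rule.output):
        return True
    out_mtime = os.path.getmtime(rule.output)
    return any(os.path.getmtime(path) > out_mtime for path in rule.inputs)


def build(target: str, rules: Dict[str, Rule], log: List[str]) -> None:
    """Build `target`, first building inputs that are themselves rule outputs.

    Toy version: no cycle detection and no parallel execution.
    """
    rule = rules.get(target)
    if rule is None:
        # Not produced by any rule, so it must already exist as raw data.
        if not os.path.exists(target):
            raise FileNotFoundError(f"no rule produces {target}")
        return
    for path in rule.inputs:
        build(path, rules, log)
    if is_stale(rule):
        rule.action(rule.inputs, rule.output)
        log.append(rule.name)


def to_upper(inputs: List[str], output: str) -> None:
    """Example action: copy the first input to the output in upper case."""
    with open(inputs[0]) as src, open(output, "w") as dst:
        dst.write(src.read().upper())


def count_lines(inputs: List[str], output: str) -> None:
    """Example action: write the number of lines in the first input."""
    with open(inputs[0]) as src, open(output, "w") as dst:
        dst.write(str(sum(1 for _ in src)))


def demo() -> List[List[str]]:
    """Run a two-step workflow twice and return which rules ran each time."""
    with tempfile.TemporaryDirectory() as workdir:
        raw = os.path.join(workdir, "reads.txt")
        upper = os.path.join(workdir, "reads.upper.txt")
        count = os.path.join(workdir, "reads.count.txt")
        with open(raw, "w") as handle:
            handle.write("acgt\nttga\n")
        rules = {
            upper: Rule("uppercase", [raw], upper, to_upper),
            count: Rule("count", [upper], count, count_lines),
        }
        first_run: List[str] = []
        second_run: List[str] = []
        build(count, rules, first_run)
        build(count, rules, second_run)
        return [first_run, second_run]


if __name__ == "__main__":
    print(demo())
```

Requesting the final file is enough: the sketch walks back through the declared inputs and runs the upstream step first. On the second request nothing runs, because every output is already up to date. Workflow managers such as Snakemake build on the same dependency idea and add features such as wildcards, per-rule software environments, and cluster execution.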

Slides

  1. Using Bioconda to streamline software installation for bioinformatics
  2. Workflows and Pipelines
  3. Package management | resource management | reproducibility

Tutorials

  1. Package Management with conda
  2. Workflow with Snakemake: a quick introduction, after which we'll dive deeper using the Reproducible Research tutorial. See also this tutorial.
  3. Nextflow and Singularity tutorial
  4. Docker Tutorial
  5. Common Workflow Language tutorial. We will not cover this, but we provide links to useful tutorials for you to explore and learn further. Also see this and this (https://andrewjesaitis.com/2017/02/common-workflow-language---a-tutorial-on-making-bioinformatics-repeatable/) walkthroughs.
  6. Resource management on HPC

Reading resources

Some resources and articles you can make use of in this course:

  1. Awesome pipelines: A curated list of pipelines and workflow languages

  2. Existing Workflow systems: Computational Data Analysis Workflow Systems

  3. Papers:
