feat: Wrapper and meta-wrapper for rbt consensus reads #544

Merged
merged 20 commits into from Aug 29, 2022

FelixMoelder
Contributor

@FelixMoelder FelixMoelder commented Aug 18, 2022

Description

This PR adds two wrappers: first, a wrapper for the rust-bio-tools consensus read calculation, and second, a meta-wrapper that adds best-practice post-processing steps for consensus read calculation.

QC

  • I confirm that:

For all wrappers added by this PR,

  • there is a test case which covers any introduced changes,
  • input: and output: file paths in the resulting rule can be changed arbitrarily,
  • either the wrapper can only use a single core, or the example rule contains a threads: x statement with x being a reasonable default,
  • rule names in the test case are in snake_case and somehow tell what the rule is about or match the tool's purpose or name (e.g., map_reads for a step that maps reads),
  • all environment.yaml specifications follow the respective best practices,
  • wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in input: or output:),
  • all fields of the example rules in the Snakefiles and their entries are explained via comments (input:/output:/params: etc.),
  • stderr and/or stdout are logged correctly (log:), depending on the wrapped tool,
  • temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function tempfile.gettempdir() points to (see here; this also means that using any Python tempfile default behavior works),
  • the meta.yaml contains a link to the documentation of the respective tool or command,
  • Snakefiles pass the linting (snakemake --lint),
  • Snakefiles are formatted with snakefmt,
  • Python wrapper scripts are formatted with black.
  • Conda environments use a minimal number of channels, in the recommended ordering. E.g. for bioconda, use conda-forge, bioconda, nodefaults (conda-forge should have the highest priority, and the defaults channels are usually not needed because most packages are in conda-forge nowadays).
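
The temporary-file item in the checklist above can be illustrated with a minimal Python sketch. It is not taken from the wrappers in this PR; the file name `intermediate.txt` is hypothetical. It only shows that `tempfile.TemporaryDirectory()` called without a `dir` argument creates its folder under `tempfile.gettempdir()` and removes it on exit.

```python
import os
import tempfile

# With no `dir` argument, the temporary directory is created under
# tempfile.gettempdir(), which honours TMPDIR/TEMP/TMP if they are set.
with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical intermediate file a wrapper script might write.
    scratch = os.path.join(tmpdir, "intermediate.txt")
    with open(scratch, "w") as handle:
        handle.write("scratch data\n")

    # The parent of the new directory is the default temp location.
    created_under_default = os.path.dirname(
        os.path.realpath(tmpdir)
    ) == os.path.realpath(tempfile.gettempdir())
    existed_inside = os.path.exists(scratch)

# Leaving the `with` block deletes the directory and its contents.
cleaned_up = not os.path.exists(tmpdir)
```

Relying on this default means a user can redirect all scratch data by setting `TMPDIR`, without editing the rule.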

Comment on lines 1 to 14
rule samtools_sort:
    input:
        "{path}/{sample}.bam",
    output:
        temp("{path}/{sample}.sorted.bam"),
    params:
        extra="-m 4G",
        tmp_dir="/tmp/",
    log:
        "{path}_{sample}.log",
    # Samtools takes additional threads through its option -@
    threads: 8  # This value - 1 will be sent to -@.
    wrapper:
        "master/bio/samtools/sort"
Contributor

I would not add such a generic rule to a meta-wrapper, as it can interfere with other sort rules in a workflow that uses the meta-wrapper. Instead, make the rule very specific to the particular case you need here.

Contributor Author

I changed it to a fixed directory and gave the rule an appropriate name.
Previously, the rule was used twice: first to ensure that the initial input of the workflow gets sorted, and again later within the meta-wrapper's workflow.
Now the meta-wrapper expects the initial BAM file to be presorted, since accepting an unsorted BAM file would require the sorting rule to be generic.
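
The concern discussed in this thread can be sketched in plain Python. This is a simplified stand-in for Snakemake's output matching, not its actual implementation: it assumes each `{wildcard}` behaves like the regex `.+`, and the directory `results/consensus/` and the target paths are hypothetical.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Convert a Snakemake-style output pattern into a regex (simplified)."""
    # re.split with a capture group alternates: literal, name, literal, ...
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            regex += re.escape(part)      # literal text between wildcards
        else:
            regex += f"(?P<{part}>.+)"    # assumed default wildcard regex
    return re.compile(regex)


# Output pattern of the generic rule from the review comment.
generic = pattern_to_regex("{path}/{sample}.sorted.bam")
# Hypothetical fixed-directory variant of the same rule.
specific = pattern_to_regex("results/consensus/{sample}.sorted.bam")

# Hypothetical targets requested somewhere in a larger workflow.
targets = [
    "results/consensus/A.sorted.bam",
    "results/mapped/A.sorted.bam",
    "other_pipeline/B.sorted.bam",
]

generic_hits = [t for t in targets if generic.fullmatch(t)]
specific_hits = [t for t in targets if specific.fullmatch(t)]

print(generic_hits)   # every target matches the generic pattern
print(specific_hits)  # only the file in the fixed directory matches
```

Because the generic pattern matches every `*.sorted.bam` target, it would compete with any other sort rule in a workflow that includes the meta-wrapper, whereas the fixed-directory pattern only claims its own files.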

@johanneskoester johanneskoester merged commit 6736211 into master Aug 29, 2022
@johanneskoester johanneskoester deleted the consensus_meta branch August 29, 2022 16:06