fix: use new java memory overhead utils (#1150)
### Description

This updates all uses of `snakemake-wrapper-utils` to the latest version
`0.5.2`, mainly so that all wrappers using the `get_java_opts()` function
automatically reserve 20% of the reserved memory for JVM overhead. In
some cases, I have increased that default where the documentation
indicated that the overhead is usually larger. In other cases, I
switched the specification in the example Snakefile from `mem_gb` to the
canonical `mem_mb`, which is sure to be interpreted correctly for job
submissions on cluster systems.
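
As a rough illustration of the overhead reservation described above, the sketch below derives a JVM `-Xmx` setting from a rule's `mem_mb`. It is not the actual `get_java_opts()` code from `snakemake-wrapper-utils`: the helper name `xmx_from_mem_mb` and the rounding down to whole megabytes are assumptions made for this example. Only the 20% default and the parameter name `java_mem_overhead_factor` are taken from this commit.

```python
def xmx_from_mem_mb(mem_mb, java_mem_overhead_factor=0.2):
    """Return a `-Xmx` flag that leaves a share of `mem_mb` for JVM overhead.

    Hypothetical helper for illustration, not the snakemake-wrapper-utils code.
    """
    if not 0 <= java_mem_overhead_factor < 1:
        raise ValueError("overhead factor must be in [0, 1)")
    # Heap size is what remains after holding back the overhead share.
    heap_mb = int(mem_mb * (1 - java_mem_overhead_factor))
    return f"-Xmx{heap_mb}M"


print(xmx_from_mem_mb(1024))       # default: 20% held back
print(xmx_from_mem_mb(1024, 0.4))  # larger reserve, e.g. for native libraries
```

With a larger factor, less of the reserved memory goes to the Java heap, which leaves more room for memory that the JVM or native libraries allocate outside of `-Xmx`.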

### QC
<!-- Make sure that you can tick the boxes below. -->

* [x] I confirm that:

For all wrappers added by this PR, 

* there is a test case which covers any introduced changes,
* `input:` and `output:` file paths in the resulting rule can be changed
arbitrarily,
* either the wrapper can only use a single core, or the example rule
contains a `threads: x` statement with `x` being a reasonable default,
* rule names in the test case are in
[snake_case](https://en.wikipedia.org/wiki/Snake_case) and somehow tell
what the rule is about or match the tool's purpose or name (e.g.,
`map_reads` for a step that maps reads),
* all `environment.yaml` specifications follow [the respective best
practices](https://stackoverflow.com/a/64594513/2352071),
* wherever possible, command line arguments are inferred and set
automatically (e.g. based on file extensions in `input:` or `output:`),
* all fields of the example rules in the `Snakefile`s and their entries
are explained via comments (`input:`/`output:`/`params:` etc.),
* `stderr` and/or `stdout` are logged correctly (`log:`), depending on
the wrapped tool,
* temporary files are either written to a unique hidden folder in the
working directory, or (better) stored where the Python function
`tempfile.gettempdir()` points to (see
[here](https://docs.python.org/3/library/tempfile.html#tempfile.gettempdir);
this also means that using any Python `tempfile` default behavior
works),
* the `meta.yaml` contains a link to the documentation of the respective
tool or command,
* `Snakefile`s pass the linting (`snakemake --lint`),
* `Snakefile`s are formatted with
[snakefmt](https://github.com/snakemake/snakefmt),
* Python wrapper scripts are formatted with
[black](https://black.readthedocs.io).
* Conda environments use a minimal amount of channels, in recommended
ordering. E.g. for bioconda, use conda-forge, bioconda, nodefaults (as
conda-forge should have highest priority and defaults channels are
usually not needed because most packages are in conda-forge nowadays).
dlaehnemann committed Mar 23, 2023
1 parent cb1372b commit d15d5f5
Showing 110 changed files with 129 additions and 109 deletions.
2 changes: 1 addition & 1 deletion bio/bazam/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- bazam =1.0.1
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
8 changes: 6 additions & 2 deletions bio/bazam/test/Snakefile
Expand Up @@ -5,7 +5,9 @@ rule bazam_interleaved:
output:
reads="results/reads/{sample}.fastq.gz",
resources:
mem_mb=12000,
# suggestion according to:
# https://github.com/ssadedin/bazam/blob/c5988daf4cda4492e3d519c94f2f1e2022af5efe/README.md?plain=1#L46-L55
mem_mb=lambda wildcards, input: max([0.2 * input.size_mb, 200]),
log:
"logs/bazam/{sample}.log",
wrapper:
Expand All @@ -21,7 +23,9 @@ rule bazam_separated:
r1="results/reads/{sample}.r1.fastq.gz",
r2="results/reads/{sample}.r2.fastq.gz",
resources:
mem_mb=12000,
# suggestion according to:
# https://github.com/ssadedin/bazam/blob/c5988daf4cda4492e3d519c94f2f1e2022af5efe/README.md?plain=1#L46-L55
mem_mb=lambda wildcards, input: max([0.4 * input.size_mb, 200]),
log:
"logs/bazam/{sample}.log",
wrapper:
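
The `mem_mb` callables in the bazam rules above scale the memory request with the input size and never go below 200. A minimal standalone sketch of that pattern follows. The factory `scaled_mem_mb` and the `SimpleNamespace` stand-ins for Snakemake's `input` object are assumptions for illustration; in a real workflow, Snakemake passes the `wildcards` and `input` objects itself.

```python
from types import SimpleNamespace


def scaled_mem_mb(factor, floor_mb=200):
    """Build a resource callable: `factor` times the input size, at least `floor_mb`."""
    return lambda wildcards, input: max(factor * input.size_mb, floor_mb)


# Same factor as the bazam_interleaved rule in this diff.
mem_interleaved = scaled_mem_mb(0.2)

# Stand-ins for Snakemake's `input`, exposing only `size_mb`.
small = SimpleNamespace(size_mb=100)
large = SimpleNamespace(size_mb=50_000)

print(mem_interleaved(None, small))  # floor applies for small inputs
print(mem_interleaved(None, large))  # request grows with the input size
```

The floor keeps tiny test inputs from requesting an unrealistically small amount of memory.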
2 changes: 1 addition & 1 deletion bio/bbtools/bbduk/environment.yaml
Expand Up @@ -4,5 +4,5 @@ channels:
- nodefaults
dependencies:
- bbmap =39.01
- snakemake-wrapper-utils =0.5.0
- python =3.11.0
- snakemake-wrapper-utils =0.5.2
4 changes: 4 additions & 0 deletions bio/bbtools/bbduk/test/Snakefile
Expand Up @@ -11,6 +11,8 @@ rule bbduk_se:
"logs/bbduk/se/{sample}.log"
params:
extra = lambda w, input: "ref={},adapters,artifacts ktrim=r k=23 mink=11 hdist=1 tpe tbo trimpolygright=10 minlen=25 maxns=30 entropy=0.5 entropywindow=50 entropyk=5".format(input.adapters),
resources:
mem_mb=4000,
threads: 7
wrapper:
"master/bio/bbtools/bbduk"
Expand All @@ -29,6 +31,8 @@ rule bbduk_pe:
"logs/bbduk/pe/{sample}.log"
params:
extra = lambda w, input: "ref={},adapters,artifacts ktrim=r k=23 mink=11 hdist=1 tpe tbo trimpolygright=10 minlen=25 maxns=30 entropy=0.5 entropywindow=50 entropyk=5".format(input.adapters),
resources:
mem_mb=4000,
threads: 7
wrapper:
"master/bio/bbtools/bbduk"
2 changes: 1 addition & 1 deletion bio/bcftools/call/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- bcftools =1.16
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/bcftools/concat/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- bcftools =1.16
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/bcftools/index/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- bcftools =1.16
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/bcftools/merge/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- bcftools =1.16
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/bcftools/mpileup/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- bcftools =1.16
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/bcftools/norm/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- bcftools =1.16
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/bcftools/norm/test/Snakefile
Expand Up @@ -3,7 +3,7 @@ rule norm_vcf:
"{prefix}.bcf",
#ref="genome.fasta" # optional reference (will be translated into the -f option)
output:
"{prefix}.norm.vcf", # can also be .bcf, corresponding --output-type parameter is inferred automatically
"{prefix}.norm.vcf", # can also be .bcf, corresponding --output-type parameter is inferred automatically
log:
"{prefix}.norm.log",
params:
2 changes: 1 addition & 1 deletion bio/bcftools/reheader/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- bcftools =1.16
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/bcftools/sort/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- bcftools =1.16
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/bcftools/stats/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- bcftools =1.16
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/bcftools/view/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- bcftools =1.16
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/bellerophon/environment.yaml
Expand Up @@ -5,4 +5,4 @@ channels:
dependencies:
- bellerophon =1.0
- samtools =1.16.1
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/bowtie2/align/environment.yaml
Expand Up @@ -5,4 +5,4 @@ channels:
dependencies:
- bowtie2 =2.5.0
- samtools =1.16.1
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/bustools/sort/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- bustools =0.42.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/bwa-mem2/mem/environment.yaml
Expand Up @@ -6,4 +6,4 @@ dependencies:
- bwa-mem2 =2.2.1
- samtools =1.16.1
- picard-slim =3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/bwa/mem/environment.yaml
Expand Up @@ -6,4 +6,4 @@ dependencies:
- bwa =0.7.17
- samtools =1.16.1
- picard-slim =2.27.4
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/delly/environment.yaml
Expand Up @@ -5,4 +5,4 @@ channels:
dependencies:
- delly =1.1.6
- bcftools =1.16
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/dragmap/align/environment.yaml
Expand Up @@ -6,4 +6,4 @@ dependencies:
- dragmap =1.2
- samtools =1.14
- picard =2.26
- snakemake-wrapper-utils =0.3
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/fgbio/annotatebamwithumis/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- fgbio =2.1.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
5 changes: 4 additions & 1 deletion bio/fgbio/annotatebamwithumis/test/Snakefile
Expand Up @@ -6,7 +6,10 @@ rule AnnotateBam:
"mapped/{sample}.annotated.bam",
params: ""
resources:
mem_gb="4" # memory to be given to fgbio
# suggestion assuming unsorted input, so that memory should
# be proportional to input size:
# https://fulcrumgenomics.github.io/fgbio/tools/latest/AnnotateBamWithUmis.html
mem_mb=lambda wildcards, input: max([input.size_mb * 1.3, 200])
log:
"logs/fgbio/annotate_bam/{sample}.log",
wrapper:
2 changes: 1 addition & 1 deletion bio/gatk/applybqsr/environment.yaml
Expand Up @@ -4,5 +4,5 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
- samtools =1.16.1
2 changes: 1 addition & 1 deletion bio/gatk/applybqsrspark/environment.yaml
Expand Up @@ -4,5 +4,5 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
- samtools =1.16.1
2 changes: 1 addition & 1 deletion bio/gatk/applyvqsr/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/applyvqsr/test/Snakefile
Expand Up @@ -12,6 +12,6 @@ rule apply_vqsr:
mode="SNP", # set mode, must be either SNP, INDEL or BOTH
extra="", # optional
resources:
mem_mb=50,
mem_mb=1024,
wrapper:
"master/bio/gatk/applyvqsr"
2 changes: 1 addition & 1 deletion bio/gatk/baserecalibrator/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/baserecalibratorspark/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/cleansam/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/combinegvcfs/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/depthofcoverage/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/estimatelibrarycomplexity/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/filtermutectcalls/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/genomicsdbimport/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.2
- snakemake-wrapper-utils =0.5
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/genomicsdbimport/test/Snakefile
Expand Up @@ -11,6 +11,6 @@ rule genomics_db_import:
extra="", # optional
java_opts="", # optional
resources:
mem_mb=1024,
mem_mb=lambda wildcards, input: max([input.size_mb * 1.6, 200]),
wrapper:
"master/bio/gatk/genomicsdbimport"
5 changes: 4 additions & 1 deletion bio/gatk/genomicsdbimport/wrapper.py
Expand Up @@ -10,7 +10,10 @@


extra = snakemake.params.get("extra", "")
java_opts = get_java_opts(snakemake)
# uses Java native library TileDB, which requires a lot of memory outside
# of the `-Xmx` memory, so we reserve 40% instead of the default 20%. See:
# https://gatk.broadinstitute.org/hc/en-us/articles/9570326648475-GenomicsDBImportGenomicsDBImport
java_opts = get_java_opts(snakemake, java_mem_overhead_factor=0.4)

gvcfs = list(map("--variant {}".format, snakemake.input.gvcfs))

2 changes: 1 addition & 1 deletion bio/gatk/genotypegvcfs/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/getpileupsummaries/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/haplotypecaller/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/intervallisttobed/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/learnreadorientationmodel/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/leftalignandtrimvariants/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
4 changes: 3 additions & 1 deletion bio/gatk/markduplicatesspark/test/Snakefile
Expand Up @@ -13,7 +13,9 @@ rule mark_duplicates_spark:
#spark_master="", # optional
#spark_extra="", # optional
resources:
mem_mb=1024,
# Memory needs to be at least 471859200 for Spark, so 589824000 when
# accounting for default JVM overhead of 20%. We round up to 650M.
mem_mb=lambda wildcards, input: max([input.size_mb * 0.25, 650]),
threads: 8
wrapper:
"master/bio/gatk/markduplicatesspark"
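
The arithmetic behind the comment in the rule above can be checked with a few lines. This is only a worked calculation using the two numbers quoted in the comment; the constant names are made up for the example, and the minimum is assumed to be a byte count.

```python
SPARK_MIN_BYTES = 471_859_200  # minimum quoted in the Snakefile comment
JVM_OVERHEAD = 0.2             # default overhead share from this commit

# If 20% of the reservation is held back, the remaining 80% must cover the minimum.
required_bytes = SPARK_MIN_BYTES / (1 - JVM_OVERHEAD)
required_mib = required_bytes / 1024**2

print(round(required_bytes))  # total bytes the rule has to reserve
print(required_mib)           # the same amount in MiB
```

The result is well below 650 whether the unit is read as MB or MiB, so the floor in the `mem_mb` callable leaves some margin.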
2 changes: 1 addition & 1 deletion bio/gatk/mutect/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/printreadsspark/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/scatterintervalsbyns/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/selectvariants/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/splitintervals/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/splitncigarreads/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/variantannotator/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/varianteval/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/variantfiltration/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk/variantrecalibrator/environment.yaml
Expand Up @@ -4,6 +4,6 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
- google-cloud-sdk
- google-crc32c
2 changes: 1 addition & 1 deletion bio/gatk/variantstotable/environment.yaml
Expand Up @@ -4,4 +4,4 @@ channels:
- nodefaults
dependencies:
- gatk4 =4.3.0.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
2 changes: 1 addition & 1 deletion bio/gatk3/baserecalibrator/environment.yaml
Expand Up @@ -5,4 +5,4 @@ channels:
dependencies:
- gatk =3.8
- python =3.11.0
- snakemake-wrapper-utils =0.5.0
- snakemake-wrapper-utils =0.5.2
