Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: bug with -O being used for gap opening penalty and samtools output directory #1450

Merged
merged 16 commits into from Jun 30, 2023

Conversation

huzuner
Copy link
Contributor

@huzuner huzuner commented Jun 20, 2023

Description

Gap penalty of minimap2 is done via "-O" option and the very same option also exists for samtools, but to indicate output directory. Using "-O" in the "extra" leads to the following error because of using snakemake utils for samtools (https://github.com/snakemake/snakemake-wrapper-utils/blob/f9cba17cb380dc7e6729a845ae37f26972c3d1dd/snakemake_wrapper_utils/samtools.py#LL76C23-L76C23):

You have specified output format (`-O/--output-fmt`) in params.extra; this is automatically inferred from output file extension.

The fastest and not over-engineered solution to me was to add the gap_opening variable to params, but I feel that it's also not very elegant especially when it comes to modifying the other penalty scores, but they can still be set in the extra.

QC

  • I confirm that:

For all wrappers added by this PR,

  • there is a test case which covers any introduced changes,
  • input: and output: file paths in the resulting rule can be changed arbitrarily,
  • either the wrapper can only use a single core, or the example rule contains a threads: x statement with x being a reasonable default,
  • rule names in the test case are in snake_case and somehow tell what the rule is about or match the tools purpose or name (e.g., map_reads for a step that maps reads),
  • all environment.yaml specifications follow the respective best practices,
  • wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in input: or output:),
  • all fields of the example rules in the Snakefiles and their entries are explained via comments (input:/output:/params: etc.),
  • stderr and/or stdout are logged correctly (log:), depending on the wrapped tool,
  • temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function tempfile.gettempdir() points to (see here; this also means that using any Python tempfile default behavior works),
  • the meta.yaml contains a link to the documentation of the respective tool or command,
  • Snakefiles pass the linting (snakemake --lint),
  • Snakefiles are formatted with snakefmt,
  • Python wrapper scripts are formatted with black.
  • Conda environments use a minimal amount of channels, in recommended ordering. E.g. for bioconda, use (conda-forge, bioconda, nodefaults, as conda-forge should have highest priority and defaults channels are usually not needed because most packages are in conda-forge nowadays).

Copy link
Contributor

@dlaehnemann dlaehnemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, looking at this in more detail, this makes sense. The get_samtools_opts is actually only used when samtools is used for sorting of the output. So this solution is probably the best and easiest.

One thing though. Could you document this further by:

  • adding an entry for this new params field in the meta.yaml under notes::
    notes: |
    * The `extra` param allows for additional arguments for minimap2.
    * The `sort` param allows to enable sorting (if output not PAF), and can be either 'none', 'queryname' or 'coordinate'.
    * The `sort_extra` allows for extra arguments for samtools/picard
  • adding this new params: gap_opening: "" specification to at least one of the example rules with sorted sam or bam output, so for example here:
    params:
    extra="-x map-pb", # optional
    sorting="coordinate", # optional: Enable sorting. Possible values: 'none', 'queryname' or 'coordinate'
    sort_extra="", # optional: extra arguments for samtools/picard

@huzuner
Copy link
Contributor Author

huzuner commented Jun 21, 2023

Thank you! How does it look now?

Copy link
Contributor

@dlaehnemann dlaehnemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now, just one minor formatting suggestion -- feel free to include it or not.

bio/minimap2/aligner/wrapper.py Outdated Show resolved Hide resolved
Co-authored-by: David Laehnemann <david.laehnemann@hhu.de>
Copy link
Contributor

@johanneskoester johanneskoester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to not have a special parameter. I think it is a bug that the wrapper utils complain here about -O, they should only check sort_extra.

@dlaehnemann
Copy link
Contributor

But then you will have to change all the samtools wrappers, as get_samtools_opts() just parses params: extra:, no other params: entry. So this would maybe have to become a params: samtools_extra wherever get_samtools_opts() is used. Or we cannot use get_samtools_opts() here at all. Or we have to generalize the function to have another argument which determines which params: field to parse. Maybe this last option (with the default set to "extra") is the least breaking option.

@johanneskoester
Copy link
Contributor

But then you will have to change all the samtools wrappers, as get_samtools_opts() just parses params: extra:, no other params: entry. So this would maybe have to become a params: samtools_extra wherever get_samtools_opts() is used. Or we cannot use get_samtools_opts() here at all. Or we have to generalize the function to have another argument which determines which params: field to parse. Maybe this last option (with the default set to "extra") is the least breaking option.

Exactly the latter was what I meant. Here is the PR: snakemake/snakemake-wrapper-utils#30

@johanneskoester
Copy link
Contributor

snakemake-wrapper-utils 0.6 have been released with the fix. Hence, this PR should just add a keyword argument param_name="sort_extra" to the call of get_samtools_opts.

@fgvieira
Copy link
Collaborator

This seems to be the same issue described in #624.
Do you think you could also fix the bwa-mem, bwa sampe, bwa samse, and bwa sam(pe/se) wrappers?

PS - maybe bwa sampe and bwa samse should actually be deleted, since there is already bwa sam(pe/se)...

@huzuner huzuner changed the title feat: add gap_penalty to params fix: bug with -O being used for gap opening penalty and samtools output directory Jun 22, 2023
@huzuner
Copy link
Contributor Author

huzuner commented Jun 23, 2023

The tests currently fail because the conda package for snakemake-wrappers-utils isn't updated yet.

@fgvieira
Copy link
Collaborator

Some other errors, related to memory.

@huzuner
Copy link
Contributor Author

huzuner commented Jun 27, 2023

I think removing

        sort_extra="--write-index",  # Extra args for samtools/picard.

from Snakefile_samtools of bwa/mem would fix the

You have specified writing index (`--write-index`) in `params.extra`; this is automatically infered from `idx` output file.

error but I don't know about the memory error yet.

@huzuner
Copy link
Contributor Author

huzuner commented Jun 28, 2023

We get the error:

Invalid maximum heap size: -Xmx0M
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

Regardless of the changes in the current PR, -Xmx0m returns 0 here related to a recent improvement in gathering resources in get_mem() in snakemake-wrapper-utils: https://github.com/snakemake/snakemake-wrapper-utils/blob/10ecf786c1bc3bcf6197eb8c6fb7f8615f0df2ee/snakemake_wrapper_utils/snakemake.py#L1

dlaehnemann added a commit to snakemake/snakemake-wrapper-utils that referenced this pull request Jun 29, 2023
otherwise we can get cases where the Java virtual machine tries to assign `-Xmx0M` and fails, see:
snakemake/snakemake-wrappers#1450 (comment)
dlaehnemann added a commit to snakemake/snakemake-wrapper-utils that referenced this pull request Jun 29, 2023
otherwise we can get cases where the Java virtual machine tries to assign `-Xmx0M` and fails, see:
snakemake/snakemake-wrappers#1450 (comment)
@dlaehnemann
Copy link
Contributor

I just merged and released a non-zero default value setting for get_mem over at snakemake-wrapper-utils:
snakemake/snakemake-wrapper-utils#32

Once the new version 0.6.1 of snakemake-wrapper-utils is up on bioconda, pointing to it should fix this issue here.

@dlaehnemann
Copy link
Contributor

Version 0.6.1 of snakemake-wrapper-utils is available via bioconda, now.

@dlaehnemann
Copy link
Contributor

Ah, and I merged in the latest master branch, so you should probably pull the latest changes before further code changes locally on your machine.

@huzuner
Copy link
Contributor Author

huzuner commented Jun 29, 2023

Thank you!
I now have updated the versions of snakemake-wrapper-utils for bwa/mem and bwa-mem2.

@johanneskoester johanneskoester merged commit c3f227f into snakemake:master Jun 30, 2023
6 checks passed
@huzuner huzuner deleted the minimap2-wrapper branch June 30, 2023 11:37
johanneskoester pushed a commit that referenced this pull request Jun 30, 2023
🤖 I have created a release \*beep\* \*boop\*
---
###
[2.1.1](https://www.github.com/snakemake/snakemake-wrappers/compare/v2.1.0...v2.1.1)
(2023-06-30)


### Bug Fixes

* bug with -O being used for gap opening penalty and samtools output
directory
([#1450](https://www.github.com/snakemake/snakemake-wrappers/issues/1450))
([c3f227f](https://www.github.com/snakemake/snakemake-wrappers/commit/c3f227ff64ee0d694587b50340424447ee847746))


### Performance Improvements

* autobump bio/bcftools/mpileup
([#1483](https://www.github.com/snakemake/snakemake-wrappers/issues/1483))
([438c39b](https://www.github.com/snakemake/snakemake-wrappers/commit/438c39bd15816b068193664bb37ec8dfd1e21b00))
* autobump bio/bcftools/view
([#1469](https://www.github.com/snakemake/snakemake-wrappers/issues/1469))
([1025cee](https://www.github.com/snakemake/snakemake-wrappers/commit/1025cee51c5fdb005610781f394e36f77d3a4fad))
* autobump bio/bedtools/coveragebed
([#1487](https://www.github.com/snakemake/snakemake-wrappers/issues/1487))
([b9b6123](https://www.github.com/snakemake/snakemake-wrappers/commit/b9b612307055dfc9ee3cf8a6aa6cf51419a1a9de))
* autobump bio/bismark/bam2nuc
([#1491](https://www.github.com/snakemake/snakemake-wrappers/issues/1491))
([b7fe91a](https://www.github.com/snakemake/snakemake-wrappers/commit/b7fe91a39563ad2839d835128a126ff4b7a16833))
* autobump bio/diamond/makedb
([#1495](https://www.github.com/snakemake/snakemake-wrappers/issues/1495))
([5885a0e](https://www.github.com/snakemake/snakemake-wrappers/commit/5885a0e6d00b1005d2ecef4679cfac0119404f10))
* autobump bio/freebayes
([#1478](https://www.github.com/snakemake/snakemake-wrappers/issues/1478))
([dd4c02b](https://www.github.com/snakemake/snakemake-wrappers/commit/dd4c02b81c920e69826c3a86e7ef9f9074b2e76f))
* autobump bio/gatk/applybqsr
([#1470](https://www.github.com/snakemake/snakemake-wrappers/issues/1470))
([7aca9d9](https://www.github.com/snakemake/snakemake-wrappers/commit/7aca9d992096602f4bdf6ec252aa2376c0bfc990))
* autobump bio/gatk/estimatelibrarycomplexity
([#1484](https://www.github.com/snakemake/snakemake-wrappers/issues/1484))
([b06df5f](https://www.github.com/snakemake/snakemake-wrappers/commit/b06df5fd5096097902ef26ffe07520a6728d6850))
* autobump bio/gatk/genomicsdbimport
([#1492](https://www.github.com/snakemake/snakemake-wrappers/issues/1492))
([826e722](https://www.github.com/snakemake/snakemake-wrappers/commit/826e722ee20720483b6a2a894482cde9268dfd18))
* autobump bio/gatk/intervallisttobed
([#1473](https://www.github.com/snakemake/snakemake-wrappers/issues/1473))
([214dd6b](https://www.github.com/snakemake/snakemake-wrappers/commit/214dd6bb41ae07d985d603f7f04e53139ac0ee50))
* autobump bio/gatk/learnreadorientationmodel
([#1494](https://www.github.com/snakemake/snakemake-wrappers/issues/1494))
([e2623fd](https://www.github.com/snakemake/snakemake-wrappers/commit/e2623fd5111df37291918ca6768279ae98f0b694))
* autobump bio/gatk/splitintervals
([#1481](https://www.github.com/snakemake/snakemake-wrappers/issues/1481))
([444dda4](https://www.github.com/snakemake/snakemake-wrappers/commit/444dda46364af8f2ed6baa42b6f0586b51b4808f))
* autobump bio/gatk/variantannotator
([#1489](https://www.github.com/snakemake/snakemake-wrappers/issues/1489))
([03b3fb5](https://www.github.com/snakemake/snakemake-wrappers/commit/03b3fb581e4bb062ce2128215a898c692650343f))
* autobump bio/gatk/variantrecalibrator
([#1496](https://www.github.com/snakemake/snakemake-wrappers/issues/1496))
([3748dec](https://www.github.com/snakemake/snakemake-wrappers/commit/3748dec7b916254d8777c16c37114b8d0a618476))
* autobump bio/minimap2/aligner
([#1477](https://www.github.com/snakemake/snakemake-wrappers/issues/1477))
([75c6265](https://www.github.com/snakemake/snakemake-wrappers/commit/75c6265df3f888acd849a1617164b51c42cd3a0d))
* autobump bio/picard/collectalignmentsummarymetrics
([#1475](https://www.github.com/snakemake/snakemake-wrappers/issues/1475))
([06e9c4b](https://www.github.com/snakemake/snakemake-wrappers/commit/06e9c4be45b3c6596e9c0fdde2eacb6eb2a83bdb))
* autobump bio/picard/mergevcfs
([#1465](https://www.github.com/snakemake/snakemake-wrappers/issues/1465))
([dbbc182](https://www.github.com/snakemake/snakemake-wrappers/commit/dbbc182bab3b3496a1cbbbf11816382a93fcc5f2))
* autobump bio/pretext/map
([#1493](https://www.github.com/snakemake/snakemake-wrappers/issues/1493))
([7fcae7f](https://www.github.com/snakemake/snakemake-wrappers/commit/7fcae7f60fbbdc7073af5e31b1bd1dbc2dacf96b))
* autobump bio/qualimap/bamqc
([#1471](https://www.github.com/snakemake/snakemake-wrappers/issues/1471))
([4fe5fdc](https://www.github.com/snakemake/snakemake-wrappers/commit/4fe5fdcb0a4aaa0229c52d50188402030ed656c3))
* autobump bio/samtools/depth
([#1486](https://www.github.com/snakemake/snakemake-wrappers/issues/1486))
([6d23daf](https://www.github.com/snakemake/snakemake-wrappers/commit/6d23daf52b1746f64ffb4f01f66202cd38bdb12f))
* autobump bio/samtools/stats
([#1468](https://www.github.com/snakemake/snakemake-wrappers/issues/1468))
([adc059b](https://www.github.com/snakemake/snakemake-wrappers/commit/adc059b4758b1374eb445d0652194b68a8826a14))
* autobump bio/snpsift/annotate
([#1480](https://www.github.com/snakemake/snakemake-wrappers/issues/1480))
([a60f667](https://www.github.com/snakemake/snakemake-wrappers/commit/a60f66793d05dd707b8834041cb69a0c325fe49b))
* autobump bio/snpsift/dbnsfp
([#1490](https://www.github.com/snakemake/snakemake-wrappers/issues/1490))
([8e8af31](https://www.github.com/snakemake/snakemake-wrappers/commit/8e8af3118fc6e8f6eeb00bd3412e780f775d36e1))
* autobump bio/varscan/mpileup2indel
([#1472](https://www.github.com/snakemake/snakemake-wrappers/issues/1472))
([c0a7c47](https://www.github.com/snakemake/snakemake-wrappers/commit/c0a7c475e70e978cbe46e67afa3b70bc16019dfb))
* autobump bio/varscan/mpileup2snp
([#1476](https://www.github.com/snakemake/snakemake-wrappers/issues/1476))
([a439e5f](https://www.github.com/snakemake/snakemake-wrappers/commit/a439e5f8b8a5362c28751079beb544964a0af7a4))
* autobump bio/varscan/somatic
([#1482](https://www.github.com/snakemake/snakemake-wrappers/issues/1482))
([18671ad](https://www.github.com/snakemake/snakemake-wrappers/commit/18671adc6821aad751a67845b0e453456bea6475))
* autobump bio/vg/kmers
([#1479](https://www.github.com/snakemake/snakemake-wrappers/issues/1479))
([aa00d14](https://www.github.com/snakemake/snakemake-wrappers/commit/aa00d1488989e64ce215672c86bbc94d169f6f33))
* autobump bio/xsv
([#1485](https://www.github.com/snakemake/snakemake-wrappers/issues/1485))
([a0baa27](https://www.github.com/snakemake/snakemake-wrappers/commit/a0baa27897c3e11d439a5bbb3dd138029d869afe))
* update datavzrd
([#1497](https://www.github.com/snakemake/snakemake-wrappers/issues/1497))
([143a9c8](https://www.github.com/snakemake/snakemake-wrappers/commit/143a9c8caadb5f4391b351dfc0786c896c751c3f))
---


This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants