Commit
feat: fix threads and new IO options in Bowtie2 (#1324)
### Description

Fixes the following issues in the Bowtie2 wrapper:

* For `x` threads reserved in the Snakemake rule, `2x` threads were used (once in Bowtie2, and once again in the Samtools pipe). This is now fixed: for `x` reserved threads, `x-1` are used by Bowtie2 and 1 by the Samtools pipe.
* Optional metrics and alignment files are now available through the `output` interface.
* The Samtools-compressed BAM/CRAM file now explicitly keeps its header.
* Samtools indexing and compression are now documented in `meta.yaml` and tested.
* The input file format is now inferred automatically whenever possible.

Open questions:

* `snakemake-wrappers-utils.samtools.get_samtools_opts()` tests the `params.extra` parameters. How can I make it test `params.extra_samtools` instead? That would let users set extra parameters for `samtools view`.
* Bowtie2 can concatenate multiple input files, as long as they are separated by commas. This could be implemented in the wrapper. However, when the user provides exactly two files in `input.sample`, I can't find a way to guess whether they should be concatenated through the `-U` option or treated as paired reads through `-1` and `-2`. Every solution I came up with breaks backward compatibility.

Both open questions can be ignored, and the wrapper would keep its current behaviour. They are neither breaking nor blocking.

### QC

* [X] I confirm that:

For all wrappers added by this PR,

* there is a test case which covers any introduced changes,
* `input:` and `output:` file paths in the resulting rule can be changed arbitrarily,
* either the wrapper can only use a single core, or the example rule contains a `threads: x` statement with `x` being a reasonable default,
* rule names in the test case are in [snake_case](https://en.wikipedia.org/wiki/Snake_case) and somehow tell what the rule is about or match the tool's purpose or name (e.g., `map_reads` for a step that maps reads),
* all `environment.yaml` specifications follow [the respective best practices](https://stackoverflow.com/a/64594513/2352071),
* wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in `input:` or `output:`),
* all fields of the example rules in the `Snakefile`s and their entries are explained via comments (`input:`/`output:`/`params:` etc.),
* `stderr` and/or `stdout` are logged correctly (`log:`), depending on the wrapped tool,
* temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function `tempfile.gettempdir()` points to (see [here](https://docs.python.org/3/library/tempfile.html#tempfile.gettempdir); this also means that using any Python `tempfile` default behavior works),
* the `meta.yaml` contains a link to the documentation of the respective tool or command,
* `Snakefile`s pass the linting (`snakemake --lint`),
* `Snakefile`s are formatted with [snakefmt](https://github.com/snakemake/snakefmt),
* Python wrapper scripts are formatted with [black](https://black.readthedocs.io),
* Conda environments use a minimal number of channels, in the recommended order. E.g. for bioconda, use (conda-forge, bioconda, nodefaults), as conda-forge should have the highest priority and the defaults channels are usually not needed because most packages are in conda-forge nowadays.
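
To illustrate two of the fixes listed under Description, here is a minimal Python sketch of the thread split and of extension-based input format inference. It is not the wrapper's actual code: the helper names `split_threads` and `infer_format_flag`, the extension table, and the behaviour for a single reserved thread (Bowtie2 still gets one thread) are all assumptions made for illustration. The only Bowtie2 flags used are `-q` (FASTQ input) and `-f` (FASTA input).

```python
from pathlib import Path
from typing import Optional, Tuple


def split_threads(reserved: int) -> Tuple[int, int]:
    """Split a rule's reserved threads into (bowtie2, samtools).

    Hypothetical helper: one thread goes to the Samtools pipe and the
    remaining x-1 go to Bowtie2, as described in the commit message.
    """
    if reserved < 1:
        raise ValueError("at least one thread must be reserved")
    samtools_threads = 1
    # Assumption: with a single reserved thread, Bowtie2 still gets one.
    bowtie2_threads = max(1, reserved - samtools_threads)
    return bowtie2_threads, samtools_threads


# Assumed extension table; Bowtie2 reads FASTQ with -q and FASTA with -f.
_FORMAT_FLAGS = {
    ".fq": "-q",
    ".fastq": "-q",
    ".fa": "-f",
    ".fasta": "-f",
    ".fna": "-f",
}
_COMPRESSION_SUFFIXES = {".gz", ".bz2"}


def infer_format_flag(path: str) -> Optional[str]:
    """Return the Bowtie2 format flag for a path, or None if unknown."""
    suffixes = [s.lower() for s in Path(path).suffixes]
    # Ignore trailing compression suffixes such as ".gz".
    while suffixes and suffixes[-1] in _COMPRESSION_SUFFIXES:
        suffixes.pop()
    if not suffixes:
        return None
    return _FORMAT_FLAGS.get(suffixes[-1])
```

With 8 reserved threads, `split_threads(8)` gives 7 threads to Bowtie2 and 1 to Samtools. Returning `None` from `infer_format_flag` for an unrecognised extension leaves the choice to the user, which matches the "when possible" wording above.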
---------

Co-authored-by: tdayris <tdayris@gustaveroussy.fr>
Co-authored-by: tdayris <thibault.dayris@gustaveroussy.fr>
Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: snakedeploy-bot[bot] <115615832+snakedeploy-bot[bot]@users.noreply.github.com>
Co-authored-by: Felix Mölder <felix.moelder@uni-due.de>
Co-authored-by: Christopher Schröder <christopher.schroeder@tu-dortmund.de>