Commit
This PR adds [`pyTMB`](https://github.com/bioinfo-pf-curie/TMB) to the list of available wrappers.

### QC

* [X] I confirm that:

For all wrappers added by this PR,

* there is a test case which covers any introduced changes,
* `input:` and `output:` file paths in the resulting rule can be changed arbitrarily,
* either the wrapper can only use a single core, or the example rule contains a `threads: x` statement with `x` being a reasonable default,
* rule names in the test case are in [snake_case](https://en.wikipedia.org/wiki/Snake_case) and somehow tell what the rule is about or match the tool's purpose or name (e.g., `map_reads` for a step that maps reads),
* all `environment.yaml` specifications follow [the respective best practices](https://stackoverflow.com/a/64594513/2352071),
* the `environment.yaml` pinning has been updated by running `snakedeploy pin-conda-envs environment.yaml` on a linux machine,
* wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in `input:` or `output:`),
* all fields of the example rules in the `Snakefile`s and their entries are explained via comments (`input:`/`output:`/`params:` etc.),
* `stderr` and/or `stdout` are logged correctly (`log:`), depending on the wrapped tool,
* temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function `tempfile.gettempdir()` points to (see [here](https://docs.python.org/3/library/tempfile.html#tempfile.gettempdir); this also means that using any Python `tempfile` default behavior works),
* the `meta.yaml` contains a link to the documentation of the respective tool or command,
* `Snakefile`s pass the linting (`snakemake --lint`),
* `Snakefile`s are formatted with [snakefmt](https://github.com/snakemake/snakefmt),
* Python wrapper scripts are formatted with [black](https://black.readthedocs.io).
* Conda environments use a minimal number of channels, in the recommended order. E.g. for bioconda, use conda-forge, bioconda, nodefaults (conda-forge should have the highest priority, and the defaults channels are usually not needed because most packages are in conda-forge nowadays).

---------

Co-authored-by: tdayris <tdayris@gustaveroussy.fr>
Co-authored-by: tdayris <thibault.dayris@gustaveroussy.fr>
Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: snakedeploy-bot[bot] <115615832+snakedeploy-bot[bot]@users.noreply.github.com>
Co-authored-by: Felix Mölder <felix.moelder@uni-due.de>
Co-authored-by: Christopher Schröder <christopher.schroeder@tu-dortmund.de>
1 parent af63b5b, commit 0ffbed9
Showing 10 changed files with 312 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
# This file may be used to create an environment using: | ||
# $ conda create --name <env> --file <this file> | ||
# platform: linux-64 | ||
@EXPLICIT | ||
https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2#d7c89558ba9fa0495403155b64376d81 | ||
https://conda.anaconda.org/conda-forge/linux-64/ca-certificates-2024.2.2-hbcca054_0.conda#2f4327a1cbe7f022401b236e915a5fef | ||
https://conda.anaconda.org/conda-forge/linux-64/ld_impl_linux-64-2.40-h41732ed_0.conda#7aca3059a1729aa76c597603f10b0dd3 | ||
https://conda.anaconda.org/conda-forge/linux-64/libstdcxx-ng-13.2.0-h7e041cc_5.conda#f6f6600d18a4047b54f803cf708b868a | ||
https://conda.anaconda.org/conda-forge/linux-64/python_abi-3.10-4_cp310.conda#26322ec5d7712c3ded99dd656142b8ce | ||
https://conda.anaconda.org/conda-forge/noarch/tzdata-2024a-h0c530f3_0.conda#161081fc7cec0bfda0d86d7cb595f8d8 | ||
https://conda.anaconda.org/conda-forge/linux-64/libgomp-13.2.0-h807b86a_5.conda#d211c42b9ce49aee3734fdc828731689 | ||
https://conda.anaconda.org/conda-forge/linux-64/_openmp_mutex-4.5-2_gnu.tar.bz2#73aaf86a425cc6e73fcf236a5a46396d | ||
https://conda.anaconda.org/conda-forge/linux-64/libgcc-ng-13.2.0-h807b86a_5.conda#d4ff227c46917d3b4565302a2bbb276b | ||
https://conda.anaconda.org/conda-forge/linux-64/bzip2-1.0.8-hd590300_5.conda#69b8b6202a07720f448be700e300ccf4 | ||
https://conda.anaconda.org/conda-forge/linux-64/c-ares-1.27.0-hd590300_0.conda#f6afff0e9ee08d2f1b897881a4f38cdb | ||
https://conda.anaconda.org/conda-forge/linux-64/keyutils-1.6.1-h166bdaf_0.tar.bz2#30186d27e2c9fa62b45fb1476b7200e3 | ||
https://conda.anaconda.org/conda-forge/linux-64/libdeflate-1.18-h0b41bf4_0.conda#6aa9c9de5542ecb07fdda9ca626252d8 | ||
https://conda.anaconda.org/conda-forge/linux-64/libev-4.33-hd590300_2.conda#172bf1cd1ff8629f2b1179945ed45055 | ||
https://conda.anaconda.org/conda-forge/linux-64/libffi-3.4.2-h7f98852_5.tar.bz2#d645c6d2ac96843a2bfaccd2d62b3ac3 | ||
https://conda.anaconda.org/conda-forge/linux-64/libgfortran5-13.2.0-ha4646dd_5.conda#7a6bd7a12a4bd359e2afe6c0fa1acace | ||
https://conda.anaconda.org/conda-forge/linux-64/libnsl-2.0.1-hd590300_0.conda#30fd6e37fe21f86f4bd26d6ee73eeec7 | ||
https://conda.anaconda.org/conda-forge/linux-64/libuuid-2.38.1-h0b41bf4_0.conda#40b61aab5c7ba9ff276c41cfffe6b80b | ||
https://conda.anaconda.org/conda-forge/linux-64/libxcrypt-4.4.36-hd590300_1.conda#5aa797f8787fe7a17d1b0821485b5adc | ||
https://conda.anaconda.org/conda-forge/linux-64/libzlib-1.2.13-hd590300_5.conda#f36c115f1ee199da648e0597ec2047ad | ||
https://conda.anaconda.org/conda-forge/linux-64/ncurses-6.4-h59595ed_2.conda#7dbaa197d7ba6032caf7ae7f32c1efa0 | ||
https://conda.anaconda.org/conda-forge/linux-64/openssl-3.2.1-hd590300_0.conda#51a753e64a3027bd7e23a189b1f6e91e | ||
https://conda.anaconda.org/conda-forge/linux-64/xz-5.2.6-h166bdaf_0.tar.bz2#2161070d867d1b1204ea749c8eec4ef0 | ||
https://conda.anaconda.org/conda-forge/linux-64/yaml-0.2.5-h7f98852_2.tar.bz2#4cb3ad778ec2d5a7acbdf254eb1c42ae | ||
https://conda.anaconda.org/bioconda/linux-64/bedtools-2.31.1-hf5e1c6e_1.tar.bz2#2066287e826a2ff469fa0b62b24b6059 | ||
https://conda.anaconda.org/conda-forge/linux-64/libedit-3.1.20191231-he28a2e2_2.tar.bz2#4d331e44109e3f0e19b4cb8f9b82f3e1 | ||
https://conda.anaconda.org/conda-forge/linux-64/libgfortran-ng-13.2.0-h69a702a_5.conda#e73e9cfd1191783392131e6238bdb3e9 | ||
https://conda.anaconda.org/conda-forge/linux-64/libnghttp2-1.58.0-h47da74e_1.conda#700ac6ea6d53d5510591c4344d5c989a | ||
https://conda.anaconda.org/conda-forge/linux-64/libsqlite-3.45.1-h2797004_0.conda#fc4ccadfbf6d4784de88c41704792562 | ||
https://conda.anaconda.org/conda-forge/linux-64/libssh2-1.11.0-h0841786_0.conda#1f5a58e686b13bcfde88b93f547d23fe | ||
https://conda.anaconda.org/conda-forge/linux-64/readline-8.2-h8228510_1.conda#47d31b792659ce70f470b5c82fdfb7a4 | ||
https://conda.anaconda.org/conda-forge/linux-64/tk-8.6.13-noxft_h4845f30_101.conda#d453b98d9c83e71da0741bb0ff4d76bc | ||
https://conda.anaconda.org/conda-forge/linux-64/zstd-1.5.5-hfc55251_0.conda#04b88013080254850d6c01ed54810589 | ||
https://conda.anaconda.org/conda-forge/linux-64/krb5-1.21.2-h659d440_0.conda#cd95826dbd331ed1be26bdf401432844 | ||
https://conda.anaconda.org/conda-forge/linux-64/libopenblas-0.3.26-pthreads_h413a1c8_0.conda#760ae35415f5ba8b15d09df5afe8b23a | ||
https://conda.anaconda.org/conda-forge/linux-64/python-3.10.13-hd12c33a_1_cpython.conda#ed38140af93f81319ebc472fbcf16cca | ||
https://conda.anaconda.org/conda-forge/noarch/click-8.1.7-unix_pyh707e725_0.conda#f3ad426304898027fc619827ff428eca | ||
https://conda.anaconda.org/conda-forge/noarch/humanfriendly-10.0-pyhd8ed1ab_6.conda#2ed1fe4b9079da97c44cfe9c2e5078fd | ||
https://conda.anaconda.org/conda-forge/linux-64/libblas-3.9.0-21_linux64_openblas.conda#0ac9f44fc096772b0aa092119b00c3ca | ||
https://conda.anaconda.org/conda-forge/linux-64/libcurl-8.5.0-hca28451_0.conda#7144d5a828e2cae218e0e3c98d8a0aeb | ||
https://conda.anaconda.org/conda-forge/noarch/python-tzdata-2024.1-pyhd8ed1ab_0.conda#98206ea9954216ee7540f0c773f2104d | ||
https://conda.anaconda.org/conda-forge/noarch/pytz-2024.1-pyhd8ed1ab_0.conda#3eeeeb9e4827ace8c0c1419c85d590ad | ||
https://conda.anaconda.org/conda-forge/linux-64/pyyaml-6.0.1-py310h2372a71_1.conda#bb010e368de4940771368bc3dc4c63e7 | ||
https://conda.anaconda.org/conda-forge/noarch/setuptools-69.1.1-pyhd8ed1ab_0.conda#576de899521b7d43674ba3ef6eae9142 | ||
https://conda.anaconda.org/conda-forge/noarch/six-1.16.0-pyh6c4a22f_0.tar.bz2#e5f25f8dbc060e9a8d912e432202afc2 | ||
https://conda.anaconda.org/conda-forge/noarch/wheel-0.42.0-pyhd8ed1ab_0.conda#1cdea58981c5cbc17b51973bcaddcea7 | ||
https://conda.anaconda.org/conda-forge/noarch/coloredlogs-15.0.1-pyhd8ed1ab_3.tar.bz2#7b4fc18b7f66382257c45424eaf81935 | ||
https://conda.anaconda.org/bioconda/linux-64/htslib-1.19.1-h81da01d_2.tar.bz2#ad57eedd99d6722b2f00a8f7d0d71e2a | ||
https://conda.anaconda.org/conda-forge/linux-64/libcblas-3.9.0-21_linux64_openblas.conda#4a3816d06451c4946e2db26b86472cb6 | ||
https://conda.anaconda.org/conda-forge/linux-64/liblapack-3.9.0-21_linux64_openblas.conda#1a42f305615c3867684e049e85927531 | ||
https://conda.anaconda.org/conda-forge/noarch/pip-24.0-pyhd8ed1ab_0.conda#f586ac1e56c8638b64f9c8122a7b8a67 | ||
https://conda.anaconda.org/bioconda/linux-64/pysam-0.22.0-py310h41dec4a_1.tar.bz2#19fdb9301a6debbb7fe9836670e3feb7 | ||
https://conda.anaconda.org/conda-forge/noarch/python-dateutil-2.9.0-pyhd8ed1ab_0.conda#2cf4264fffb9e6eff6031c5b6884d61c | ||
https://conda.anaconda.org/bioconda/linux-64/mosdepth-0.3.6-hd299d5a_0.tar.bz2#d600959c8132348d3a6994e2aa3a2134 | ||
https://conda.anaconda.org/conda-forge/linux-64/numpy-1.26.4-py310hb13e2d6_0.conda#6593de64c935768b6bad3e19b3e978be | ||
https://conda.anaconda.org/bioconda/linux-64/cyvcf2-0.30.28-py310hcf1fb4a_0.tar.bz2#232a76b24d3c3b44aa4e88d84a73872e | ||
https://conda.anaconda.org/conda-forge/linux-64/pandas-2.2.1-py310hcc13569_0.conda#cf5d315e3601a6a2931f63aa9a84dc40 | ||
https://conda.anaconda.org/bioconda/linux-64/pybedtools-0.9.1-py310h2b6aa90_0.tar.bz2#e561264a083c7b5a2b2290008460c9dd | ||
https://conda.anaconda.org/bioconda/noarch/tmb-1.3.0-pyh5e36f6f_0.tar.bz2#ef5e806d5a3f48d4568870df9c6ae7e1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
channels: | ||
- conda-forge | ||
- bioconda | ||
- nodefaults | ||
dependencies: | ||
- tmb=1.3.0 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
name: "pyTMB.py" | ||
description: Calculate a Tumor Mutational Burden (TMB) score from a VCF file | ||
url: "https://github.com/bioinfo-pf-curie/TMB?tab=readme-ov-file#tumor-mutational-burden" | ||
authors: | ||
- "Thibault Dayris" | ||
input: | ||
- vcf: Path to input variants (`vcf`, `vcf.gz`, or `bcf` formatted) | ||
- db_config: Path to database config file (`yaml` formatted) | ||
- var_config: Path to variant config file (`yaml` formatted) | ||
- bed: Path to intervals file to compute effective genome size (`bed` formatted) | ||
output: | ||
- res: Path to TMB results | ||
- vcf: Optional path to variants considered for TMB calculation | ||
params: | ||
- extra: Optional parameters provided to `pyTMB.py`, besides `-i`, `--dbConfig`, `--varConfig`, `--bed`, or `--export` | ||
note: | | ||
  This wrapper executes the whole command in a temporary directory. Using the `shadow` directive | ||
  in the Snakemake rule would be redundant. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
rule test_pytmb: | ||
    input: | ||
        vcf="sample.bcf", | ||
        db_config="dbconfig.yaml", | ||
        var_config="varconfig.yaml", | ||
        bed="regions.bed", | ||
    output: | ||
        res="tmb.txt", | ||
        vcf="tmp.vcf", | ||
    log: | ||
        "pytmb.log", | ||
    params: | ||
        extra="--verbose", | ||
    wrapper: | ||
        "master/bio/tmb/pytmb" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,115 @@ | ||
## Describe the fields | ||
## For definition, provide the expected key:values | ||
## Note that several keys/values can be defined | ||
|
||
############################################### | ||
## SnpEff Parsing | ||
|
||
## Tags | ||
tag: 'ANN' | ||
sep: '&' | ||
|
||
## Annotation flags | ||
|
||
isCoding: | ||
  1: | ||
    - chromosome_number_variation | ||
    - coding_sequence_variant | ||
    - conservative_inframe_deletion | ||
    - conservative_inframe_insertion | ||
    - disruptive_inframe_deletion | ||
    - disruptive_inframe_insertion | ||
    - exon_loss | ||
    - exon_loss_variant | ||
    - exon_variant | ||
    - frameshift_variant | ||
    - gene_variant | ||
    - initiator_codon_variant | ||
    - missense_variant | ||
    - rare_amino_acid_variant | ||
    - splice_acceptor_variant | ||
    - splice_donor_variant | ||
    - splice_region_variant | ||
    - start_lost | ||
    - start_retained | ||
    - stop_gained | ||
    - stop_lost | ||
    - stop_retained_variant | ||
    - synonymous_variant | ||
    - transcript_ablation | ||
    - transcript_amplification | ||
    - transcript_variant | ||
|
||
isNonCoding: | ||
  1: | ||
    - 3_prime_UTR_truncation | ||
    - 3_prime_UTR_variant | ||
    - 5_prime_UTR_premature_start_codon_gain_variant | ||
    - 5_prime_UTR_truncation | ||
    - 5_prime_UTR_variant | ||
    - conserved_intergenic_variant | ||
    - conserved_intron_variant | ||
    - downstream_gene_variant | ||
    - feature_elongation | ||
    - feature_truncation | ||
    - intergenic_region | ||
    - intragenic_variant | ||
    - intron_variant | ||
    - mature_miRNA_variant | ||
    - miRNA | ||
    - NMD_transcript_variant | ||
    - non_coding_transcript_exon_variant | ||
    - non_coding_transcript_variant | ||
    - regulatory_region_ablation | ||
    - regulatory_region_amplification | ||
    - regulatory_region_variant | ||
    - TF_binding_site_variant | ||
    - TFBS_ablation | ||
    - TFBS_amplification | ||
    - upstream_gene_variant | ||
|
||
isSplicing: | ||
  1: | ||
    - splice_donor_variant | ||
    - splice_acceptor_variant | ||
    - splice_region_variant | ||
|
||
isSynonymous: | ||
  1: | ||
    - start_retained_variant | ||
    - stop_retained_variant | ||
    - synonymous_variant | ||
|
||
isNonSynonymous: | ||
  1: | ||
    - frameshift_variant | ||
    - missense_variant | ||
    - rare_amino_acid_variant | ||
    - splice_acceptor_variant | ||
    - splice_donor_variant | ||
    - splice_region_variant | ||
    - start_lost | ||
    - stop_gained | ||
    - stop_lost | ||
|
||
## Databases | ||
cancerDb: | ||
  cosmic: | ||
    - cosmic_coding_ID | ||
    - cosmic_noncoding_ID | ||
|
||
polymDb: | ||
  1k: | ||
    - kg_AMR_AF | ||
    - kg_AFR_AF | ||
    - kg_EAS_AF | ||
    - kg_EUR_AF | ||
    - kg_SAS_AF | ||
    - KG_AF_GLOBAL | ||
|
||
  gnomad: | ||
    - gnomAD_genomes_AF | ||
    - AF | ||
|
||
  esp: | ||
    - ESP_AF_GLOBAL |
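
The annotation flags above list which SnpEff effect terms count toward each category, with `sep: '&'` separating multiple terms in one `ANN` effect field. A minimal sketch of how such a config could be applied is shown here; it is not pyTMB's actual code, and `DB_CONFIG` (a trimmed copy of the file above) and `has_flag` are hypothetical names introduced for illustration.

```python
# Hypothetical, trimmed-down copy of the flags defined in the config above.
DB_CONFIG = {
    "tag": "ANN",
    "sep": "&",
    "isSplicing": {
        1: ["splice_donor_variant", "splice_acceptor_variant", "splice_region_variant"]
    },
    "isSynonymous": {
        1: ["start_retained_variant", "stop_retained_variant", "synonymous_variant"]
    },
}


def has_flag(effect_field, flag, config=DB_CONFIG):
    """Return True if any term of a '&'-joined effect field is listed under `flag`."""
    effects = set(effect_field.split(config["sep"]))
    # Collect every accepted term, whatever key it is listed under.
    accepted = {term for terms in config[flag].values() for term in terms}
    return not effects.isdisjoint(accepted)


print(has_flag("missense_variant&splice_region_variant", "isSplicing"))    # True
print(has_flag("missense_variant&splice_region_variant", "isSynonymous"))  # False
```

Because a term such as `splice_region_variant` appears under several flags, a single variant can satisfy more than one category at once.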
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
18 0 80373285 |
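
The single BED interval above covers chromosome 18 from position 0 to 80373285, and the wrapper's `bed` input is described as the intervals used to compute the effective genome size. As a rough sketch, assuming the size is simply the summed interval length and that TMB is reported as variants per megabase, the arithmetic could look like the following. The helper names and the variant count of 400 are made up for illustration, and pyTMB may apply additional filters.

```python
def effective_genome_size(bed_lines):
    """Sum (end - start) over BED intervals; overlapping intervals are NOT merged."""
    total = 0
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        _chrom, start, end = line.split()[:3]
        total += int(end) - int(start)
    return total


def tmb_score(n_variants, genome_size_bp):
    """Variants per megabase of effective genome."""
    return n_variants / (genome_size_bp / 1e6)


size = effective_genome_size(["18\t0\t80373285"])
print(size)                            # 80373285
print(round(tmb_score(400, size), 2))  # 4.98
```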
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
## Describe the fields | ||
## For definition, provide the expected key:values | ||
## Note that several keys/values can be defined | ||
## | ||
############################################### | ||
|
||
freq: 'AF' | ||
depth: 'DP' | ||
altDepth: 'AD' | ||
maxVaf: '1' |
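
The variant config above maps logical names (`freq`, `depth`, `altDepth`) to VCF field names. The sketch below shows one way a consumer might resolve those names; it is an illustration only, not pyTMB's implementation. The `record` dictionaries and the `vaf_from_record` helper are hypothetical, and it assumes `AD` holds a single alternate-allele depth.

```python
# Copy of the variant config shown above.
VAR_CONFIG = {"freq": "AF", "depth": "DP", "altDepth": "AD", "maxVaf": "1"}


def vaf_from_record(record, config=VAR_CONFIG):
    """Return the allele frequency, falling back to alt depth / total depth."""
    freq = record.get(config["freq"])
    if freq is not None:
        return float(freq)
    depth = record.get(config["depth"])
    alt_depth = record.get(config["altDepth"])
    if not depth or alt_depth is None:
        return None  # not enough information to compute a frequency
    return float(alt_depth) / float(depth)


def passes_max_vaf(record, config=VAR_CONFIG):
    """Keep a record only if its frequency is known and at most `maxVaf`."""
    vaf = vaf_from_record(record, config)
    return vaf is not None and vaf <= float(config["maxVaf"])


print(vaf_from_record({"AF": 0.25}))           # 0.25
print(vaf_from_record({"DP": 100, "AD": 20}))  # 0.2
print(passes_max_vaf({"DP": 100, "AD": 20}))   # True
```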
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
# coding: utf-8 | ||
|
||
"""Snakemake wrapper for pyTMB.py""" | ||
|
||
__author__ = "Thibault Dayris" | ||
__mail__ = "thibault.dayris@gustaveroussy.fr" | ||
__copyright__ = "Copyright 2024, Thibault Dayris" | ||
__license__ = "MIT" | ||
|
||
from os.path import basename | ||
from re import sub | ||
from snakemake import shell | ||
from tempfile import TemporaryDirectory | ||
|
||
|
||
extra = snakemake.params.get("extra", "") | ||
ln_extra = "--symbolic --force --relative --verbose" | ||
|
||
out_vcf = snakemake.output.get("vcf", "") | ||
if out_vcf: | ||
    extra += " --export" | ||
|
||
# pyTMB creates an exported VCF file whose name/prefix | ||
# is predictable, but not editable. It is based on the input | ||
# VCF file name. | ||
# It was chosen to handle this issue in the wrapper itself, | ||
# rather than expecting the user to define a `shadow` directive | ||
# in the Snakemake rule. | ||
with TemporaryDirectory() as tempdir: | ||
    # Linking all input files in the created temporary directory | ||
    vcf_link_path = f"{tempdir}/{basename(snakemake.input.vcf)}" | ||
    log = snakemake.log_fmt_shell(stdout=True, stderr=True, append=True) | ||
    shell("ln {ln_extra} {snakemake.input.vcf} {vcf_link_path} {log}") | ||
|
||
    db_config = snakemake.input.get("db_config", "") | ||
    if db_config: | ||
        db_link_path = f"{tempdir}/{basename(db_config)}" | ||
        shell("ln {ln_extra} {db_config} {db_link_path} {log}") | ||
        db_config = f"--dbConfig {db_link_path}" | ||
|
||
    var_config = snakemake.input.get("var_config", "") | ||
    if var_config: | ||
        var_link_path = f"{tempdir}/{basename(var_config)}" | ||
        shell("ln {ln_extra} {var_config} {var_link_path} {log}") | ||
        var_config = f"--varConfig {var_link_path}" | ||
|
||
    bed = snakemake.input.get("bed", "") | ||
    if bed: | ||
        bed_link_path = f"{tempdir}/{basename(bed)}" | ||
        shell("ln {ln_extra} {bed} {bed_link_path} {log}") | ||
        bed = f"--bed {bed_link_path}" | ||
|
||
    res_link_name = f"{tempdir}/{basename(snakemake.output.res)}" | ||
|
||
    # Running pyTMB on symlinked files, after moving | ||
    # into the temporary directory in order to let | ||
    # the exported VCF file be there. | ||
    # The exported VCF file is created in the working directory. | ||
    log = snakemake.log_fmt_shell(stdout=False, stderr=True, append=True) | ||
    shell( | ||
        "cd {tempdir} && " | ||
        "pyTMB.py {extra} " | ||
        "{db_config} {var_config} {bed} " | ||
        "--vcf {vcf_link_path} " | ||
        "> {res_link_name} " | ||
        "{log} && " | ||
        "cd - " | ||
    ) | ||
|
||
    # Moving the main result file | ||
    log = snakemake.log_fmt_shell(stdout=True, stderr=True, append=True) | ||
    shell("mv --verbose {res_link_name} {snakemake.output.res} {log}") | ||
|
||
    # Moving the optional exported VCF file | ||
    if out_vcf: | ||
        # Raw string with escaped dots; anchored so only the trailing extension is stripped | ||
        prefix = sub(r"\.(v|b)cf(\.gz)?$", "", vcf_link_path) | ||
        shell("mv --verbose {prefix}_export.vcf {out_vcf} {log}") |
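
The last step of the wrapper derives the exported file's location by stripping the VCF/BCF extension from the linked input path and appending `_export.vcf`. The standalone sketch below isolates that step so it can be checked on its own. `expected_export_path` is a hypothetical helper name, and the naming scheme is taken from the wrapper's `mv` command rather than from pyTMB's documentation.

```python
import re


def expected_export_path(vcf_path):
    """Strip a trailing .vcf, .vcf.gz, .bcf (or .bcf.gz) and append '_export.vcf'."""
    # Raw string: '\.' matches a literal dot; '$' restricts the match to the suffix.
    prefix = re.sub(r"\.(v|b)cf(\.gz)?$", "", vcf_path)
    return f"{prefix}_export.vcf"


print(expected_export_path("/tmp/work/sample.bcf"))     # /tmp/work/sample_export.vcf
print(expected_export_path("/tmp/work/sample.vcf.gz"))  # /tmp/work/sample_export.vcf
```

Anchoring the pattern at the end of the string keeps a directory name that happens to contain `.vcf` from being altered.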