feat: add wrapper for coolpuppy (#554)

### Description

Created a wrapper for coolpup.py (https://github.com/open2c/coolpuppy)

### QC

* [x] I confirm that:

For all wrappers added by this PR, 

* there is a test case which covers any introduced changes,
* `input:` and `output:` file paths in the resulting rule can be changed
arbitrarily,
* either the wrapper can only use a single core, or the example rule
contains a `threads: x` statement with `x` being a reasonable default,
* rule names in the test case are in
[snake_case](https://en.wikipedia.org/wiki/Snake_case) and somehow tell
what the rule is about or match the tool's purpose or name (e.g.,
`map_reads` for a step that maps reads),
* all `environment.yaml` specifications follow [the respective best
practices](https://stackoverflow.com/a/64594513/2352071),
* wherever possible, command line arguments are inferred and set
automatically (e.g. based on file extensions in `input:` or `output:`),
* all fields of the example rules in the `Snakefile`s and their entries
are explained via comments (`input:`/`output:`/`params:` etc.),
* `stderr` and/or `stdout` are logged correctly (`log:`), depending on
the wrapped tool,
* temporary files are either written to a unique hidden folder in the
working directory, or (better) stored where the Python function
`tempfile.gettempdir()` points to (see
[here](https://docs.python.org/3/library/tempfile.html#tempfile.gettempdir);
this also means that using any Python `tempfile` default behavior
works),
* the `meta.yaml` contains a link to the documentation of the respective
tool or command,
* `Snakefile`s pass the linting (`snakemake --lint`),
* `Snakefile`s are formatted with
[snakefmt](https://github.com/snakemake/snakefmt),
* Python wrapper scripts are formatted with
[black](https://black.readthedocs.io).
* Conda environments use a minimal number of channels, in the recommended
order. E.g. for bioconda, use conda-forge, bioconda, nodefaults
(conda-forge should have the highest priority, and the defaults channel is
usually not needed because most packages are in conda-forge nowadays).
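
A minimal sketch of the temporary-file item in the checklist above, assuming a wrapper needs a scratch location. The file name `intermediate.txt` is hypothetical; only standard-library `tempfile` calls are used, and this is not code from the coolpuppy wrapper itself.

```python
import os
import tempfile

# Default temporary directory, as reported by the standard library.
default_tmp = tempfile.gettempdir()

# Without a `dir=` argument the scratch directory is created under
# `default_tmp` and is removed again when the `with` block exits.
with tempfile.TemporaryDirectory() as tmpdir:
    scratch = os.path.join(tmpdir, "intermediate.txt")  # hypothetical file name
    with open(scratch, "w") as handle:
        handle.write("scratch data\n")
    created_under_default = os.path.realpath(tmpdir).startswith(
        os.path.realpath(default_tmp)
    )
    existed_inside_block = os.path.exists(scratch)

removed_after_block = not os.path.exists(tmpdir)
```

Relying on the default behavior this way means the location can still be redirected through the usual environment variables without touching the wrapper.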

Co-authored-by: Filipe G. Vieira <fgarrettvieira@gmail.com>
Phlya and fgvieira committed Nov 2, 2022
1 parent 43e5a16 commit 60f1fc1
Showing 9 changed files with 189 additions and 0 deletions.
6 changes: 6 additions & 0 deletions bio/coolpuppy/environment.yaml
@@ -0,0 +1,6 @@
channels:
  - conda-forge
  - bioconda
  - nodefaults
dependencies:
  - coolpuppy
20 changes: 20 additions & 0 deletions bio/coolpuppy/meta.yaml
@@ -0,0 +1,20 @@
name: coolpup.py
description: Pileup features for a resolution in an .mcool file
url: https://github.com/open2c/coolpuppy
authors:
  - Ilya Flyamer
input:
  - a multiresolution cooler file (.mcool)
  - a file with features to pileup
  - (optional) file with expected
  - (optional) view, a bed-style file with region coordinates and names to use for analysis
output:
  - >
    A file (.clpy, HDF5-based format) with the pileup.
    Can have a {resolution} wildcard that specifies the resolution for the analysis,
    then it doesn't need to be specified as a parameter.
params:
  resolution: >
    Optional, can be instead specified as a wildcard in the output
  extra: Any additional arguments to pass
notes:
Binary file added bio/coolpuppy/test/CN.mm9.1000kb.mcool
Binary file not shown.
101 changes: 101 additions & 0 deletions bio/coolpuppy/test/CN.mm9.toy_expected.tsv
@@ -0,0 +1,101 @@
region1 region2 dist n_valid count.sum balanced.sum balanced.avg
foo foo 0 50
foo foo 1 49
foo foo 2 48 448255.0 2.3205076553434987 0.04834390948632289
foo foo 3 47 271497.0 1.38339695992966 0.02943397787084383
foo foo 4 46 179491.0 0.900655795691491 0.01957947381938024
foo foo 5 45 135426.0 0.6826130105698165 0.015169178012662588
foo foo 6 44 96841.0 0.48167647260294866 0.010947192559157925
foo foo 7 43 74458.0 0.36747755422094075 0.008545989633045134
foo foo 8 42 56431.0 0.2767897183400133 0.0065902313890479364
foo foo 9 41 46579.0 0.23020444753273792 0.005614742622749705
foo foo 10 40 42800.0 0.21407619204942857 0.005351904801235714
foo foo 11 39 38893.0 0.1931769021342914 0.0049532539008792665
foo foo 12 38 35915.0 0.1760485882134026 0.004632857584563227
foo foo 13 37 31507.0 0.15432815796541483 0.0041710312963625625
foo foo 14 36 28275.0 0.13916825128679033 0.003865784757966398
foo foo 15 35 26582.0 0.13214461875460215 0.0037755605358457756
foo foo 16 34 24080.0 0.1200420079045525 0.0035306472913103674
foo foo 17 33 22554.0 0.1123809167677425 0.0034054823262952274
foo foo 18 32 21069.0 0.10519693902501005 0.003287404344531564
foo foo 19 31 19565.0 0.09730388315158268 0.003138834940373635
foo foo 20 30 18830.0 0.09344118037915836 0.003114706012638612
foo foo 21 29 18180.0 0.09181365603513099 0.003165988139142448
foo foo 22 28 16817.0 0.0857312761411997 0.003061831290757132
foo foo 23 27 15637.0 0.08088906104487427 0.0029958911498101583
foo foo 24 26 13554.0 0.0696931607808895 0.0026805061838803654
foo foo 25 25 12151.0 0.062133968853916574 0.002485358754156663
foo foo 26 24 10641.0 0.053908741063492124 0.002246197544312172
foo foo 27 23 9371.0 0.04780835937733471 0.002078624320753683
foo foo 28 22 8684.0 0.04565538936132342 0.0020752449709692464
foo foo 29 21 7883.0 0.04194264489363847 0.0019972688044589747
foo foo 30 20 7602.0 0.04117335917285604 0.002058667958642802
foo foo 31 19 6783.0 0.03642786791651601 0.0019172562061324217
foo foo 32 18 6220.0 0.033609930607101324 0.0018672183670611847
foo foo 33 17 5752.0 0.03126540105125592 0.0018391412383091717
foo foo 34 16 5236.0 0.02870993254323146 0.0017943707839519663
foo foo 35 15 4806.0 0.026732726358511393 0.0017821817572340928
foo foo 36 14 4562.0 0.025516336044875902 0.0018225954317768502
foo foo 37 13 4484.0 0.025173064987642168 0.001936389614434013
foo foo 38 12 4322.0 0.024324300745100825 0.0020270250620917354
foo foo 39 11 3797.0 0.02095540632794532 0.00190503693890412
foo foo 40 10 3403.0 0.018630663941423948 0.0018630663941423947
foo foo 41 9 3044.0 0.016810995031025552 0.001867888336780617
foo foo 42 8 2716.0 0.015316241229781234 0.0019145301537226542
foo foo 43 7 2461.0 0.014124488058201323 0.002017784008314475
foo foo 44 6 2060.0 0.011782977088540664 0.0019638295147567774
foo foo 45 5 1629.0 0.009356770724295723 0.0018713541448591446
foo foo 46 4 1325.0 0.007777107004193509 0.0019442767510483773
foo foo 47 3 950.0 0.005574745304582236 0.0018582484348607453
foo foo 48 2 629.0 0.003669007156579109 0.0018345035782895544
foo foo 49 1 326.0 0.0020415942196967394 0.0020415942196967394
bar bar 0 49
bar bar 1 48
bar bar 2 47 450107.0 2.1180050802546933 0.04506393787775943
bar bar 3 46 238644.0 1.1182026520831783 0.02430875330615605
bar bar 4 45 151877.0 0.7065426657897472 0.01570094812866105
bar bar 5 44 105862.0 0.4889639900117408 0.01111281795481229
bar bar 6 43 84565.0 0.3886687958491317 0.00903880920579376
bar bar 7 42 67656.0 0.305587801420597 0.007275900033823738
bar bar 8 41 56605.0 0.2536802573536893 0.006187323350089984
bar bar 9 40 49125.0 0.21940452543596367 0.005485113135899092
bar bar 10 39 43256.0 0.19302073776471373 0.004949249686274711
bar bar 11 38 38908.0 0.17213966992023477 0.0045299913136903885
bar bar 12 37 33613.0 0.1494114335367291 0.00403814685234403
bar bar 13 36 29008.0 0.1286862020151156 0.0035746167226421
bar bar 14 35 28208.0 0.1257340707416353 0.0035924020211895802
bar bar 15 34 26130.0 0.11682046178278417 0.0034358959347877698
bar bar 16 33 24355.0 0.10848220502658447 0.0032873395462601354
bar bar 17 32 21902.0 0.09720413992092795 0.0030376293725289986
bar bar 18 31 19754.0 0.08921457365055102 0.00287788947259842
bar bar 19 30 17506.0 0.0798108423392565 0.00266036141130855
bar bar 20 29 16951.0 0.07831020324831016 0.002700351836148626
bar bar 21 28 16124.0 0.07470713314986098 0.0026681118982093206
bar bar 22 27 16237.0 0.07516147832181286 0.002783758456363439
bar bar 23 26 15583.0 0.07144738725071081 0.0027479764327196466
bar bar 24 25 14864.0 0.06801519019393452 0.0027206076077573808
bar bar 25 24 14174.0 0.06516873511627985 0.002715363963178327
bar bar 26 23 14169.0 0.06554949528961256 0.002849978056070111
bar bar 27 22 13561.0 0.06221042530718225 0.0028277466048719207
bar bar 28 21 12073.0 0.055813578961226296 0.0026577894743441094
bar bar 29 20 11032.0 0.05118868034313225 0.0025594340171566127
bar bar 30 19 10723.0 0.050269590871060296 0.0026457679405821207
bar bar 31 18 10646.0 0.04998712073522266 0.0027770622630679254
bar bar 32 17 10320.0 0.04943531274185869 0.002907959573050511
bar bar 33 16 9664.0 0.04604888783607321 0.0028780554897545755
bar bar 34 15 9227.0 0.04425307710295975 0.0029502051401973164
bar bar 35 14 9111.0 0.04421548066666439 0.0031582486190474567
bar bar 36 13 9923.0 0.04945120961837048 0.003803939201413114
bar bar 37 12 9219.0 0.04674824569212995 0.0038956871410108294
bar bar 38 11 8027.0 0.04077733321358686 0.0037070302921442602
bar bar 39 10 6756.0 0.03230495094148628 0.0032304950941486276
bar bar 40 9 5996.0 0.027699878189309274 0.003077764243256586
bar bar 41 8 5280.0 0.023833680900535406 0.0029792101125669258
bar bar 42 7 4560.0 0.019837282406156377 0.002833897486593768
bar bar 43 6 3911.0 0.01627847374007839 0.0027130789566797314
bar bar 44 5 3155.0 0.012966661266117605 0.002593332253223521
bar bar 45 4 2335.0 0.008792759755829107 0.0021981899389572766
bar bar 46 3 1518.0 0.005519380548429014 0.0018397935161430046
bar bar 47 2 1142.0 0.003471630969823881 0.0017358154849119406
bar bar 48 1 756.0 0.0019403909506671992 0.0019403909506671992
bar bar 49 0 361.0 0.0
2 changes: 2 additions & 0 deletions bio/coolpuppy/test/CN.mm9.toy_features.bed
@@ -0,0 +1,2 @@
chr1 100100000 100150000
chr2 100200000 100250000
2 changes: 2 additions & 0 deletions bio/coolpuppy/test/CN.mm9.toy_regions.bed
@@ -0,0 +1,2 @@
chr1 100000000 150000000 foo
chr2 100000000 150000000 bar
17 changes: 17 additions & 0 deletions bio/coolpuppy/test/Snakefile
@@ -0,0 +1,17 @@
rule coolpuppy:
    input:
        cooler="CN.mm9.1000kb.mcool",  ## Multiresolution cooler file
        features="CN.mm9.toy_features.bed",  ## Feature file
        expected="CN.mm9.toy_expected.tsv",  ## Expected file
        view="CN.mm9.toy_regions.bed",  ## File with the region names and coordinates
    output:
        "CN_{resolution,[0-9]+}.clpy",
    params:
        ## Add optional parameters
        features_format="bed",  ## Format of the features file
        extra="--local",  ## Add extra parameters
    threads: 2
    log:
        "logs/CN_{resolution}_coolpuppy.log",
    wrapper:
        "master/bio/coolpuppy"
33 changes: 33 additions & 0 deletions bio/coolpuppy/wrapper.py
@@ -0,0 +1,33 @@
__author__ = "Ilya Flyamer"
__copyright__ = "Copyright 2022, Ilya Flyamer"
__email__ = "flyamer@gmail.com"
__license__ = "MIT"

from snakemake.shell import shell

## Extract arguments
view = snakemake.input.get("view", "")
if view:
    view = f"--view {view}"
expected = snakemake.input.get("expected", "")
if expected:
    expected = f"--expected {expected}"

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

resolution = snakemake.params.get("resolution", snakemake.wildcards.get("resolution"))
if not resolution:
    raise ValueError("Please specify resolution either as a wildcard or as a parameter")

shell(
    "(coolpup.py"
    " {snakemake.input.cooler}::resolutions/{resolution}"
    " {snakemake.input.features}"
    " {expected}"
    " --features-format {snakemake.params.features_format}"
    " {view}"
    " -p {snakemake.threads}"
    " {extra}"
    " -o {snakemake.output}) {log}"
)
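
The following is a rough, standalone sketch of how the wrapper above turns a requested target such as `CN_1000000.clpy` into a `coolpup.py` command line. It runs without Snakemake: `OUTPUT_PATTERN`, `wildcards_from_target` and `build_command` are hypothetical names introduced here for illustration, the regex is a hand-written stand-in for the output pattern `CN_{resolution,[0-9]+}.clpy` (Snakemake's own pattern handling is not reproduced), and plain dicts stand in for the `snakemake` object. Log redirection is left out.

```python
import re

# Hand-written stand-in for the rule's output pattern "CN_{resolution,[0-9]+}.clpy".
OUTPUT_PATTERN = re.compile(r"CN_(?P<resolution>[0-9]+)\.clpy")


def wildcards_from_target(target):
    """Return the wildcard values encoded in a target name, or {} if it does not match."""
    match = OUTPUT_PATTERN.fullmatch(target)
    return match.groupdict() if match else {}


def build_command(inputs, params, wildcards, threads, output):
    """Hypothetical helper mirroring the wrapper's argument handling."""
    # Optional inputs become flags only when they are provided.
    view = f"--view {inputs['view']}" if inputs.get("view") else ""
    expected = f"--expected {inputs['expected']}" if inputs.get("expected") else ""

    # An explicit parameter takes precedence; otherwise fall back to the wildcard.
    resolution = params.get("resolution", wildcards.get("resolution"))
    if not resolution:
        raise ValueError(
            "Please specify resolution either as a wildcard or as a parameter"
        )

    parts = [
        "coolpup.py",
        f"{inputs['cooler']}::resolutions/{resolution}",
        inputs["features"],
        expected,
        f"--features-format {params['features_format']}",
        view,
        f"-p {threads}",
        params.get("extra", ""),
        f"-o {output}",
    ]
    # Drop empty fragments so optional arguments leave no stray spaces.
    return " ".join(part for part in parts if part)


target = "CN_1000000.clpy"
command = build_command(
    inputs={
        "cooler": "CN.mm9.1000kb.mcool",
        "features": "CN.mm9.toy_features.bed",
        "expected": "CN.mm9.toy_expected.tsv",
        "view": "CN.mm9.toy_regions.bed",
    },
    params={"features_format": "bed", "extra": "--local"},
    wildcards=wildcards_from_target(target),
    threads=2,
    output=target,
)
print(command)
```

With the test inputs from this commit, the resolution comes from the file name alone, which is why the example rule does not set `resolution` under `params:`.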
8 changes: 8 additions & 0 deletions test.py
@@ -1817,6 +1817,14 @@ def test_clustalo():
    )


@skip_if_not_modified
def test_coolpuppy():
    run(
        "bio/coolpuppy",
        ["snakemake", "--cores", "1", "CN_1000000.clpy", "--use-conda", "-F"],
    )


@skip_if_not_modified
def test_cooltools_insulation():
    run(