Commit
* Added meryl count
* Added meryl union
* Added test
* Added meryl stats
* Made union wrapper more general
* Fixed docs and tests
* Cleanup
* Update bio/meryl/count/environment.yaml

Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>
1 parent c27be68, commit f5ddac1
Showing 272 changed files with 260 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
channels: | ||
- conda-forge | ||
- bioconda | ||
- nodefaults | ||
dependencies: | ||
- meryl =1.3 | ||
- snakemake-wrapper-utils =0.4 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
name: meryl count | ||
description: | | ||
A genomic k-mer counter (and sequence utility) with nice features. | ||
url: https://github.com/marbl/meryl | ||
authors: | ||
- Filipe G. Vieira | ||
input: | ||
- fasta file | ||
output: | ||
- meryl database | ||
notes: | | ||
* The `command` param specifies how to count the kmers: `count` (canonical kmers) [default], `count-forward` (only forward kmers), or `count-reverse` (only reverse kmers). | ||
* The `extra` param passes additional program arguments (the kmer size `k` is mandatory). |
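The three counting modes in the notes above can be illustrated with a small pure-Python sketch. It is not how meryl is implemented; it assumes that the canonical form of a kmer is the lexicographically smaller of the kmer and its reverse complement, and the function names `reverse_complement` and `count_kmers` are hypothetical.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Return the reverse complement of a DNA string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def count_kmers(seq: str, k: int, command: str = "count") -> Counter:
    """Count kmers of size k in seq, mimicking the three `command` modes.

    Assumption: "canonical" means the lexicographically smaller of the
    forward kmer and its reverse complement.
    """
    counts = Counter()
    for i in range(len(seq) - k + 1):
        forward = seq[i : i + k]
        reverse = reverse_complement(forward)
        if command == "count":
            key = min(forward, reverse)
        elif command == "count-forward":
            key = forward
        elif command == "count-reverse":
            key = reverse
        else:
            raise ValueError(f"invalid command specified: {command!r}")
        counts[key] += 1
    return counts


canonical = count_kmers("ACGTT", k=2, command="count")
forward_only = count_kmers("ACGTT", k=2, command="count-forward")
reverse_only = count_kmers("ACGTT", k=2, command="count-reverse")
```

With canonical counting, `AC` and `GT` collapse into one entry because each is the reverse complement of the other, which is why the canonical table is smaller than the forward-only one.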
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
rule meryl_count: | ||
input: | ||
fasta="{genome}.fasta", | ||
output: | ||
directory("{genome}/"), | ||
log: | ||
"logs/meryl_count/{genome}.log", | ||
params: | ||
command="count", | ||
extra="k=32", | ||
threads: 2 | ||
resources: | ||
mem_mb=2048, | ||
wrapper: | ||
"master/bio/meryl/count" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
>Sheila | ||
GCTAGCTCAGAAAAAAAAAAGATGCGAGGCGTAGGCGATGCGATCGATCGATCTATAGGCTCGAGGCTAGGGCTAGCTGA |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
__author__ = "Filipe G. Vieira" | ||
__copyright__ = "Copyright 2022, Filipe G. Vieira" | ||
__license__ = "MIT" | ||
| ||
| ||
from snakemake.shell import shell | ||
from snakemake_wrapper_utils.snakemake import get_mem | ||
| ||
| ||
extra = snakemake.params.get("extra", "") | ||
log = snakemake.log_fmt_shell(stdout=True, stderr=True) | ||
| ||
| ||
command = snakemake.params.get("command", "count") | ||
assert command in [ | ||
"count", | ||
"count-forward", | ||
"count-reverse", | ||
], "invalid command specified." | ||
| ||
| ||
mem_gb = get_mem(snakemake, out_unit="GiB") | ||
| ||
| ||
shell( | ||
"meryl" | ||
" {command}" | ||
" threads={snakemake.threads}" | ||
" memory={mem_gb}" | ||
" {extra}" | ||
" {snakemake.input}" | ||
" output {snakemake.output}" | ||
" {log}" | ||
) |
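To make the assembled command line concrete, here is a minimal standalone sketch of the same string construction outside Snakemake. The helper names `mib_to_gib` and `build_count_command` are hypothetical, and the MiB-to-GiB conversion (division by 1024) is an assumption about what `get_mem(..., out_unit="GiB")` returns, not a statement about its implementation.

```python
import shlex

COUNT_COMMANDS = ("count", "count-forward", "count-reverse")


def mib_to_gib(mem_mb: float) -> float:
    """Assumed conversion: treat mem_mb as MiB and divide by 1024."""
    return mem_mb / 1024


def build_count_command(command, threads, mem_gib, extra, inputs, output):
    """Assemble a meryl count command line in the same order as the wrapper."""
    if command not in COUNT_COMMANDS:
        raise ValueError(f"invalid command specified: {command!r}")
    parts = [
        "meryl",
        command,
        f"threads={threads}",
        f"memory={mem_gib}",
        *shlex.split(extra),  # e.g. "k=32"
        *inputs,
        "output",
        output,
    ]
    # Quote each part so paths with spaces survive the shell.
    return " ".join(shlex.quote(part) for part in parts)


cmd = build_count_command(
    command="count",
    threads=2,
    mem_gib=mib_to_gib(2048),
    extra="k=32",
    inputs=["genome.fasta"],
    output="genome/",
)
```

The values mirror the example rule (`threads: 2`, `mem_mb=2048`, `extra="k=32"`). Raising `ValueError` instead of using `assert` is a design choice of this sketch: assertions are skipped when Python runs with `-O`.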
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
channels: | ||
- bioconda | ||
- conda-forge | ||
- defaults | ||
dependencies: | ||
- meryl =1.3 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
name: meryl sets | ||
description: | | ||
A genomic k-mer counter (and sequence utility) with nice features. | ||
url: https://github.com/marbl/meryl | ||
authors: | ||
- Filipe G. Vieira | ||
input: | ||
- meryl database(s) | ||
output: | ||
- meryl database | ||
notes: | | ||
* The `command` param specifies how to combine the kmer sets: `union` (kmers in any input; count is the number of inputs containing the kmer) [default], `union-min` (union with minimum count), `union-max` (union with maximum count), `union-sum` (union with sum of the counts), `intersect` (kmers in all inputs; count taken from the first input), `intersect-min` (intersect with minimum count), `intersect-max` (intersect with maximum count), `intersect-sum` (intersect with sum of counts), `subtract` (kmers in the first input, subtracting counts from the other inputs), `difference` (kmers in the first input but in none of the other inputs), or `symmetric-difference` (kmers in exactly one input). |
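The set operations listed above can be sketched on plain dictionaries that map a kmer to its count. This is an illustration of the semantics as described in the notes, not meryl's implementation; the function names are hypothetical, and dropping kmers whose count falls to zero or below in `subtract` is an assumption.

```python
from collections import Counter


def union(*dbs):
    """Kmers in any input; count = number of inputs containing the kmer."""
    out = Counter()
    for db in dbs:
        for kmer in db:
            out[kmer] += 1
    return dict(out)


def union_sum(*dbs):
    """Kmers in any input; count = sum of counts over all inputs."""
    out = Counter()
    for db in dbs:
        out.update(db)  # a mapping argument adds its counts
    return dict(out)


def intersect(first, *rest):
    """Kmers in all inputs; count taken from the first input."""
    return {k: c for k, c in first.items() if all(k in db for db in rest)}


def subtract(first, *rest):
    """Counts from the first input minus counts from the others.

    Assumption: kmers whose count drops to zero or below are removed.
    """
    out = {k: c - sum(db.get(k, 0) for db in rest) for k, c in first.items()}
    return {k: c for k, c in out.items() if c > 0}


def difference(first, *rest):
    """Kmers in the first input that appear in none of the others."""
    return {k: c for k, c in first.items() if not any(k in db for db in rest)}


def symmetric_difference(*dbs):
    """Kmers that appear in exactly one input."""
    presence = union(*dbs)
    return {k: c for db in dbs for k, c in db.items() if presence[k] == 1}


a = {"AAA": 3, "AAC": 1}
b = {"AAA": 1, "ACG": 2}
```

For these two toy databases, `subtract(a, b)` keeps `AAA` with a reduced count, while `difference(a, b)` drops it entirely because it is present in `b`.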
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
rule meryl_union: | ||
input: | ||
"{genome}", | ||
"{genome}", | ||
output: | ||
directory("{genome}_union/"), | ||
log: | ||
"logs/{genome}.union.log", | ||
params: | ||
command="union-sum", | ||
wrapper: | ||
"master/bio/meryl/sets" | ||
| ||
| ||
rule meryl_intersect: | ||
input: | ||
"{genome}", | ||
"{genome}", | ||
output: | ||
directory("{genome}_intersect/"), | ||
log: | ||
"logs/{genome}.intersect.log", | ||
params: | ||
command="intersect-max", | ||
wrapper: | ||
"master/bio/meryl/sets" | ||
| ||
| ||
rule meryl_subtract: | ||
input: | ||
"{genome}", | ||
"{genome}", | ||
output: | ||
directory("{genome}_subtract/"), | ||
log: | ||
"logs/{genome}.subtract.log", | ||
params: | ||
command="subtract", | ||
wrapper: | ||
"master/bio/meryl/sets" | ||
| ||
| ||
rule meryl_difference: | ||
input: | ||
"{genome}", | ||
"{genome}", | ||
output: | ||
directory("{genome}_difference/"), | ||
log: | ||
"logs/{genome}.difference.log", | ||
params: | ||
command="difference", | ||
wrapper: | ||
"master/bio/meryl/sets" |
129 binary files not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
__author__ = "Filipe G. Vieira" | ||
__copyright__ = "Copyright 2022, Filipe G. Vieira" | ||
__license__ = "MIT" | ||
| ||
| ||
from snakemake.shell import shell | ||
| ||
| ||
log = snakemake.log_fmt_shell(stdout=True, stderr=True) | ||
| ||
| ||
command = snakemake.params.get("command", "union") | ||
assert command in [ | ||
"union", | ||
"union-min", | ||
"union-max", | ||
"union-sum", | ||
"intersect", | ||
"intersect-min", | ||
"intersect-max", | ||
"intersect-sum", | ||
"subtract", | ||
"difference", | ||
"symmetric-difference", | ||
], "invalid command specified." | ||
| ||
| ||
shell("meryl {command} {snakemake.input} output {snakemake.output} {log}") |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
channels: | ||
- bioconda | ||
- conda-forge | ||
- defaults | ||
dependencies: | ||
- meryl =1.3 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
name: meryl stats | ||
description: | | ||
A genomic k-mer counter (and sequence utility) with nice features. | ||
url: https://github.com/marbl/meryl | ||
authors: | ||
- Filipe G. Vieira | ||
input: | ||
- meryl database(s) | ||
output: | ||
- meryl stats (either the kmers, statistics, or histogram) | ||
notes: | | ||
* The `command` param specifies which output to print: `statistics` (display total, unique, and distinct kmers) [default], `histogram` (display kmer frequency), or `print` (display kmers). |
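The quantities named in the notes above can be sketched on a dictionary of kmer counts. The definitions used here are assumptions based on common usage (unique: kmers seen exactly once; distinct: number of different kmers; total: sum of all counts), and the function names are hypothetical; meryl's actual report contains more detail.

```python
from collections import Counter


def kmer_statistics(db):
    """Summarise a {kmer: count} mapping.

    Assumed definitions: unique = kmers with count 1,
    distinct = number of different kmers, total = sum of counts.
    """
    return {
        "unique": sum(1 for count in db.values() if count == 1),
        "distinct": len(db),
        "total": sum(db.values()),
    }


def kmer_histogram(db):
    """Return sorted (count, number of kmers with that count) pairs."""
    return sorted(Counter(db.values()).items())


db = {"AAA": 3, "AAC": 1, "ACG": 1}
stats = kmer_statistics(db)
histogram = kmer_histogram(db)
```

In this toy database two kmers occur once and one occurs three times, so the histogram has two rows and the total is five.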
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
rule meryl_stats: | ||
input: | ||
"{genome}", | ||
output: | ||
"{genome}.stats", | ||
log: | ||
"logs/meryl_stats/{genome}.log", | ||
params: | ||
command="statistics", | ||
wrapper: | ||
"master/bio/meryl/stats" |
129 binary files not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
__author__ = "Filipe G. Vieira" | ||
__copyright__ = "Copyright 2022, Filipe G. Vieira" | ||
__license__ = "MIT" | ||
| ||
| ||
from snakemake.shell import shell | ||
| ||
| ||
log = snakemake.log_fmt_shell(stdout=False, stderr=True) | ||
| ||
| ||
command = snakemake.params.get("command", "statistics") | ||
assert command in [ | ||
"statistics", | ||
"histogram", | ||
"print", | ||
], "invalid command specified." | ||
| ||
| ||
shell("meryl {command} {snakemake.input} > {snakemake.output} {log}") |