feat: adding default_target directive for declaring default target rules that are not the first rule in the workflow. (#1358)

* feat: adding default_target directive for declaring default target rules that are not the first rule in the workflow.

* fmt

* fix check

* Update tests.py
johanneskoester committed Jan 29, 2022
1 parent c9be764 commit 638ec1a
Showing 12 changed files with 102 additions and 20 deletions.
34 changes: 28 additions & 6 deletions docs/snakefiles/deployment.rst
@@ -103,7 +103,7 @@ For example, we can easily add another rule to extend the given workflow:
github("snakemake-workflows/dna-seq-gatk-variant-calling", path="workflow/Snakefile", tag="v2.0.1")
config: config
- use rule * from dna_seq
+ use rule * from dna_seq as dna_seq_*
# easily extend the workflow
rule plot_vafs:
@@ -114,7 +114,19 @@ For example, we can easily add another rule to extend the given workflow:
notebook:
"notebooks/plot-vafs.py.ipynb"
- Moreover, it is possible to further extend the workflow with other modules, thereby generating an integrative analysis.
+ # Define a new default target that collects both the targets from the dna_seq module as well as
+ # the new plot.
+ rule all:
+     input:
+         rules.dna_seq_all.input,
+         "results/plots/vafs.svg",
+     default_target: True
+ Above, we have added a prefix to all rule names of the dna_seq module, such that there is no name clash with the added rules (``as dna_seq_*`` in the ``use rule`` statement).
+ In addition, we have added a new rule ``all``, defining the default target in case the workflow is executed (as usually) without any specific target files or rule.
+ The new target rule collects both all input files of the rule ``all`` from the dna_seq workflow, as well as additionally collecting the new plot.

+ It is possible to further extend the workflow with other modules, thereby generating an integrative analysis.
Here, let us assume that we want to conduct another kind of analysis, say RNA-seq, using a different external workflow.
We can extend above example in the following way:

@@ -149,10 +161,20 @@ We can extend above example in the following way:
use rule * from rna_seq as rna_seq_*
- Above, several things have changed. First, we have added another module ``rna_seq``.
- Second, we have added a prefix to all rule names of both modules (``dna_seq_*`` and ``rna_seq_*`` in the ``use rule`` statements) in order to avoid rule name clashes.
- Third, we have added a prefix to all non-absolute input and output file names of both modules (``prefix: "dna-seq"`` and ``prefix: "rna-seq"``) in order to avoid file name clashes.
- Finally, we provide the config of the two modules via two separate sections in the common config file (``config["dna-seq"]`` and ``config["rna-seq"]``).
+ # Define a new default target that collects all the targets from the dna_seq and rna_seq module.
+ rule all:
+     input:
+         rules.dna_seq_all.input,
+         rules.rna_seq_all.input,
+     default_target: True
+ Above, several things have changed.

+ * First, we have added another module ``rna_seq``.
+ * Second, we have added a prefix to all non-absolute input and output file names of both modules (``prefix: "dna-seq"`` and ``prefix: "rna-seq"``) in order to avoid file name clashes.
+ * Third, we have added a default target rule that collects both the default targets from the module ``dna_seq`` as well as the module ``rna_seq``.
+ * Finally, we provide the config of the two modules via two separate sections in the common config file (``config["dna-seq"]`` and ``config["rna-seq"]``).

----------------------------------
Uploading workflows to WorkflowHub
18 changes: 16 additions & 2 deletions docs/snakefiles/rules.rst
@@ -246,11 +246,25 @@ By default snakemake executes the first rule in the snakefile. This gives rise t
.. code-block:: python
rule all:
- input: ["{dataset}/file.A.txt".format(dataset=dataset) for dataset in DATASETS]
+ input:
+     expand("{dataset}/file.A.txt", dataset=DATASETS)
+ Here, for each dataset in a python list ``DATASETS`` defined before, the file ``{dataset}/file.A.txt`` is requested.
+ In this example, Snakemake recognizes automatically that these can be created by multiple applications of the rule ``complex_conversion`` shown above.

+ It is possible to overwrite this behavior to use the first rule as a default target, by explicitly marking a rule as being the default target via the ``default_target`` directive:

- Here, for each dataset in a python list ``DATASETS`` defined before, the file ``{dataset}/file.A.txt`` is requested. In this example, Snakemake recognizes automatically that these can be created by multiple applications of the rule ``complex_conversion`` shown above.
+ .. code-block:: python
+ rule xy:
+     input:
+         expand("{dataset}/file.A.txt", dataset=DATASETS)
+     default_target: True
+ Regardless of where this rule appears in the Snakefile, it will be the default target.
+ Usually, it is still recommended to keep the default target rule (and in fact all other rules that could act as optional targets) at the top of the file, such that it can be easily found.
+ The ``default_target`` directive becomes particularly useful when :ref:`combining several pre-existing workflows <use_with_modules>`.

.. _snakefiles-threads:

4 changes: 3 additions & 1 deletion snakemake/__init__.py
@@ -591,7 +591,9 @@ def snakemake(
success = True

workflow.include(
- snakefile, overwrite_first_rule=True, print_compilation=print_compilation
+ snakefile,
+ overwrite_default_target=True,
+ print_compilation=print_compilation,
)
workflow.check()

2 changes: 1 addition & 1 deletion snakemake/dag.py
@@ -622,7 +622,7 @@ def unneeded_files():
and not job.is_checkpoint
and (
job not in self.targetjobs
- or job.rule.name == self.workflow.first_rule
+ or job.rule.name == self.workflow.default_target
)
):
tempfiles = (
2 changes: 1 addition & 1 deletion snakemake/modules.py
@@ -86,7 +86,7 @@ def use_rules(self, rules=None, name_modifier=None, ruleinfo=None):
prefix=self.prefix,
replace_wrapper_tag=self.get_wrapper_tag(),
):
- self.workflow.include(snakefile, overwrite_first_rule=True)
+ self.workflow.include(snakefile, overwrite_default_target=True)

def get_snakefile(self):
if self.meta_wrapper:
7 changes: 7 additions & 0 deletions snakemake/parser.py
@@ -484,6 +484,12 @@ def keyword(self):
return "cache_rule"


+ class DefaultTarget(RuleKeywordState):
+     @property
+     def keyword(self):
+         return "default_target_rule"


class Handover(RuleKeywordState):
pass

@@ -673,6 +679,7 @@ def args(self):
group=Group,
cache=Cache,
handover=Handover,
+ default_target=DefaultTarget,
)
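Reviewer note on the parser change: the sketch below illustrates why `DefaultTarget` overrides `keyword`. It assumes, which this diff does not show, that the base keyword state derives the emitted name from the lowercased class name and that the parser emits a `@workflow.<keyword>(` decorator call. `KeywordStateSketch` and `emitted_decorator` are hypothetical names, not Snakemake's parser API.

```python
class KeywordStateSketch:
    """Hypothetical stand-in for the parser's keyword state base class."""

    @property
    def keyword(self):
        # Assumed default: derive the emitted name from the class name.
        return self.__class__.__name__.lower()


class Handover(KeywordStateSketch):
    # No override: the assumed default yields "handover".
    pass


class DefaultTarget(KeywordStateSketch):
    @property
    def keyword(self):
        # Mirrors the override added in this commit.
        return "default_target_rule"


def emitted_decorator(state):
    # Hypothetical helper: the decorator call the parser would generate.
    return "@workflow.{}(".format(state.keyword)


print(emitted_decorator(Handover()))
print(emitted_decorator(DefaultTarget()))
```

Under that assumption the default name would be `defaulttarget`, which matches no workflow method. The explicit `default_target_rule` also keeps the decorator method distinct from the `default_target` attribute that workflow.py introduces in this commit.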


1 change: 1 addition & 0 deletions snakemake/ruleinfo.py
@@ -37,6 +37,7 @@ def __init__(self, func=None):
self.cache = False
self.path_modifier = None
self.handover = False
+ self.default_target = False

def apply_modifier(
self, modifier, prefix_replacables={"input", "output", "log", "benchmark"}
36 changes: 27 additions & 9 deletions snakemake/workflow.py
@@ -155,7 +155,7 @@ def __init__(
self.global_resources["_nodes"] = nodes

self._rules = OrderedDict()
- self.first_rule = None
+ self.default_target = None
self._workdir = None
self.overwrite_workdir = overwrite_workdir
self.workdir_init = os.path.abspath(os.curdir)
@@ -466,8 +466,8 @@ def add_rule(
self._rules[rule.name] = rule
if not is_overwrite:
self.rule_count += 1
- if not self.first_rule:
-     self.first_rule = rule.name
+ if not self.default_target:
+     self.default_target = rule.name
return name

def is_rule(self, name):
@@ -644,7 +644,9 @@ def files(items):
return map(relpath, filterfalse(self.is_rule, items))

if not targets:
- targets = [self.first_rule] if self.first_rule is not None else list()
+ targets = (
+     [self.default_target] if self.default_target is not None else list()
+ )

if prioritytargets is None:
prioritytargets = list()
@@ -1148,7 +1150,7 @@ def containerize(self):
def include(
self,
snakefile,
- overwrite_first_rule=False,
+ overwrite_default_target=False,
print_compilation=False,
overwrite_shellcmd=None,
):
@@ -1164,7 +1166,7 @@ def include(
self.included.append(snakefile)
self.included_stack.append(snakefile)

- first_rule = self.first_rule
+ default_target = self.default_target
code, linemap, rulecount = parse(
snakefile,
self,
@@ -1185,8 +1187,8 @@ def include(

exec(compile(code, snakefile.get_path_or_uri(), "exec"), self.globals)

- if not overwrite_first_rule:
-     self.first_rule = first_rule
+ if not overwrite_default_target:
+     self.default_target = default_target
self.included_stack.pop()

def onstart(self, func):
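Reviewer note on `include()`: the save-and-restore around the default target can be modelled with the sketch below. `WorkflowSketch` is a hypothetical, heavily simplified stand-in; a list of rule names takes the place of parsing and executing a snakefile, and only the bookkeeping visible in this diff is reproduced.

```python
class WorkflowSketch:
    """Hypothetical model of the default-target bookkeeping in include()."""

    def __init__(self):
        self.default_target = None

    def add_rule(self, name):
        # The first rule registered becomes the default target if none is set.
        if not self.default_target:
            self.default_target = name

    def include(self, rule_names, overwrite_default_target=False):
        saved = self.default_target
        for name in rule_names:  # stands in for executing the included file
            self.add_rule(name)
        if not overwrite_default_target:
            # Restore whatever was set before the include.
            self.default_target = saved


wf = WorkflowSketch()
wf.include(["helper"])  # plain include: must not claim the default target
wf.add_rule("all")      # first rule of the including file
print(wf.default_target)
```

In this model a plain include leaves the default target untouched, so the including file's own first rule still wins, while a caller passing `overwrite_default_target=True` (as the top-level `snakemake()` call and `use_rules` do in this diff) lets the included file decide.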
@@ -1558,11 +1560,20 @@ def decorate(ruleinfo):
self.cache_rules.add(rule.name)
elif not (ruleinfo.cache is False):
raise WorkflowError(
- "Invalid argument for 'cache:' directive. Only true allowed. "
+ "Invalid argument for 'cache:' directive. Only True allowed. "
"To deactivate caching, remove directive.",
rule=rule,
)

+ if ruleinfo.default_target is True:
+     self.default_target = rule.name
+ elif not (ruleinfo.default_target is False):
+     raise WorkflowError(
+         "Invalid argument for 'default_target:' directive. Only True allowed. "
+         "Do not use the directive for rules that shall not be the default target. ",
+         rule=rule,
+     )

ruleinfo.func.__name__ = "__{}".format(rule.name)
self.globals[ruleinfo.func.__name__] = ruleinfo.func
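Reviewer note on the directive handling: a minimal sketch of the resolution order implied by `add_rule` and the new branch in `decorate`. `resolve_default_target` is a hypothetical helper, and the local `WorkflowError` is only a stand-in for Snakemake's exception class.

```python
class WorkflowError(Exception):
    """Stand-in for Snakemake's WorkflowError (simplified, hypothetical)."""


def resolve_default_target(rules):
    """Return the default target for (name, default_target_value) pairs.

    The pairs are given in definition order; False models a rule that
    does not use the directive at all.
    """
    default_target = None
    for name, flag in rules:
        if not default_target:
            default_target = name  # fallback: first rule defined
        if flag is True:
            default_target = name  # explicit directive overrides the fallback
        elif flag is not False:
            raise WorkflowError(
                "Invalid argument for 'default_target:' directive. "
                "Only True allowed."
            )
    return default_target


print(resolve_default_target([("a", False), ("b", True)]))
print(resolve_default_target([("a", False), ("b", False)]))
```

Because each flagged rule simply assigns its own name, the last rule marked `default_target: True` wins in this sketch if several are flagged; the diff shows no check against multiple flagged rules.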

@@ -1623,6 +1634,13 @@ def decorate(ruleinfo):

return decorate

+ def default_target_rule(self, value):
+     def decorate(ruleinfo):
+         ruleinfo.default_target = value
+         return ruleinfo

+     return decorate

def message(self, message):
def decorate(ruleinfo):
ruleinfo.message = message
11 changes: 11 additions & 0 deletions tests/test_default_target/Snakefile
@@ -0,0 +1,11 @@
rule a:
    output:
        "{sample}.txt"
    shell:
        "echo test > {output}"


rule b:
    input:
        expand("{sample}.txt", sample=[1, 2])
    default_target: True
1 change: 1 addition & 0 deletions tests/test_default_target/expected-results/1.txt
@@ -0,0 +1 @@
test
1 change: 1 addition & 0 deletions tests/test_default_target/expected-results/2.txt
@@ -0,0 +1 @@
test
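Reviewer note on the test case: rule `b` is not the first rule, so the two expected result files are only produced if the `default_target` directive takes effect. The sketch below shows which files rule `b` requests. `expand_sketch` is a hypothetical, single-wildcard simplification and not Snakemake's `expand()`.

```python
def expand_sketch(pattern, **wildcards):
    """Simplified stand-in for expand() with exactly one wildcard list."""
    name, values = next(iter(wildcards.items()))
    # Format the pattern once per value of the single wildcard.
    return [pattern.format(**{name: value}) for value in values]


targets = expand_sketch("{sample}.txt", sample=[1, 2])
print(targets)
```

The resulting names correspond to `expected-results/1.txt` and `expected-results/2.txt`, each created by rule `a` with the content `test`.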
5 changes: 5 additions & 0 deletions tests/tests.py
@@ -1399,5 +1399,10 @@ def test_conda_named():
run(dpath("test_conda_named"), use_conda=True)


+ @skip_on_windows
+ def test_default_target():
+     run(dpath("test_default_target"))


def test_cache_multioutput():
run(dpath("test_cache_multioutput"), shouldfail=True)
