Attempt to create files a second time in a single snakemake run #884
I finally tracked down the bug! In short, I forgot to correctly specify input files from a
# Snakefile
outdir = "out"
count_file = f"{outdir}/count"
part_file = f"{outdir}/part-{{n}}.txt"
merged_file = f"{outdir}/all.txt"
def get_count():
    try:
        with open(count_file) as cf:
            return int(cf.read())
    except FileNotFoundError:
        # this makes sure the DAG is connected before
        # `make_count` has been executed
        return 1

rule __default__:
    input: merged_file, count_file

checkpoint make_count:
    output: count_file
    # `run:` (not `shell:`) because the body is Python code
    run:
        with open(str(output), "w") as cf:
            from random import randint
            print(randint(2, 10), file=cf)

rule make_part:
    output: touch(part_file)

rule merge_parts:
    input:
        # NOTE missing `count_file` here
        lambda _: [part_file.format(n=n) for n in range(get_count())]
    output: protected(merged_file)
    shell: "cat {input} > {output}"

Produce the error with these commands:

snakemake -j1
# this is only required in this example; I don't know how the mechanics in the original issue are
touch out/part-0.txt
snakemake -j1

This is the closest I could get. Here is the explanation of what happens:
Since I originally did not call |
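To make the role of the fallback in `get_count` concrete, here is a minimal plain-Python sketch of how the input list of `merge_parts` changes once the count file exists. It only mirrors the helper from the Snakefile above and does not model Snakemake's scheduling; the names `merge_inputs` and `demo` are hypothetical and exist only for this illustration.

```python
import os
import tempfile


def get_count(count_file):
    """Mirror of the Snakefile helper: fall back to 1 if the file is missing."""
    try:
        with open(count_file) as cf:
            return int(cf.read())
    except FileNotFoundError:
        return 1


def merge_inputs(count_file, part_pattern):
    """Hypothetical helper: the part files `merge_parts` would request right now."""
    return [part_pattern.format(n=n) for n in range(get_count(count_file))]


def demo(count=4):
    """Return the number of requested inputs before and after the count file exists."""
    with tempfile.TemporaryDirectory() as outdir:
        count_file = os.path.join(outdir, "count")
        part_pattern = os.path.join(outdir, "part-{n}.txt")

        # Before the checkpoint has run, the fallback value 1 is used.
        before = merge_inputs(count_file, part_pattern)

        # Simulate what `make_count` does: write the count to the file.
        with open(count_file, "w") as cf:
            print(count, file=cf)

        # The same function now returns a different list of inputs.
        after = merge_inputs(count_file, part_pattern)
        return len(before), len(after)


if __name__ == "__main__":
    before, after = demo()
    print(f"inputs before checkpoint: {before}, after: {after}")
```

The point of the sketch is that the set of requested inputs depends on a file that `merge_parts` never declares, which is what the `NOTE` in the Snakefile refers to. In Snakemake, a checkpoint's output is normally read from inside an input function via `checkpoints.<name>.get(...)`, which makes the dependency explicit.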
- `propagate_mask_back_to_reference` did not declare its dependency on the reads DB which caused the Snakemake DAG to get mangled - the issue was reported and resolved at snakemake/snakemake#884
Snakemake version
Snakemake 5.32.1 powered by Python 3.9.1 on two different machines:
Describe the bug
The bug appears only on my laptop and not on our cluster, even though the environments are pretty similar.
The problem is that after running the local rule `make_merge_config`, Snakemake suddenly wants to re-run the already finished checkpoint rule `collect`. As expected, Snakemake then crashes with a `ProtectedOutputException` because the output of `collect` is protected. But it should not run `collect` at that point because the output `workdir/pile-ups.db` is already up-to-date.
After lifting the protection from all outputs (`chmod -R +w workdir`) and restarting the workflow, it takes up operation at `collect` and finishes as expected.
Logs
Minimal example
It is very hard to reduce the 1500-line Snakefile to a minimal example, but I can share a small example of my complete workflow (431M). Please follow the instructions in the included `README.md`. It takes 5 minutes until the error occurs on my laptop with an Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz. It will require a few GB of memory. Sorry.
Additional context
This might be related to #823.