Snakemake version
6.12.3. The bug seems to have been introduced in version 6.5.1 by commit 4dbb7ad.
Describe the bug
I have a checkpoint, trimming, followed by the rules align and sort, which run together as a group job. With 2 cores everything works. With 4 cores, I think the following happens: trimming 1 finishes -> re-evaluate DAG -> align+sort starts -> trimming 2 finishes -> re-evaluate DAG -> CRASH, because the align+sort output already exists but the rules haven't finished yet.
Minimal example
Running this with 2 cores works, but running it with 4 cores causes a crash. The crash isn't guaranteed to happen, so you might need to re-run this once or twice.
snakemake --cores 4
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 4
Rules claiming more threads will be scaled down.
Job stats:
job         count    min threads    max threads
--------  -------  -------------  -------------
align           2              1              1
all             1              1              1
sort            2              1              1
trimming        2              1              1
total           7              1              1
Select jobs to execute...
[Wed Jan 19 09:34:38 2022]
checkpoint trimming:
output: output/trimmed/1.txt
jobid: 3
wildcards: sample=1
resources: tmpdir=/tmp
Downstream jobs will be updated after completion.
[Wed Jan 19 09:34:38 2022]
checkpoint trimming:
output: output/trimmed/2.txt
jobid: 6
wildcards: sample=2
resources: tmpdir=/tmp
Downstream jobs will be updated after completion.
[Wed Jan 19 09:34:39 2022]
Finished job 3.
1 of 7 steps (14%) done
Updating job align.
Select jobs to execute...
[Wed Jan 19 09:34:39 2022]
group job bdb68102-7e52-4fa6-ac1d-6c3eb711d5fd (jobs in lexicogr. order):
[Wed Jan 19 09:34:39 2022]
rule align:
input: output/trimmed/1.txt
output: output/aligned/1.txt (pipe)
jobid: 2
wildcards: sample=1
resources: tmpdir=/tmp
[Wed Jan 19 09:34:39 2022]
rule sort:
input: output/aligned/1.txt
output: output/aligned_and_sort/1.txt
jobid: 1
wildcards: sample=1
resources: tmpdir=/tmp
[Wed Jan 19 09:34:39 2022]
Finished job 6.
2 of 7 steps (29%) done
Updating job align.
Select jobs to execute...
[Wed Jan 19 09:34:39 2022]
group job bdb68102-7e52-4fa6-ac1d-6c3eb711d5fd (jobs in lexicogr. order):
[Wed Jan 19 09:34:39 2022]
rule align:
input: output/trimmed/1.txt
output: output/aligned/1.txt (pipe)
jobid: 2
wildcards: sample=1
resources: tmpdir=/tmp
[Wed Jan 19 09:34:39 2022]
rule sort:
input: output/aligned/1.txt
output: output/aligned_and_sort/1.txt
jobid: 1
wildcards: sample=1
resources: tmpdir=/tmp
Warning: the following output files of rule align were not present when the DAG was created:
{'output/aligned/1.txt'}
Warning: the following output files of rule sort were not present when the DAG was created:
{'output/aligned_and_sort/1.txt'}
[Wed Jan 19 09:34:41 2022]
Finished job 2.
[Wed Jan 19 09:34:41 2022]
Finished job 1.
4 of 7 steps (57%) done
Select jobs to execute...
[Wed Jan 19 09:34:41 2022]
group job bdb68102-7e52-4fa6-ac1d-6c3eb711d5fd (jobs in lexicogr. order):
[Wed Jan 19 09:34:41 2022]
rule align:
input: output/trimmed/2.txt
output: output/aligned/2.txt (pipe)
jobid: 5
wildcards: sample=2
resources: tmpdir=/tmp
[Wed Jan 19 09:34:41 2022]
rule sort:
input: output/aligned/2.txt
output: output/aligned_and_sort/2.txt
jobid: 4
wildcards: sample=2
resources: tmpdir=/tmp
[Wed Jan 19 09:34:41 2022]
Error in group job bdb68102-7e52-4fa6-ac1d-6c3eb711d5fd:
[Wed Jan 19 09:34:41 2022]
Error in rule sort:
jobid: 1
output: output/aligned_and_sort/1.txt
shell:
touch output/aligned_and_sort/1.txt; sleep 1
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
[Wed Jan 19 09:34:41 2022]
Error in rule align:
jobid: 2
output: output/aligned/1.txt (pipe)
shell:
touch output/aligned/1.txt; sleep 1
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Removing output files of failed job sort since they might be corrupted:
output/aligned_and_sort/1.txt
Traceback (most recent call last):
File "/home/sande/miniconda3/envs/seq2science/lib/python3.8/site-packages/snakemake/__init__.py", line 699, in snakemake
success = workflow.execute(
File "/home/sande/miniconda3/envs/seq2science/lib/python3.8/site-packages/snakemake/workflow.py", line 1073, in execute
success = self.scheduler.schedule()
File "/home/sande/miniconda3/envs/seq2science/lib/python3.8/site-packages/snakemake/scheduler.py", line 441, in schedule
self._error_jobs()
File "/home/sande/miniconda3/envs/seq2science/lib/python3.8/site-packages/snakemake/scheduler.py", line 557, in _error_jobs
self._handle_error(job)
File "/home/sande/miniconda3/envs/seq2science/lib/python3.8/site-packages/snakemake/scheduler.py", line 615, in _handle_error
self.running.remove(job)
KeyError: JobGroup(bdb68102-7e52-4fa6-ac1d-6c3eb711d5fd,frozenset({sort, align}))
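The traceback ends in a set removal that fails. One plausible reading of the log above is that the same group was scheduled twice and then reported as failed twice, so the second removal found nothing left to remove. The following is a toy model of that pattern only, not Snakemake's scheduler code; the names running, start and handle_error are hypothetical.

```python
# Toy model, NOT Snakemake's scheduler: all names here are hypothetical.
running = set()


def start(job):
    # Adding an element that is already in a set is a silent no-op,
    # so scheduling the same group twice leaves only one entry.
    running.add(job)


def handle_error(job):
    # set.remove raises KeyError when the element is absent.
    running.remove(job)


group = "align+sort group"
start(group)
start(group)  # the same group is scheduled again while still running

handle_error(group)  # first failure report: the entry is removed
try:
    handle_error(group)  # second report for the same group
    second_removal = "ok"
except KeyError as exc:
    second_removal = f"KeyError: {exc}"

print(second_removal)
```

In this model the second call fails because the set tracks membership, not how many times a job was started.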
…he same group id to different groups; bug that accidentally added already running groups of the list of ready jobs (issue #1331) (#1332)
* issue 1331
* Update Snakefile
* Update Snakefile
* fix: bug in pipe group handling that led to multiple assignments of the same group id to different groups; bug that accidentally added already running groups of the list of ready jobs
* fmt
* skip on win
Co-authored-by: Johannes Köster <johannes.koester@tu-dortmund.de>
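The commit message names two problems: one group id assigned to different groups, and already running groups being added to the list of ready jobs. A minimal sketch of both ideas follows, assuming a simplified model in which groups are plain collections of rule names. It is not the actual patch; assign_group_ids and select_ready are hypothetical helpers.

```python
import uuid


# Hypothetical helpers illustrating the ideas in the commit message;
# this is NOT the code from the actual fix.
def assign_group_ids(groups):
    """Give every group its own freshly generated id."""
    return {str(uuid.uuid4()): frozenset(group) for group in groups}


def select_ready(candidates, running):
    """Return the candidates that are not already running, in order."""
    return [job for job in candidates if job not in running]


# Two pipe groups, one per sample, each made of an align and a sort job.
groups = [("align-1", "sort-1"), ("align-2", "sort-2")]
ids = assign_group_ids(groups)

group_ids = list(ids)
running = {group_ids[0]}  # the first group has already been started
ready = select_ready(group_ids, running)

print(len(ids), ready == [group_ids[1]])
```

With distinct ids, the sample 1 and sample 2 groups can no longer be confused with each other, and filtering against the running set keeps a group from being started a second time.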