Strange behavior when using checkpoints in cloud execution #2021
Comments
Hi @cademirch, I'm trying to adapt a test from our test suite so that it contains some of the functionality of your example. How's this? Unlike the original test, I've added a directory output and aggregated over the files in that directory. If this doesn't reproduce the issues you've been seeing, we can always go back and try something else, but at least we'll have ruled this out.
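For anyone following along, here is a rough plain-Python sketch of what "aggregating over files in a directory output" amounts to. It is not the test file from this thread: the helper name `aggregate_inputs` and the `*.txt` pattern are made up for illustration. In an actual Snakefile, the directory would typically come from `checkpoints.<name>.get(**wildcards).output[0]` inside an input function rather than being passed in directly.

```python
import tempfile
from pathlib import Path


def aggregate_inputs(checkpoint_dir, pattern="*.txt"):
    """List the files a checkpoint left in its output directory.

    Hypothetical helper: in a Snakefile, an input function would obtain
    the directory from checkpoints.<name>.get(**wildcards).output[0].
    """
    # The set of files is only known after the checkpoint has run,
    # so it has to be discovered by listing the directory.
    return sorted(str(p) for p in Path(checkpoint_dir).glob(pattern))


if __name__ == "__main__":
    # Simulate a checkpoint whose number of outputs is only known at runtime.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a", "b", "c"):
            (Path(tmp) / f"{name}.txt").write_text(name)
        print(aggregate_inputs(tmp))
```

The point of the sketch is only that the aggregating step cannot know its inputs until the directory exists, which is why the DAG has to be re-evaluated after the checkpoint finishes.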
Thanks for putting this test file together @aryarm. Sorry for the delay, but I've run it and this also almost replicates the behavior above. Here is the output from the GLS log of the execution of the
So, despite this being the execution of So I think this boils down to That being said, I'm not sure what the solution is here. Perhaps somehow Snakemake needs to know to bring the Sidenote: There is a
Solves this error.
So are there any other errors that you're getting that this minimal example doesn't replicate?
I'm not sure what you mean here. Is
Maybe we should start by tackling the duplicated prefix issue. In theory, we shouldn't have to change our I've started on a branch that we can draft a PR for. I'll create the PR once we're confident that it covers all of the weird behavior you've been seeing.
Other than the
Sorry, this is poor explanation on my part. IIRC
I'm not super sure about this actually. I believe that the duplicated prefix here is because in your Thanks for following up on this!
Ok. Then I'm going to move forward with that example as a test for the pull request.
Yes, I agree - that's definitely what is happening here. But I'm still not sure that it should. Thanks for further explaining the point about
OK, we can continue working on this in PR #2108. @cademirch, let me know if you have any suggestions for changes to my description of the issue or to the code I committed!
@aryarm Thank you for putting together the PR, this is very helpful! Hopefully we can get to the bottom of this.
Closed because this is covered in the docs example 🤦‍♂️
Snakemake version
7.18.2
Describe the bug
I am trying to run a workflow with a checkpoint in the cloud. The workflow runs as expected when executed locally. However, in the cloud (using --google-lifesciences), it seems that on the checkpoint aggregating step (see below), the whole DAG is rebuilt within that cloud job.
Minimal example
run using:
snakemake --google-lifesciences --default-remote-prefix cade-test -j 10
This is the stdout/stderr from the aggregate_step job. Notice how it rebuilds the DAG:
Additional context
@johanneskoester Would appreciate your insight on this. Am I doing something really wrong or not using checkpoints as intended?