Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add rerun-params-changed command line option #1107

Closed

Conversation

timtroendle
Copy link
Contributor

@timtroendle timtroendle commented Jul 26, 2021

Description

This command line options allows to automatically rerun all rules for which parameters have changed.

Fixes #976.

QC

  • The PR contains a test case for the changes or the changes are already covered by an existing test case.
  • The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

This commandline options allows to automatically rerun all rules for which parameters have changed.

Fixes snakemake#976.
@timtroendle timtroendle changed the title Add rerun-params-changed command line option feat: Add rerun-params-changed command line option Jul 26, 2021
@timtroendle
Copy link
Contributor Author

This PR lacks the test folder ./tests/test_rerun_params_changed as it is ignored by ./.gitignore. I am not sure why that's the case, but I have it locally and I can force add it.

@johanneskoester
Copy link
Contributor

This PR lacks the test folder ./tests/test_rerun_params_changed as it is ignored by ./.gitignore. I am not sure why that's the case, but I have it locally and I can force add it.

Yes, you need to force add it. The .gitignore needs some refinement.

Copy link
Contributor

@johanneskoester johanneskoester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about this. We already have --list-params-changes, which can be used with snakemake -R $(snakemake --list-params-changes). Is this PR to add more convenience?

@timtroendle
Copy link
Contributor Author

Ok. I've force added the test folder.

This option is about convenience, but not only that. snakemake -R $(snakemake --list-params-changes) runs all rules whose params have changed. snakemake --rerun-params-changed runs all those rules, but also rules that need to run for other reasons. Therefore, it's an option one can use in each execution of Snakemake. In my workflow, I don't see a reason why I would not have this option enabled all the time.

@johanneskoester
Copy link
Contributor

Ok. I've force added the test folder.

This option is about convenience, but not only that. snakemake -R $(snakemake --list-params-changes) runs all rules whose params have changed.

Actually not, -R only forces the given files, the rest of the workflow is executed as well if necessary. Nevertheless, I agree that it might be a reasonable functionality for convenience, for some even a good default.

@johanneskoester
Copy link
Contributor

Still needs formatting with black.

@sonarcloud
Copy link

sonarcloud bot commented Aug 23, 2021

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

No Coverage information No Coverage information
0.0% 0.0% Duplication

@sonarcloud
Copy link

sonarcloud bot commented Sep 24, 2021

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

No Coverage information No Coverage information
0.0% 0.0% Duplication

Copy link
Contributor

@johanneskoester johanneskoester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've thought about this further, and I think we should change this into a flag --rerun-changed, which would include apart from params also input file changes and code changes (using the code that is already there for --list-code-changes and --list-input-changes). Would you like to modify this accordingly?

@timtroendle
Copy link
Contributor Author

Sounds good to me and yes, I can do that but will need some time.

@pmginacio
Copy link

I would like to join this discussion to suggest the follwing.

I used extensively other workflow managers in the past and I started using snakemake recently. I miss that snakemake triggers rules when inputs, code or parameters change, so I believe this pull request is certainly very useful for me.

I agree with '--rerun-params-changed' and '--rerun-changed' as they would be very welcome additions.

I often make changes to existing workflows and I would like to rely on snakemake to detect these changes and ensure that the outputs are up-to-date. However, once in a while, I make some changes which I know are inconsequential, or for some other reason I do not want to recompute all the dependencies of those changes.

Assuming that a flag named '--rerun-changes' implies actually running the rules, I believe a flag '--trigger-changed' (also possible '--trigger-params-changed') would be more generally useful. Such a flag would simply modify the decision algorithm to trigger rules whose inputs/params/code changed.

Consider the following case which I find myself often,
I have changed a parameter/input/code for a given rule. Before actually running anything, first I would like to,
a) preview which files will be recomputed, For this, a combination of --trigger-changes and --dry-run would suffice. --trigger-changes would pick up on all the out-of-date rules and --dry-run would work as usual.
b) decide whether to actually run,
c) or simply touch the files marked for recalculation, which could be accomplished with a combination of '--trigger-changes' and '--touch'

It is possible that what I propose is exactly what was meant with '--rerun-changes', and in that case I simply propose a semantic (different name) change to '--trigger-changes' which does not imply running any rules.

Thanks for the great work.

@timtroendle
Copy link
Contributor Author

Hi @pmginacio, thanks for your comment. My intention is to solve the exact problem you are describing -- good to hear that others are facing this too.

If I understand you correctly, a lot of what you are describing is already possible today. --list-params-changes lists all rules whose parameter values have changed and therefore it is close to your --trigger-params-changes --dryrun, right? There are similar arguments for code and input changes, too.

What's missing right now is the possibility to always rerun rules with params/code/input changes, and that's what this PR is for.

Is my assumption correct? Would this be of help for you?

@johanneskoester
Copy link
Contributor

johanneskoester commented Oct 25, 2021

@pmginacio we are indeed all on the same page. I get your point about the trigger naming. However, Snakemake has other rerun-* parameters that work exactly as you describe in combination with dryrun or touch, so I think naming it rerun-* should be fine (as a dryrun e.g. is also a run in Snakemake world, just not one that actually runs the executable code of the rules).

We could of course rename all of these parameters, but that would be an unnecessary breaking change in the interface that I'd like to avoid. Hence, I'd rather stick with the current naming scheme for consististency.

@pmginacio
Copy link

@timtroendle glad to hear from you. Your assumption is correct, I find it useful to have a flag that always triggers rules for changes in params/code/input

@johanneskoester I agree, better to be consistent.

Thank you for your feedback, keep up the good work.

@johanneskoester
Copy link
Contributor

I have now started to work on this, and actually decided to make these reruns the default. Using --touch, one always has a way to force ignoring such triggers. Hence, closing this in favor of #1663.

@johanneskoester
Copy link
Contributor

Thanks a lot for your help here and the initial work!

@timtroendle
Copy link
Contributor Author

I am glad to hear that. Sorry for never finding the time to finish this. I am looking forward to testing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Command line option to always rerun with updated code or parameters
3 participants