Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: trigger rerun when new files are yield by an input function #1462

Closed

Conversation

cclienti
Copy link
Contributor

@cclienti cclienti commented Mar 7, 2022

Description

Added the "missing" input flag to trigger rerun when new files are yield by an input function. Such flag is forwarded to all elements yield by the input function.

QC

  • The PR contains a test case for the changes or the changes are already covered by an existing test case.
  • The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

@cclienti cclienti force-pushed the fix/update-rule-missing-file branch from 016f8b3 to 057d01a Compare March 7, 2022 14:26
@cclienti cclienti changed the title Added missing new files in reason.updated_input during DAG construction fix: added missing input files in reason.updated_input in dag.py Mar 7, 2022
@cclienti cclienti force-pushed the fix/update-rule-missing-file branch 2 times, most recently from 326bf7c to 6771da4 Compare March 8, 2022 07:39
@cclienti cclienti marked this pull request as draft March 8, 2022 14:37
snakemake/dag.py Outdated Show resolved Hide resolved
@cclienti cclienti changed the title fix: added missing input files in reason.updated_input in dag.py feat: trigger rerun when new files are yield by an input function Mar 8, 2022
@cclienti cclienti force-pushed the fix/update-rule-missing-file branch 6 times, most recently from f568d54 to 72a1c17 Compare March 9, 2022 13:38
@cclienti cclienti marked this pull request as ready for review March 9, 2022 13:40
Added the "missing" input flags to trigger rerun when new files are
yield by an input function. Such flag is forwarded to all elements yield
by the input function.
@cclienti cclienti force-pushed the fix/update-rule-missing-file branch from 72a1c17 to ceb3ba9 Compare March 10, 2022 15:53
@sonarcloud
Copy link

sonarcloud bot commented Mar 10, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@johanneskoester
Copy link
Contributor

Hi, thanks a lot for the proposal. However, I think Snakemake has a mechanism for that purpose already, which is --list-input-changes. The latest versions of Snakemake already warn you in such cases at the end of a dryrun and suggest a command to trigger a rerun. Hence, there is not even manual flagging needed as you have implemented it here. What do you think?

@cclienti
Copy link
Contributor Author

Thank you for the response. I agree and I tried the proposed mechanism, but sometimes the command line length is too long because of thousands new files. Moreover the DAG is analyzed two times which can be long for some large workflows (>100k nodes).

So, in such case I was forced to create a snakemake-update python code that parses the arguments (using the internal API) to add the "list-input-changes" option and run snakemake to get the target python list. Then the snakemake can run a second times with the python list of changes. A same class solution would be to pass to snakemake command line the targets using a file, but the two run mechanism would be the same.

I was thinking that adding a flag would be nicer solution for such use case (only one snakemake run). But maybe I did not see the big picture or implemented things correctly.

@johanneskoester
Copy link
Contributor

I agree about the wasting of runtime. There is anyway the plan to implement these auxilliary change detections as an arg like --rerun-changes params input code. The mechanism would be basically the same as now, but no recomputation would be needed because it can happen directly before the needrun determination in the DAG. Would that be a solution to your problem?

@johanneskoester
Copy link
Contributor

See also here: #1107

So, maybe it is time to finally implement that now. I'll try to come to that ASAP.

@cclienti
Copy link
Contributor Author

I agree about the wasting of runtime. There is anyway the plan to implement these auxilliary change detections as an arg like --rerun-changes params input code. The mechanism would be basically the same as now, but no recomputation would be needed because it can happen directly before the needrun determination in the DAG. Would that be a solution to your problem?

Yes this is also a good solution 👍

@johanneskoester
Copy link
Contributor

This will now be handled differently but as desired in PR #1663. Thanks for the help!

@cclienti
Copy link
Contributor Author

Thank you for mentioning me in the release note 7.8.0. You just made a little error on my nick, this is cclienti, not cclienty :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants