Make pipeline truly resumable #1219

Closed
aghr opened this issue Feb 21, 2024 · 2 comments
Labels
question Further information is requested
Milestone

Comments

aghr commented Feb 21, 2024

Description of feature

When the pipeline has finished successfully once and is re-run with -resume, it does not truly resume. I would expect the pipeline to recognize that all steps have already succeeded and to reuse the cached results without doing any computation. In practice, however, the mapping indices from Salmon (and maybe STAR) are not saved when running in star_salmon mode. As a consequence, these indices have to be rebuilt under -resume, and because the upstream index-construction step was re-run, all downstream steps (mapping, etc.) are re-run as well. This wastes computational resources, and users have to wait for results to be recomputed that had already been computed successfully.

@MatthiasZepper
Member

MatthiasZepper commented Feb 21, 2024

That is not really what the -resume functionality is meant for. It enables fixing issues along the way that have caused a workflow execution to grind to a halt. It is not meant to preserve computation results/assets for entirely different runs once the execution finished successfully.

Please use the --save_reference parameter and pass the resulting index paths as input parameters (--star_index, --salmon_index) for subsequent runs.

@drpatelh drpatelh added question Further information is requested and removed enhancement labels May 13, 2024
@drpatelh drpatelh modified the milestone: 3.15.0 May 13, 2024
@drpatelh
Member

Agree. Nextflow is inherently resumable only at the run level, not across multiple runs. If you want to store assets for the longer term, most pipelines have a parameter like --save_reference that writes these files to more permanent storage for re-use.
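
As a rough sketch of the workflow suggested in this thread, the two runs could look like the following. The pipeline name (nf-core/rnaseq), the samplesheet, the output directories, and the genome/index/ location of the saved indices are assumptions for illustration and should be checked against your own output directory. Other required options, such as the reference genome inputs, are left out. The script only prints the commands instead of running them.

```shell
#!/bin/sh
# Dry-run sketch: the commands are printed, not executed.
# Pipeline name, samplesheet and directories are hypothetical placeholders.
PIPELINE="nf-core/rnaseq"          # assumed pipeline
OUTDIR="results_run1"              # hypothetical output directory of the first run
INDEX_DIR="$OUTDIR/genome/index"   # assumed location of the saved indices

# Run 1: build the indices and keep them by adding --save_reference.
RUN1="nextflow run $PIPELINE --input samplesheet.csv --outdir $OUTDIR --aligner star_salmon --save_reference"

# Run 2: pass the saved indices explicitly so they are not rebuilt.
RUN2="nextflow run $PIPELINE --input samplesheet.csv --outdir results_run2 --aligner star_salmon --star_index $INDEX_DIR/star --salmon_index $INDEX_DIR/salmon"

echo "$RUN1"
echo "$RUN2"
```

Copying the saved index directories to a stable location outside the run's output directory before the second run keeps them available even if the first run's results are later cleaned up.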
