Feature request: Checkpointing #258
Comments
This is a great idea!
On Mon, Jul 15, 2019 at 10:56 AM Heather Kates <notifications@github.com> wrote:
Any chance of adding a check-pointing option to a future release? Because atram assembles multiple reference loci sequentially, if a single run is cancelled for some reason (e.g. memory, time) it is a pain to remove all the output and start from the beginning, or to change the reference only to the unassembled loci. This issue is much more challenging if a user is assembling for many samples in parallel and they stopped at different loci. If atram could automatically check whether assemblies exist for a locus and skip that reference, it would make resuming cancelled jobs much easier. Thank you!
|
I have been giving this some thought, and the only way for this to really work would be to keep the temporary files around, that is, to use analogs of the --temp-dir and --keep-temp-dir options. It would still require a fair bit of work, but at least it would be possible. So we could add a --checkpoint=/path/to/checkpoint/dir option that behaves like the --temp-dir option and automatically sets the --keep-temp-dir flag. The only difference is that --checkpoint would automatically delete the data once all iterations have completed. This would also necessitate a --clean-checkpoints option to get rid of old checkpoint data. Does this sound reasonable? |
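To make the proposal concrete, here is a minimal sketch of how the two options might be wired up. It is not aTRAM's actual argument parser: the program name `atram-sketch`, the `build_parser` and `finish_run` helpers, and the `all_iterations_completed` flag are all hypothetical and only model the behavior described above (keep the data while a run is incomplete, delete it once every iteration has finished).

```python
import argparse
import shutil
from pathlib import Path


def build_parser():
    """Hypothetical parser that models only the two proposed options."""
    parser = argparse.ArgumentParser(prog="atram-sketch")
    parser.add_argument(
        "--checkpoint", metavar="DIR", type=Path, default=None,
        help="keep temporary data in DIR so a cancelled run can be resumed")
    parser.add_argument(
        "--clean-checkpoints", action="store_true",
        help="remove old checkpoint data")
    return parser


def finish_run(checkpoint_dir, all_iterations_completed):
    """Delete checkpoint data only after a fully completed run.

    Returns True if the directory was removed, False if it was kept.
    """
    if checkpoint_dir is None or not all_iterations_completed:
        return False  # nothing to delete, or the run may need to resume
    shutil.rmtree(checkpoint_dir, ignore_errors=True)
    return True
```

A cancelled run would call `finish_run(args.checkpoint, False)` (or never reach it), so the directory survives for the next attempt.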
Thanks for thinking about this request! I think this would be OK; for most users, it would be ideal. For our purposes, though, I often run so much in parallel that this would cause storage issues, and I'd of course love something that worked from the assemblies themselves, so that it would also work with previous runs *without* this option. But yes, I think this would be a great option for general users and new users.
|
Hey Rafe,
Yes, I think this sounds reasonable. Could the checkpoint directory keep only the temp data for the current locus, plus an indicator of which loci have been done and which are still to do? That might require deleting files at each new locus, but it might ease storage. Would that be possible? We can chat about this on Friday and think about it a bit more.
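One way to picture this lighter-weight checkpoint is a small manifest of finished loci next to a scratch directory that is wiped after each locus. The sketch below is only an illustration under assumed names: the manifest file `done_loci.json`, the `current_locus` subdirectory, and the helper functions are hypothetical and not part of aTRAM.

```python
import json
import os
import shutil
from pathlib import Path

MANIFEST = "done_loci.json"  # hypothetical manifest file name
SCRATCH = "current_locus"    # hypothetical scratch subdirectory


def load_done(checkpoint_dir):
    """Return the set of loci recorded as finished (empty if no manifest)."""
    path = Path(checkpoint_dir) / MANIFEST
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))


def mark_done(checkpoint_dir, locus):
    """Record a finished locus, then drop its temporary data."""
    checkpoint_dir = Path(checkpoint_dir)
    done = load_done(checkpoint_dir)
    done.add(locus)
    # Write to a temp file and rename it, so a job killed mid-write
    # does not leave a truncated manifest behind.
    tmp = checkpoint_dir / (MANIFEST + ".tmp")
    tmp.write_text(json.dumps(sorted(done)))
    os.replace(tmp, checkpoint_dir / MANIFEST)
    # Only the current locus keeps temp data, which bounds storage use.
    shutil.rmtree(checkpoint_dir / SCRATCH, ignore_errors=True)


def remaining(loci, checkpoint_dir):
    """Return the loci still to assemble, in their original order."""
    done = load_done(checkpoint_dir)
    return [locus for locus in loci if locus not in done]
```

On restart, a run would iterate over `remaining(all_loci, checkpoint_dir)` instead of the full reference list, so storage per sample stays at roughly one locus of temp data plus a small JSON file.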
On Fri, Sep 13, 2019 at 12:04 PM Heather Kates <notifications@github.com> wrote:
Hi Rafe,
Thanks for thinking about this request! I think this would be Ok- for most users, I think it would be ideal. For our purposes, I often work on so much in parallel this would cause storage issues, plus I’d of course love something that worked from the assemblies so it would work from previous runs *without* this option. But yes I think for general users and new users this would be a great option.
Heather Rose
On Sep 13, 2019, at 1:39 PM, rafe <notifications@github.com> wrote:
I have been giving this some thought and the only way for this to really work would be if we kept the temporary files around. That is, if we use analogs to the --temp-dir and --keep-temp-dir options. It would still require a fair bit of work but at least it would be possible.
So, if we added a --checkpoint=/path/to/checkpoint/dir option that behaves like --temp-dir option and automatically creates a --keep-temp-dir flag. The only difference is that the --checkpoint would automatically delete the data if all iterations completed.
This would also necessitate a --clean-checkpoints option to get rid of old checkpoint data.
Does this sound reasonable?
|
Any chance of adding a check-pointing option to a future release? Because atram assembles multiple reference loci sequentially, if a single run is cancelled for some reason (e.g. memory, time) it is a pain to remove all the output and start from the beginning, or to change the reference only to the unassembled loci. And of course it takes unnecessary resources to reassemble loci. This issue is more challenging if a user is assembling for many samples in parallel and they stopped at different loci. If atram could automatically check whether assemblies exist for a locus and skip that reference, it would make resuming cancelled jobs much easier. Thank you!
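The skip-if-assembled behavior requested here could look roughly like the sketch below. It assumes, purely for illustration, that each finished assembly is written to a file named `<sample>.<locus>.fasta` in the output directory; that naming scheme and both helper functions are hypothetical and would need to be adapted to the real output layout.

```python
from pathlib import Path


def has_assembly(output_dir, sample, locus):
    """Return True if a non-empty assembly file exists for this locus.

    Assumes a hypothetical '<sample>.<locus>.fasta' naming scheme.
    """
    path = Path(output_dir) / f"{sample}.{locus}.fasta"
    return path.is_file() and path.stat().st_size > 0


def loci_to_assemble(output_dir, sample, loci):
    """Return the loci that still need assembling, in original order."""
    return [locus for locus in loci
            if not has_assembly(output_dir, sample, locus)]
```

Note that file existence is only a heuristic: a job killed while writing may leave a partial, non-empty file, so a real implementation would need some way to tell a finished assembly from an interrupted one.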