This is more of a feature request or clarification than a bug.
I need to checkpoint a container that was started with the `apptainer run ...` command rather than the `apptainer instance start/run` command. The container runs a very long analysis, and it would be helpful to checkpoint it from time to time so that, if something goes wrong with the container runtime or the host system, the analysis can be restarted from the last checkpoint instead of from the beginning.
The problem is that the `apptainer instance run/start` command works as a service (as expected), so:

1. It does not stop when the runscript finishes.
2. The output of the runscript is not really accessible unless the runscript itself writes it to a file.
I can see the issue with using `apptainer run` in this case: it runs in the foreground, so a checkpoint-saving loop cannot be set up after launching the container unless the container is sent to the background. Still, that would solve both issues mentioned above...
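To make the intended workflow concrete, here is a minimal sketch of the "send it to the background, then loop" pattern described above. It is only an illustration under stated assumptions: the function name `run_with_periodic_checkpoints` is hypothetical, and both commands are stand-ins. In particular, the checkpoint command is a placeholder, since whether any command can checkpoint a container started with `apptainer run` is exactly the open question here.

```python
import subprocess
import sys


def run_with_periodic_checkpoints(run_cmd, checkpoint_cmd, interval_s):
    """Run run_cmd in the background and invoke checkpoint_cmd every interval_s seconds.

    Returns (exit code of run_cmd, number of checkpoint attempts).
    """
    # Start the long-running job without blocking; its output goes to this
    # process's stdout/stderr, so it stays visible.
    proc = subprocess.Popen(run_cmd)
    attempts = 0
    while True:
        try:
            # Wait up to interval_s for the job to finish on its own.
            return proc.wait(timeout=interval_s), attempts
        except subprocess.TimeoutExpired:
            # Job is still running: trigger a checkpoint (placeholder command).
            subprocess.run(checkpoint_cmd, check=False)
            attempts += 1


if __name__ == "__main__":
    # Stand-ins so the sketch runs anywhere; they are NOT real Apptainer calls.
    job = [sys.executable, "-c", "import time; time.sleep(1)"]
    checkpoint = [sys.executable, "-c", "pass"]
    code, n = run_with_periodic_checkpoints(job, checkpoint, interval_s=0.2)
    print(f"job exited with {code} after {n} checkpoint attempts")
```

The loop ends as soon as the job exits, which would address the first issue above, and the job's output is inherited rather than hidden, which would address the second. What is missing is a real checkpoint command to put in place of the placeholder.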
Unfortunately, I do not see how a container started with `apptainer run` can be checkpointed, as this command does not have the `--dmtcp_...` options to launch or restart it and thus does not allow a checkpoint location to be associated with it.
Is it even possible to do it that way?
Thank you very much in advance.