workflow checkpointing

No due date 0% complete

The state of the workflow execution needs to be checkpointed, identifiable, and restorable. Possibly the trickiest part of this is interacting with the GROMACS checkpointing mechanism. This shouldn't be too hard, but there are several tasks.

  • Provide data structures as Session resources that the Context implementation can checkpoint. External code can use these data structures to maintain state.
  • Allow Context to inspect GROMACS checkpoint files to determine the timestep.
  • Allow Context to automatically manipulate simulation input files to restore MD operation to a known state.
  • Allow a configurable interval, measured in number of data events, between checkpoints of data objects. This will probably only make sense to manage from the Session code, which will interact with GROMACS.
  • Presumably through some sort of checkpoint-participant interface in GROMACS, get a signal from GROMACS when a checkpoint is made, and don't allow the workflow to proceed until a consistent set of checkpoints has been made throughout. We may prefer something more abstract, but I don't know what that is right now.
  • Initialize nodes and replay the required data events. This is probably connected to discussions on the nature of the execution graph and edges. I think the entire state of the graph consists of the edge state and the initialization values of source nodes.
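
To make the interval and restore tasks above more concrete, here is a minimal sketch of checkpointing edge state every N data events and reloading it later. Everything in it is an assumption for illustration: `EdgeCheckpointer`, its methods, and the JSON layout are hypothetical names, not part of gmxapi or GROMACS, and the sketch does not read or write GROMACS checkpoint files.

```python
import json
import os
import tempfile


class EdgeCheckpointer:
    """Hypothetical helper: persist edge state every `interval` data events."""

    def __init__(self, path, interval=10):
        if interval < 1:
            raise ValueError("interval must be a positive number of data events")
        self.path = path
        self.interval = interval
        self.events_seen = 0
        self.edge_state = {}

    def record(self, edge, value):
        """Register one data event on an edge; checkpoint when the interval elapses."""
        self.edge_state[edge] = value
        self.events_seen += 1
        if self.events_seen % self.interval == 0:
            self.write()

    def write(self):
        """Write to a temporary file, then rename, so readers never see a partial file."""
        payload = {"events_seen": self.events_seen, "edge_state": self.edge_state}
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as handle:
            json.dump(payload, handle)
        os.replace(tmp_path, self.path)

    @classmethod
    def restore(cls, path, interval=10):
        """Rebuild the helper from the last checkpoint, or start fresh if none exists."""
        restored = cls(path, interval)
        if os.path.exists(path):
            with open(path) as handle:
                payload = json.load(handle)
            restored.events_seen = payload["events_seen"]
            restored.edge_state = payload["edge_state"]
        return restored
```

With an interval of 10, a session that stops after 25 events would restore to the state at event 20, and the remaining 5 events would need to be replayed from the source nodes. Coordinating this with the timestep recorded in a GROMACS checkpoint is the open part of the design and is not attempted here.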

Some of this can be clarified through discussions on refining the semantics of data graph execution, data events, and data state.