Continue resource handling after being cancelled before #221

lucas-koehler · 2023-08-02T12:37:10Z

Is your feature request related to a problem? Please describe.

Follow up of #200.

When the operator starts handling a CR (i.e. AppDefinition, Session, Workspace), it marks the resource as HANDLING. If the operator is killed (e.g. by Kubernetes due to resource constraints or an update), the resource is never set to HANDLED.
Currently, resources in HANDLING are never processed again, to avoid re-handling resources that caused unexpected crashes.
However, unexpected crashes can be caught (and should already be after #200).

Describe the solution you'd like

When a resource with status HANDLING is processed again, the operator needs to consider that encountering an already completed step does not mean that handling was finished. Instead, the operator should only execute the missing steps.
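A minimal sketch of the idea, not the operator's actual code: the names `Resource`, `Status`, `handle`, and the step names are hypothetical, and an in-memory set of completed steps stands in for whatever the operator would really check (for example, whether a Kubernetes object already exists).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Set, Tuple


class Status(Enum):
    NEW = "NEW"
    HANDLING = "HANDLING"
    HANDLED = "HANDLED"


@dataclass
class Resource:
    # Hypothetical stand-in for a CR such as a Workspace or Session.
    name: str
    status: Status = Status.NEW
    completed_steps: Set[str] = field(default_factory=set)


Step = Tuple[str, Callable[[Resource], None]]


def handle(resource: Resource, steps: List[Step]) -> List[str]:
    """Run only the steps that are still missing; return the steps run in this call."""
    if resource.status is Status.HANDLED:
        return []  # nothing left to do

    # NEW and HANDLING take the same path: a resource found in HANDLING
    # is resumed instead of being skipped.
    resource.status = Status.HANDLING
    executed = []
    for step_name, action in steps:
        if step_name in resource.completed_steps:
            # An already completed step does not mean handling finished,
            # so skip it and keep going.
            continue
        action(resource)
        resource.completed_steps.add(step_name)
        executed.append(step_name)

    resource.status = Status.HANDLED
    return executed
```

With a resource that was interrupted after its first step, for example `Resource("ws-1", Status.HANDLING, {"service"})`, and steps named `service`, `deployment`, and `ingress`, a call to `handle` runs only `deployment` and `ingress` and then sets the status to HANDLED.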

Describe alternatives you've considered

None. Eventually, we need to be safe against the operator being stopped while handling a resource.

Cluster provider

No response

Additional information

No response


Adds support to the operator to handle AppDefinitions that are in HANDLING state. This happens when the operator was unexpectedly shut down while it was handling the AppDefinition. No logic changes in creating Kubernetes resources are necessary because the handling was already idempotent. Contributed on behalf of STMicroelectronics.

The Workspace handling already makes sure steps are not executed twice. Thus, we can simply try handling again. Contributed on behalf of STMicroelectronics.

github-actions · 2024-02-27T09:33:57Z

This issue is stale because it has been open for 180 days with no activity.

jfaltermeier · 2024-02-28T08:12:58Z

Keep open

lucas-koehler added the enhancement New feature or request label Aug 2, 2023

lucas-koehler changed the title ~~Continue resource handling when it was cancelled unexpectedly before~~ Continue resource handling after being cancelled before Aug 2, 2023

lucas-koehler mentioned this issue Aug 2, 2023

Initial crash loop detection #216

Merged

jfaltermeier added this to the OS Week 23 milestone Aug 2, 2023

lucas-koehler mentioned this issue Aug 29, 2023

Set CRs to error status if handling was interrupted before #233

Merged

lucas-koehler self-assigned this Aug 30, 2023

lucas-koehler added a commit that referenced this issue Aug 31, 2023

#221: Continue cancelled handling for Workspaces

fa589ff

The Workspace handling already makes sure steps are not executed twice. Thus, we can simply try handling again. Contributed on behalf of STMicroelectronics.

github-actions bot added the stale label Feb 27, 2024

github-actions bot removed the stale label Feb 28, 2024
