kubegres can't recover if all statefulsets are deleted #141

Open
adamlamar opened this issue Nov 30, 2022 · 0 comments

Starting with a healthy cluster with 3 replicas:

$ kubectl describe kubegres postgres-uaa
Status:
  Blocking Operation:
    Stateful Set Operation:
    Stateful Set Spec Update Operation:
  Enforced Replicas:            4
  Last Created Instance Index:  5
  Previous Blocking Operation:
    Operation Id:  Replica DB count spec enforcement
    Stateful Set Operation:
      Instance Index:  5
      Name:            postgres-uaa-5
    Stateful Set Spec Update Operation:
    Step Id:                   Replica DB is deploying
    Time Out Epoc In Seconds:  1669789919
Events:                        <none>

Delete the statefulsets:

$ kubectl delete sts postgres-uaa-2 postgres-uaa-4 postgres-uaa-5

The following error is seen:

Events:
  Type    Reason                                   Age                From                 Message
  ----    ------                                   ----               ----                 -------
  Normal  FailoverCannotHappenAsNoReplicaDeployed  25s (x2 over 26s)  Kubegres-controller  A failover is required for a Primary Pod as it is not healthy. However, a failover cannot happen because there is not any Replica deployed.

The error makes sense because no replica is available. However, it's unclear how to recover the cluster. Although the statefulsets were deleted, the PVCs still exist, and the database is intact.
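One way to confirm that the volumes survived, sketched as a small shell helper. The function name `list_cluster_pvcs` is hypothetical, and matching PVCs by the cluster name is an assumption about how the claims are named; adjust the pattern to whatever `kubectl get pvc` shows in your namespace.

```shell
# Hypothetical helper: list PVCs whose name contains the given cluster name,
# together with their phase (for example, Bound).
# Assumption: the PVC names include the Kubegres resource name.
list_cluster_pvcs() {
  cluster="$1"
  kubectl get pvc --no-headers \
    -o custom-columns=NAME:.metadata.name,STATUS:.status.phase \
    | grep -- "$cluster"
}
```

Calling `list_cluster_pvcs postgres-uaa` should print one line per surviving claim. If it prints nothing, either the claims are gone or the name pattern does not match.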

Using promotePod is not possible because we cannot promote a pod that is not running.
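For reference, a promotePod request would look roughly like the sketch below. It assumes the field lives at `spec.failover.promotePod` on the Kubegres resource; the helper name `promote_pod_patch` and the pod name `postgres-uaa-2-0` are hypothetical. In the situation described here the request cannot succeed, because the named pod does not exist.

```shell
# Hypothetical helper: build a JSON merge patch that asks Kubegres to
# promote the given pod. Assumes the field path spec.failover.promotePod.
promote_pod_patch() {
  printf '{"spec":{"failover":{"promotePod":"%s"}}}' "$1"
}

# Example invocation (left commented out; the pod name is illustrative only):
# kubectl patch kubegres postgres-uaa --type merge \
#   -p "$(promote_pod_patch postgres-uaa-2-0)"
```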

As a workaround, I was able to manually create a statefulset out of band and then promote the pod. However, the process was error-prone (it involved editing index labels by hand) and unclear. I'm not sure I did it right, but it seemed to work eventually.

Feature idea: maybe a promotePVC option that can start the statefulset from an existing PVC.
