[BUG]mogdb switchover failed #7184

Open
ahjing99 opened this issue Apr 26, 2024 · 2 comments
Labels: kind/bug (Something isn't working), Stale

@ahjing99
Collaborator

➜ ~ kbcli version
Kubernetes: v1.28.7-gke.1026000
KubeBlocks: 0.9.0-beta.15
kbcli: 0.9.0-beta.4

# Add Helm repo 
helm repo add kubeblocks-addons https://apecloud.github.io/helm-charts
# If github is not accessible or very slow for you, please use following repo instead
helm repo add kubeblocks-addons https://jihulab.com/api/v4/projects/150246/packages/helm/stable
# Update helm repo
helm repo update
# Update mogdb to enable hostnetwork
helm upgrade -i kb-addon-mogdb kubeblocks-addons/mogdb  -n kb-system --version 0.9.0

  1. Create the cluster: k apply -f cluster.yaml
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  name: mogdb-cluster
  namespace: default
spec:
  clusterDefinitionRef: mogdb
  clusterVersionRef: mogdb-5.0.5
  terminationPolicy: Delete
  componentSpecs:
  - name: mogdb
    componentDefRef: mogdb
    enabledLogs:
    - running
    serviceAccountName: kb-mogdb-cluster
    replicas: 2
    resources:
      limits:
        cpu: '0.5'
        memory: 0.5Gi
      requests:
        cpu: '0.5'
        memory: 0.5Gi
    volumeClaimTemplates:
    - name: data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 20Gi
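
Before attempting the switchover it can help to confirm that the cluster has reached the Running phase. A minimal sketch, assuming the Cluster resource exposes its phase at `.status.phase`; the helper name `cluster_phase` is hypothetical and not part of kbcli:

```shell
# Print the phase of a KubeBlocks Cluster (for example "Running").
# cluster_phase is a hypothetical helper name, not part of kbcli.
# Arguments: cluster name, namespace (defaults to "default").
cluster_phase() {
  name="$1"
  ns="${2:-default}"
  kubectl get clusters.apps.kubeblocks.io "$name" \
    --namespace "$ns" \
    --output jsonpath='{.status.phase}'
}
```

For this report the call would be `cluster_phase mogdb-cluster default`.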
  2. Run the switchover
Create the Role and RoleBinding:

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: mogdb-cluster-switchover-role
  labels:
    app.kubernetes.io/instance: mogdb-cluster
rules:
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: mogdb-cluster-switchover
  labels:
    app.kubernetes.io/instance: mogdb-cluster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: mogdb-cluster-switchover-role
subjects:
  - kind: ServiceAccount
    name: kb-mogdb-cluster
    namespace: default
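
To check that the binding above took effect, one option is to ask the API server whether the service account may exec into pods. A minimal sketch, assuming the manifests were applied in the `default` namespace; the helper name `can_exec` is hypothetical. It uses `--subresource=exec` because `kubectl auth can-i create pods/exec` would be read as a pod named `exec` rather than the exec subresource:

```shell
# Hypothetical helper: prints "yes" or "no" depending on whether the given
# service account may create the pods/exec subresource in the namespace.
# Arguments: namespace, service account name.
can_exec() {
  ns="$1"
  sa="$2"
  kubectl auth can-i create pods --subresource=exec \
    --namespace "$ns" \
    --as "system:serviceaccount:${ns}:${sa}"
}
```

For this report the call would be `can_exec default kb-mogdb-cluster`. Impersonating with `--as` needs a caller that is itself allowed to impersonate service accounts.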

➜  ~ kbcli cluster describe mogdb-cluster
Name: mogdb-cluster	 Created Time: Apr 26,2024 11:52 UTC+0800
NAMESPACE   CLUSTER-DEFINITION   VERSION       STATUS    TERMINATION-POLICY
default     mogdb                mogdb-5.0.5   Running   Delete

Endpoints:
COMPONENT   MODE        INTERNAL                                              EXTERNAL
mogdb       ReadWrite   mogdb-cluster-mogdb.default.svc.cluster.local:26000   <none>

Topology:
COMPONENT   INSTANCE                ROLE        STATUS    AZ              NODE                                                  CREATED-TIME
mogdb       mogdb-cluster-mogdb-0   primary     Running   us-central1-c   gke-yjtest-default-pool-e77a0986-5w42/10.128.15.226   Apr 26,2024 13:15 UTC+0800
mogdb       mogdb-cluster-mogdb-1   secondary   Running   us-central1-c   gke-yjtest-default-pool-e77a0986-3xfx/10.128.0.52     Apr 26,2024 13:15 UTC+0800

Resources Allocation:
COMPONENT   DEDICATED   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)   STORAGE-SIZE   STORAGE-CLASS
mogdb       false       1 / 1                1Gi / 1Gi               data:30Gi      kb-default-sc

Images:
COMPONENT   TYPE    IMAGE
mogdb       mogdb   swr.cn-north-4.myhuaweicloud.com/mogdb/mogdb:5.0.5

Data Protection:
BACKUP-REPO   AUTO-BACKUP   BACKUP-SCHEDULE   BACKUP-METHOD   BACKUP-RETENTION

Show cluster events: kbcli cluster list-events -n default mogdb-cluster
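
The switchover job decides success from the pod label `kubeblocks.io/role`, so it can be useful to watch that label directly. A minimal sketch, assuming the cluster's pods carry the label `app.kubernetes.io/instance=<cluster name>` (an assumption, not confirmed in this report); `list_roles` is a hypothetical helper name:

```shell
# List the pods of a cluster with their kubeblocks.io/role label as an
# extra column. The instance selector is an assumption about pod labels.
# Arguments: cluster name, namespace (defaults to "default").
list_roles() {
  cluster="$1"
  ns="${2:-default}"
  kubectl get pods \
    --namespace "$ns" \
    --selector "app.kubernetes.io/instance=${cluster}" \
    --label-columns kubeblocks.io/role
}
```

For this report the call would be `list_roles mogdb-cluster default`.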


➜  ~ kbcli cluster custom-ops mogdb-switchover --cluster mogdb-cluster  --component mogdb --auto-approve --candidate mogdb-cluster-mogdb-1
args: [mogdb-switchover --cluster mogdb-cluster --component mogdb --auto-approve --candidate mogdb-cluster-mogdb-1]
OpsRequest mogdb-cluster-custom-lxkqv created successfully, you can view the progress:
	kbcli cluster describe-ops mogdb-cluster-custom-lxkqv -n default

➜  ~ kbcli cluster describe-ops mogdb-cluster-custom-lxkqv -n default
Spec:
  Name: mogdb-cluster-custom-lxkqv	NameSpace: default	Cluster: mogdb-cluster	Type: Custom

Command: <none>

Status:
  Start Time:         Apr 26,2024 14:11 UTC+0800
  Completion Time:    Apr 26,2024 14:14 UTC+0800
  Duration:           2m9s
  Status:             Failed
  Progress:           1/1
                      OBJECT-KEY   STATUS   DURATION   MESSAGE
                                   Failed   2m7s       the action "switchover" of the component "mogdb" is Failed

Conditions:
LAST-TRANSITION-TIME         TYPE                 REASON                     STATUS   MESSAGE
Apr 26,2024 14:11 UTC+0800   WaitForProgressing   WaitForProgressing         True     wait for the controller to process the OpsRequest: mogdb-cluster-custom-lxkqv in Cluster: mogdb-cluster
Apr 26,2024 14:11 UTC+0800   Validated            ValidateOpsRequestPassed   True     OpsRequest: mogdb-cluster-custom-lxkqv is validated
Apr 26,2024 14:11 UTC+0800   CustomOperation      MogdbSwitchoverStarting    True     Start to handle MogdbSwitchover on the Cluster: mogdb-cluster
Apr 26,2024 14:14 UTC+0800   Failed               OpsRequestFailed           False    Failed to process OpsRequest: mogdb-cluster-custom-lxkqv in cluster: mogdb-cluster, more detailed informations in status.components

Warning Events:
TIME                         TYPE      REASON             OBJECT                                  MESSAGE
Apr 26,2024 14:13 UTC+0800   Warning   Failed             OpsRequest/mogdb-cluster-custom-lxkqv   the action "switchover" of the component "mogdb" is Failed
Apr 26,2024 14:14 UTC+0800   Warning   OpsRequestFailed   OpsRequest/mogdb-cluster-custom-lxkqv   Failed to process OpsRequest: mogdb-cluster-custom-lxkqv in cluster: mogdb-cluster, more detailed informations in status.components

➜  ~ k logs 3ae6b12c-mogdb-cluster-cust-mogdb-switchover-0-qhsfv
Defaulted container "switchover" out of: switchover, ops-utils (init)
INFO: doing switchover..
INFO: candidate: mogdb-cluster-mogdb-1
+ echo 'INFO: doing switchover..'
+ echo 'INFO: candidate: mogdb-cluster-mogdb-1'
+ kubectl exec -it mogdb-cluster-mogdb-1 -c mogdb -- gosu omm gs_ctl switchover
Unable to use a TTY - input is not a terminal or the right kind of file
[2024-04-26 06:11:53.862][22476][][gs_ctl]: gs_ctl switchover ,datadir is /var/lib/mogdb/data
[2024-04-26 06:11:53.862][22476][][gs_ctl]: switchover term (1)
[2024-04-26 06:11:53.872][22476][][gs_ctl]: waiting for server to switchover...............................................................
[2024-04-26 06:12:54.498][22476][][gs_ctl]:
 switchover timeout after 60 seconds. please manually check the cluster status.
INFO: start to check if switchover successfully, timeout is 60s
+ echo 'INFO: start to check if switchover successfully, timeout is 60s'
+ date '+%s'
+ executedUnix=1714111974
+ true
+ sleep 5
+ '[' '!' -z mogdb-cluster-mogdb-1 ]
+ kubectl get pod mogdb-cluster-mogdb-1 -ojson+
jq -r '.metadata.labels["kubeblocks.io/role"]'
+ role=secondary
+ '[' secondary '==' Primary ]
+ '[' secondary '==' primary ]
+ '[' secondary '==' leader ]
+ '[' secondary '==' master ]
+ date '+%s'
+ currentUnix=1714111979
+ diff_time=5
+ '[' 5 -ge 60 ]
+ true
+ sleep 5
+ '[' '!' -z mogdb-cluster-mogdb-1 ]
+ kubectl get pod mogdb-cluster-mogdb-1 -ojson+ jq -r '.metadata.labels["kubeblocks.io/role"]'

+ role=secondary
+ '[' secondary '==' Primary ]
+ '[' secondary '==' primary ]
+ '[' secondary '==' leader ]
+ '[' secondary '==' master ]
+ date '+%s'
+ currentUnix=1714111984
+ diff_time=10
+ '[' 10 -ge 60 ]
+ true
+ sleep 5
+ '[' '!' -z mogdb-cluster-mogdb-1 ]
+ kubectl get pod mogdb-cluster-mogdb-1 -ojson
+ jq -r '.metadata.labels["kubeblocks.io/role"]'
+ role=secondary
+ '[' secondary '==' Primary ]
+ '[' secondary '==' primary ]
+ '[' secondary '==' leader ]
+ '[' secondary '==' master ]
+ date '+%s'
+ currentUnix=1714111989
+ diff_time=15
+ '[' 15 -ge 60 ]
+ true
+ sleep 5
+ '[' '!' -z mogdb-cluster-mogdb-1 ]
+ kubectl get pod mogdb-cluster-mogdb-1 -ojson
+ jq -r '.metadata.labels["kubeblocks.io/role"]'
+ role=secondary
+ '[' secondary '==' Primary ]
+ '[' secondary '==' primary ]
+ '[' secondary '==' leader ]
+ '[' secondary '==' master ]
+ date '+%s'
+ currentUnix=1714111995
+ diff_time=21
+ '[' 21 -ge 60 ]
+ true
+ sleep 5
+ '[' '!' -z mogdb-cluster-mogdb-1 ]
+ kubectl get pod mogdb-cluster-mogdb-1 -ojson
+ jq -r '.metadata.labels["kubeblocks.io/role"]'
+ role=secondary
+ '[' secondary '==' Primary ]
+ '[' secondary '==' primary ]
+ '[' secondary '==' leader ]
+ '[' secondary '==' master ]
+ date '+%s'
+ currentUnix=1714112000
+ diff_time=26
+ '[' 26 -ge 60 ]
+ true
+ sleep 5
+ '[' '!' -z mogdb-cluster-mogdb-1 ]
+ kubectl get pod mogdb-cluster-mogdb-1 -ojson
+ jq -r '.metadata.labels["kubeblocks.io/role"]'
+ role=secondary
+ '[' secondary '==' Primary ]
+ '[' secondary '==' primary ]
+ '[' secondary '==' leader ]
+ '[' secondary '==' master ]
+ date '+%s'
+ currentUnix=1714112005
+ diff_time=31
+ '[' 31 -ge 60 ]
+ true
+ sleep 5
+ '[' '!' -z mogdb-cluster-mogdb-1 ]
+ kubectl get pod mogdb-cluster-mogdb-1 -ojson
+ jq -r '.metadata.labels["kubeblocks.io/role"]'
+ role=secondary
+ '[' secondary '==' Primary ]
+ '[' secondary '==' primary ]
+ '[' secondary '==' leader ]
+ '[' secondary '==' master ]
+ date '+%s'
+ currentUnix=1714112010
+ diff_time=36
+ '[' 36 -ge 60 ]
+ true
+ sleep 5
+ '[' '!' -z mogdb-cluster-mogdb-1 ]
+ kubectl get pod mogdb-cluster-mogdb-1 -ojson
+ jq -r '.metadata.labels["kubeblocks.io/role"]'
+ role=secondary
+ '[' secondary '==' Primary ]
+ '[' secondary '==' primary ]
+ '[' secondary '==' leader ]
+ '[' secondary '==' master ]
+ date '+%s'
+ currentUnix=1714112015
+ diff_time=41
+ '[' 41 -ge 60 ]
+ true
+ sleep 5
+ '[' '!' -z mogdb-cluster-mogdb-1 ]
+ kubectl get pod mogdb-cluster-mogdb-1 -ojson
+ jq -r '.metadata.labels["kubeblocks.io/role"]'
+ role=secondary
+ '[' secondary '==' Primary ]
+ '[' secondary '==' primary ]
+ '[' secondary '==' leader ]
+ '[' secondary '==' master ]
+ date '+%s'
+ currentUnix=1714112020
+ diff_time=46
+ '[' 46 -ge 60 ]
+ true
+ sleep 5
+ '[' '!' -z mogdb-cluster-mogdb-1 ]
+ kubectl get pod mogdb-cluster-mogdb-1 -ojson
+ jq -r '.metadata.labels["kubeblocks.io/role"]'
+ role=secondary
+ '[' secondary '==' Primary ]
+ '[' secondary '==' primary ]
+ '[' secondary '==' leader ]
+ '[' secondary '==' master ]
+ date '+%s'
+ currentUnix=1714112025
+ diff_time=51
+ '[' 51 -ge 60 ]
+ true
+ sleep 5
+ '[' '!' -z mogdb-cluster-mogdb-1 ]
+ kubectl get pod mogdb-cluster-mogdb-1 -ojson
+ jq -r '.metadata.labels["kubeblocks.io/role"]'
+ role=secondary
+ '[' secondary '==' Primary ]
+ '[' secondary '==' primary ]
+ '[' secondary '==' leader ]
+ '[' secondary '==' master ]
+ date '+%s'
+ currentUnix=1714112030
+ diff_time=56
+ '[' 56 -ge 60 ]
+ true
+ sleep 5
+ '[' '!' -z mogdb-cluster-mogdb-1 ]
+ kubectl get pod mogdb-cluster-mogdb-1 -ojson
+ jq -r '.metadata.labels["kubeblocks.io/role"]'
+ role=secondary
+ '[' secondary '==' Primary ]
+ '[' secondary '==' primary ]
+ '[' secondary '==' leader ]
+ '[' secondary '==' master ]
+ date '+%s'
+ currentUnix=1714112035
+ diff_time=61
+ '[' 61 -ge 60 ]
+ echo 'ERROR: switchover failed.'
+ exit 1
ERROR: switchover failed.
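
The check loop in the trace above can be sketched as follows. This is a reconstruction from the xtrace output, not the addon's actual script, which may differ. `get_role` and `wait_for_primary` are hypothetical names; `get_role` is split out so the loop can be exercised without a cluster:

```shell
# Read the role label of a pod, as the trace does with kubectl and jq.
# get_role is a hypothetical helper; override it to test without a cluster.
get_role() {
  kubectl get pod "$1" -o json | jq -r '.metadata.labels["kubeblocks.io/role"]'
}

# Poll until the candidate reports a primary-like role or the timeout elapses.
# Arguments: candidate pod, timeout in seconds (default 60),
# poll interval in seconds (default 5).
wait_for_primary() {
  candidate="$1"
  timeout="${2:-60}"
  interval="${3:-5}"
  start=$(date '+%s')
  while true; do
    sleep "$interval"
    role=$(get_role "$candidate")
    # The trace compares against these four spellings in turn.
    case "$role" in
      Primary|primary|leader|master) return 0 ;;
    esac
    now=$(date '+%s')
    if [ $((now - start)) -ge "$timeout" ]; then
      echo 'ERROR: switchover failed.'
      return 1
    fi
  done
}
```

In the run above the label stayed `secondary` for the whole 60-second window, so the equivalent of `wait_for_primary mogdb-cluster-mogdb-1 60 5` returned 1. The loop only observes the label; the `gs_ctl switchover` call had already timed out before it started.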

➜ ~ kbcli report cluster --with-logs --all-containers mogdb-cluster
reporting cluster information to report-cluster-mogdb-cluster-2024-04-26-14-15-47.zip
processing manifests OK
processing events OK
process pod logs

➜ ~ kbcli report kubeblocks --with-logs --all-containers --output yaml
reporting KubeBlocks information to report-kubeblocks-2024-04-26-14-16-17.zip
processing manifests OK
processing events OK
process pod logs OK
report-kubeblocks-2024-04-26-14-16-17.zip
report-cluster-mogdb-cluster-2024-04-26-14-15-47.zip

ahjing99 added the kind/bug (Something isn't working) and severity/major (Great chance user will encounter the same problem) labels Apr 26, 2024
ahjing99 added this to the Release 0.9.0 milestone Apr 26, 2024
@JashBook
Collaborator

Duplicate of apecloud/kubeblocks-addons#394

ahjing99 removed the severity/major (Great chance user will encounter the same problem) label Apr 26, 2024

This issue has been marked as stale because it has been open for 30 days with no activity.

github-actions bot added the Stale label May 27, 2024