noobaa install stuck at "System Phase is "Connecting". Waiting for phase ready ..." #1128

Open
arttor opened this issue May 11, 2023 · 1 comment

Comments


arttor commented May 11, 2023

Environment info

  • NooBaa Operator Version: 5.11.0
  • Platform: K8s v1.22.3 on minikube 1.24.0 5CPU 8GB

Actual behavior

  1. The operator is stuck in the "Connecting" phase: System Phase is "Connecting". Waiting for phase ready ...

Expected behavior

  1. System Phase is "Ready".

Steps to reproduce

  1. noobaa install --mini --disable-load-balancer
  2. Also tried noobaa install --mini, with the same result.

More information - Screenshots / Logs / Other output

It looks like the core service is not serving its endpoints:

  • The operator cannot reach the core API.
  • The admin web UI cannot be opened.
  • The core pod cannot be reached with curl from inside the K8s cluster.
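
One way to narrow down the symptoms above is to test plain TCP reachability of the noobaa-mgmt service from a pod inside the cluster. The following is a minimal Python sketch and not part of NooBaa; `probe_tcp` is a hypothetical helper name, and the host and ports to pass in are the ones shown in the `kubectl get all` output and operator logs in this report.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify one TCP connect attempt.

    Returns "open", "refused", "timeout", or "error: <details>".
    This only checks that something accepts the connection; it does
    not speak TLS or the NooBaa RPC protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer (or a proxy rule in front of it) actively rejected the connection.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar.
        return f"error: {exc}"
```

Calling `probe_tcp("noobaa-mgmt.noobaa.svc.cluster.local", 443)` from a pod in the cluster would be expected to return "refused" in the situation described here. A "refused" result typically means that nothing is listening on the target port or that the service has no ready endpoints, while "timeout" points more toward network filtering.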
noobaa status
INFO[0000] CLI version: 5.11.0                          
INFO[0000] noobaa-image: noobaa/noobaa-core:5.11.0      
INFO[0000] operator-image: noobaa/noobaa-operator:5.11.0 
INFO[0000] noobaa-db-image: centos/postgresql-12-centos7 
INFO[0000] Namespace: noobaa                            
INFO[0000]                                              
INFO[0000] CRD Status:                                  
INFO[0000] ✅ Exists: CustomResourceDefinition "noobaas.noobaa.io" 
INFO[0000] ✅ Exists: CustomResourceDefinition "backingstores.noobaa.io" 
INFO[0000] ✅ Exists: CustomResourceDefinition "namespacestores.noobaa.io" 
INFO[0000] ✅ Exists: CustomResourceDefinition "bucketclasses.noobaa.io" 
INFO[0000] ✅ Exists: CustomResourceDefinition "noobaaaccounts.noobaa.io" 
INFO[0000] ✅ Exists: CustomResourceDefinition "objectbucketclaims.objectbucket.io" 
INFO[0000] ✅ Exists: CustomResourceDefinition "objectbuckets.objectbucket.io" 
INFO[0000]                                              
INFO[0000] Operator Status:                             
INFO[0000] ✅ Exists: Namespace "noobaa"                 
INFO[0000] ✅ Exists: ServiceAccount "noobaa"            
INFO[0000] ✅ Exists: ServiceAccount "noobaa-endpoint"   
INFO[0000] ✅ Exists: Role "noobaa"                      
INFO[0000] ✅ Exists: Role "noobaa-endpoint"             
INFO[0000] ✅ Exists: RoleBinding "noobaa"               
INFO[0000] ✅ Exists: RoleBinding "noobaa-endpoint"      
INFO[0000] ✅ Exists: ClusterRole "noobaa.noobaa.io"     
INFO[0000] ✅ Exists: ClusterRoleBinding "noobaa.noobaa.io" 
INFO[0000] ⬛ (Optional) Not Found: ValidatingWebhookConfiguration "admission-validation-webhook" 
INFO[0000] ⬛ (Optional) Not Found: Secret "admission-webhook-secret" 
INFO[0000] ⬛ (Optional) Not Found: Service "admission-webhook-service" 
INFO[0000] ✅ Exists: Deployment "noobaa-operator"       
INFO[0000]                                              
INFO[0000] System Wait Ready:                           
INFO[0000] ⏳ System Phase is "Connecting". Waiting for phase ready ... 
INFO[0003] ⏳ System Phase is "Connecting". Waiting for phase ready ... 
INFO[0006] ⏳ System Phase is "Connecting". Waiting for phase ready ... 
kubectl get all
NAME                                   READY   STATUS    RESTARTS      AGE
pod/noobaa-core-0                      1/1     Running   1 (69s ago)   45h
pod/noobaa-db-pg-0                     1/1     Running   1 (24h ago)   45h
pod/noobaa-operator-7b78d6c98f-g8d7p   1/1     Running   2 (69s ago)   45h

NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                            AGE
service/noobaa-db-pg   ClusterIP   10.108.147.252   <none>        5432/TCP                           45h
service/noobaa-mgmt    ClusterIP   10.98.147.5      <none>        80/TCP,443/TCP,8445/TCP,8446/TCP   45h
service/s3             ClusterIP   10.106.29.203    <none>        80/TCP,443/TCP,8444/TCP,7004/TCP   45h
service/sts            ClusterIP   10.96.69.251     <none>        443/TCP                            45h

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/noobaa-operator   1/1     1            1           45h

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/noobaa-operator-7b78d6c98f   1         1         1       45h

NAME                            READY   AGE
statefulset.apps/noobaa-core    1/1     45h
statefulset.apps/noobaa-db-pg   1/1     45h
Noobaa operator error logs
time="2023-05-11T06:41:13Z" level=info msg="SetPhase: temporary error during phase \"Connecting\"" sys=noobaa/noobaa
time="2023-05-11T06:41:13Z" level=error msg="Failed to append \"/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt\" to RootCAs: open /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt: no such file or directory"
time="2023-05-11T06:41:16Z" level=error msg="RPC: closing connection (0xc001251b00) &{RPC:0xc00046bd10 Address:wss://noobaa-mgmt.noobaa.svc.cluster.local:443/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:1 sema:0} ReconnectDelay:3s cancelPings:<nil>}"
time="2023-05-11T06:41:16Z" level=error msg="RPC: Reconnect - got error: failed to websocket dial: failed to send handshake request: Get \"https://noobaa-mgmt.noobaa.svc.cluster.local:443/rpc/\": dial tcp 10.98.147.5:443: connect: connection refused"
time="2023-05-11T06:41:19Z" level=error msg="RPC: closing connection (0xc000f7cd80) &{RPC:0xc00046bd10 Address:wss://noobaa-mgmt.noobaa.svc.cluster.local:443/rpc/ State:init WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:9 sema:0} ReconnectDelay:3s cancelPings:<nil>}"
time="2023-05-11T06:41:19Z" level=error msg="RPC: Reconnect - got error: failed to websocket dial: failed to send handshake request: Get \"https://noobaa-mgmt.noobaa.svc.cluster.local:443/rpc/\": dial tcp 10.98.147.5:443: connect: connection refused"
time="2023-05-11T06:41:19Z" level=error msg="⚠️  RPC: auth.read_auth() Call failed: RPC: connection (0xc000f7cd80) already closed &{RPC:0xc00046bd10 Address:wss://noobaa-mgmt.noobaa.svc.cluster.local:443/rpc/ State:closed WS:<nil> PendingRequests:map[] NextRequestID:0 Lock:{state:1 sema:0} ReconnectDelay:3s cancelPings:<nil>}"
time="2023-05-11T06:41:19Z" level=info msg="SetPhase: temporary error during phase \"Connecting\"" sys=noobaa/noobaa
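
To confirm that every failed reconnect in the operator log targets the same address, the "connection refused" lines can be tallied. This is a small Python sketch under the assumption that the log keeps the `dial tcp <ip>:<port>: connect: connection refused` wording seen above; `refused_targets` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the Go net error text seen in the operator log above and
# captures the "<ip>:<port>" that refused the connection.
DIAL_RE = re.compile(r"dial tcp (\S+?): connect: connection refused")


def refused_targets(log_text: str) -> Counter:
    """Count refused TCP dial attempts per target address in a log dump."""
    return Counter(DIAL_RE.findall(log_text))
```

The input could be the text saved from `kubectl logs deployment/noobaa-operator -n noobaa`. For the excerpt above, the only target is 10.98.147.5:443, which is the ClusterIP of service/noobaa-mgmt.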
Core logs
...
May-11 6:43:39.070 [Upgrade/50]    [L0] core.server.system_services.system_store:: system_store is running in standalone mode. skip _register_for_changes
May-11 6:43:39.765 [Upgrade/50]    [L0] UPGRADE:: system store loaded
May-11 6:43:40.168 [Upgrade/50]   [LOG] UPGRADE:: system does not exist. no need for an upgrade
May-11 6:43:40.169 [Upgrade/50]    [L0] UPGRADE:: upgrade completed successfully!
noobaa_init.sh finished
Starting ...
2023-05-11 06:44:00,168 INFO Included extra file "/root/node_modules/noobaa-core/src/deploy/NVA_build/noobaa_supervisor.conf" during parsing
2023-05-11 06:44:01,164 INFO RPC interface 'supervisor' initialized
2023-05-11 06:44:01,164 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2023-05-11 06:44:01,166 INFO supervisord started with pid 72
2023-05-11 06:44:02,174 INFO spawned: 'bg_workers' with pid 81
2023-05-11 06:44:02,178 INFO spawned: 'hosted_agents' with pid 83
2023-05-11 06:44:02,465 INFO spawned: 'logrotate' with pid 85
2023-05-11 06:44:02,567 INFO spawned: 'rsyslog' with pid 87
2023-05-11 06:44:02,765 INFO spawned: 'webserver' with pid 89
2023-05-11 06:44:03,965 INFO success: bg_workers entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-05-11 06:44:03,965 INFO success: hosted_agents entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-05-11 06:44:03,965 INFO success: logrotate entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-05-11 06:44:03,965 INFO success: rsyslog entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-05-11 06:44:03,966 INFO success: webserver entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
rsyslogd: pthread_getschedparam failed during startup - ignoring. Error was Operation not permitted
Thu May 11 06:44:09 UTC 2023: =================================== running logrotate ===================================
error: Logrotate UID is not in passwd file.
error: found error in file logrotate_noobaa.conf, skipping
Reading state from file: /var/lib/logrotate/logrotate.status
Allocating hash table for state file, size 64 entries

Handling 0 logs

loading .env file
loading .env file
OpenSSL 1.1.1l  24 Aug 2021 setting up
OpenSSL 1.1.1l  24 Aug 2021 setting up
init_rand_seed: starting ...
read_rand_seed: opening /dev/urandom ...
init_rand_seed: starting ...
OpenSSL 1.1.1l  24 Aug 2021 setting up
read_rand_seed: opening /dev/urandom ...
load_config_local: NO LOCAL CONFIG
init_rand_seed: starting ...
read_rand_seed: opening /dev/urandom ...
load_config_local: NO LOCAL CONFIG
load_config_local: NO LOCAL CONFIG
May-11 6:46:20.767 [WebServer/89]   [LOG] CONSOLE:: loading .env file...

Postgres looks healthy, and the core service applied all of its scripts on startup.
The K8s cluster has a single StorageClass:

kubectl get sc
NAME                 PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
standard (default)   k8s.io/minikube-hostpath   Delete          Immediate           false                  46h

arttor commented May 14, 2023

I've also tried noobaa install with different combinations of the --disable-load-balancer and --mini options and got the same result: the core API is not responding.
