brig error on deploy wire-server #261

Open
maaaaaaav opened this issue May 12, 2020 · 6 comments

maaaaaaav commented May 12, 2020

Hi there,

thanks again for all the help and assistance.

Currently trying to deploy wire-server using helm; everything is working fine except that the brig pods keep ending up in CrashLoopBackOff.

When I pull the logs, this is all I get:

wireadmin@wire-controller:~/wire-server-deploy/ansible$ kubectl logs brig-8674744bc7-ccbtf
{"logger":"cassandra.brig","msgs":["I","Known hosts: [datacenter1:rack1:172.16.32.31:9042,datacenter1:rack1:172.16.32.32:9042,datacenter1:rack1:172.16.32.33:9042]"]}
{"logger":"cassandra.brig","msgs":["I","New control connection: datacenter1:rack1:172.16.32.33:9042#<socket: 11>"]}
NAME                                  READY   STATUS             RESTARTS   AGE
brig-8674744bc7-ccbtf                 0/1     CrashLoopBackOff   6          7m58s
brig-8674744bc7-jlpgn                 0/1     CrashLoopBackOff   7          7m58s
brig-8674744bc7-mbh5m                 0/1     CrashLoopBackOff   7          7m58s
cannon-0                              1/1     Running            0          7m58s
cannon-1                              1/1     Running            0          7m58s
cannon-2                              1/1     Running            0          7m58s
cargohold-d474c7847-mpj7w             1/1     Running            0          7m58s
cargohold-d474c7847-phms7             1/1     Running            0          7m58s
cargohold-d474c7847-r4j8b             1/1     Running            0          7m58s
cassandra-migrations-g667z            0/1     Completed          0          8m7s
demo-smtp-84b7b85ff6-k2djh            1/1     Running            0          9h
elasticsearch-index-create-xnzwm      0/1     Completed          0          8m1s
fake-aws-dynamodb-84f87cd86b-dsz2v    2/2     Running            0          9h
fake-aws-s3-5468cdf989-fccm9          1/1     Running            0          9h
fake-aws-s3-reaper-7c6d9cddd6-ff8fn   1/1     Running            0          9h
fake-aws-sns-5c56774d95-dwcsw         2/2     Running            0          9h
fake-aws-sqs-554bbc684d-cqxzl         2/2     Running            0          9h
galley-87df7b65f-kp588                1/1     Running            0          7m58s
galley-87df7b65f-t7wtd                1/1     Running            0          7m58s
galley-87df7b65f-vhzpg                1/1     Running            0          7m58s
gundeck-f9bf469f9-b9rxt               1/1     Running            0          7m58s
gundeck-f9bf469f9-clff6               1/1     Running            0          7m58s
gundeck-f9bf469f9-gx8d4               1/1     Running            0          7m57s
nginz-77f7ff6f5d-5m94p                2/2     Running            1          7m58s
nginz-77f7ff6f5d-h7w5n                2/2     Running            1          7m58s
nginz-77f7ff6f5d-pwbzl                2/2     Running            1          7m58s
redis-ephemeral-69bb4885bb-qbmdw      1/1     Running            0          8h
spar-59fd5db594-gbsbz                 1/1     Running            0          7m58s
spar-59fd5db594-jclmh                 1/1     Running            0          7m58s
spar-59fd5db594-zvbl6                 1/1     Running            0          7m58s
webapp-6cb84759d9-wfhc9               1/1     Running            0          7m58s
wireadmin@wire-controller:~/wire-server-deploy/ansible$

Those are the correct IPs for my three Cassandra nodes, and they seem to be up fine. I'm using cassandra-external to point brig at them.
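(In case it's useful, a quick way to sanity-check that the nodes are reachable on 9042 from inside the cluster is a throwaway cqlsh pod; the cassandra image tag here is just an example:)

kubectl run cqlsh-check --rm -it --restart=Never --image=cassandra:3.11 -- cqlsh 172.16.32.31 9042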

Any guidance on what I should upload to help debug this would be much appreciated too.

Thanks!

@akshaymankar
Member

Hello @maaaaaaav, sometimes when a pod is in a crash loop, the logs you see are from just before it crashed. The logs you've added don't look like a failure. Can you please check the logs a couple more times and see if there is anything new?
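For example, with plain kubectl you can pull the previous container's logs and follow the current one until it crashes again:

kubectl logs --previous brig-8674744bc7-ccbtf    # logs from the previously crashed container
kubectl logs -f brig-8674744bc7-ccbtf            # follow the current container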

@ramesh8830

I had the same issue, especially when trying to configure our own SMTP server using #266. Below are the warnings and failure messages from kubectl describe pod (the exact command is sketched after the events).

Normal Scheduled 4m22s default-scheduler Successfully assigned production/brig-69969b5bdc-ndn8b to kubenode02
Warning Unhealthy 3m29s (x5 over 4m9s) kubelet, kubenode02 Readiness probe failed: Get http://10.233.65.172:8080/i/status: dial tcp 10.233.65.172:8080: connect: connection refused
Normal Pulling 3m23s (x3 over 4m20s) kubelet, kubenode02 Pulling image "quay.io/wire/brig:latest"
Warning Unhealthy 3m23s (x6 over 4m13s) kubelet, kubenode02 Liveness probe failed: Get http://10.233.65.172:8080/i/status: dial tcp 10.233.65.172:8080: connect: connection refused
Normal Killing 3m23s (x2 over 3m53s) kubelet, kubenode02 Container brig failed liveness probe, will be restarted
Normal Pulled 3m22s (x3 over 4m16s) kubelet, kubenode02 Successfully pulled image "quay.io/wire/brig:latest"
Normal Created 3m22s (x3 over 4m16s) kubelet, kubenode02 Created container brig
Normal Started 3m22s (x3 over 4m15s) kubelet, kubenode02 Started container brig
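
(For reference, the events above come from something like the following, using the pod name and namespace shown in the first event line:)

kubectl describe pod brig-69969b5bdc-ndn8b -n production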

@akshaymankar
Member

@ramesh8830 Do you also see nothing interesting in kubectl logs for the brig pods?

ramesh8830 commented May 20, 2020

Hi @akshaymankar

There is nothing in the kubectl logs for the brig pods. The brig pods never become Ready and eventually fall into CrashLoopBackOff.

I am also getting the same log @maaaaaaav reported in his post, showing all the Cassandra nodes.

wireadmin@wire-controller:~/wire-server-deploy/ansible$ kubectl logs brig-8674744bc7-ccbtf
{"logger":"cassandra.brig","msgs":["I","Known hosts: [datacenter1:rack1:172.16.32.31:9042,datacenter1:rack1:172.16.32.32:9042,datacenter1:rack1:172.16.32.33:9042]"]}
{"logger":"cassandra.brig","msgs":["I","New control connection: datacenter1:rack1:172.16.32.33:9042#<socket: 11>"]}

@akshaymankar
Member

 Warning Unhealthy 3m23s (x6 over 4m13s) kubelet, kubenode02 Liveness probe failed: Get http://10.233.65.172:8080/i/status: dial tcp 10.233.65.172:8080: connect: connection refused

This indicates that brig is taking some time to come up and K8s is not patient enough for that. Usually brig prints a line like this when it starts listening on the port:

I, Listening on 0.0.0.0:8080

I would make sure the pod is getting enough CPU/RAM. If that is the case, I would bump the brig log level to Debug or even Trace and see if anything shows up in the logs. Hope this helps!
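
A rough sketch of those checks (the brig.config.logLevel value path and the wire-server release/chart names are assumptions, verify them against your values.yaml):

kubectl top pod | grep brig    # actual CPU/RAM usage, needs metrics-server
helm upgrade wire-server wire/wire-server -f values.yaml --set brig.config.logLevel=Debug
kubectl logs -f deployment/brig | grep -i listening    # wait for the "Listening on" line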

@ramesh8830

It has enough CPU/RAM. The crashes happen only when we configure SMTP with a username and password other than the demo credentials. If I use the demo credentials for SMTP, the brig pods run successfully.
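
For reference, the kind of override involved looks roughly like this; the brig.config.smtp.* and brig.secrets.smtpPassword value paths are assumptions based on the brig chart, and the host, username, and password are placeholders:

helm upgrade wire-server wire/wire-server -f values.yaml \
  --set brig.config.smtp.host=smtp.example.com \
  --set brig.config.smtp.port=587 \
  --set brig.config.smtp.username=wire@example.com \
  --set brig.secrets.smtpPassword=changeme    # assumed value paths, check the chart's values.yaml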
