
Hub and Proxy running but getting 502 Bad gateway #32

Open
ClementGautier opened this issue Dec 15, 2021 · 4 comments
Labels: bug (Something isn't working as expected)

ClementGautier (Contributor) commented Dec 15, 2021:

Describe the bug:

It looks like, sometimes, the proxy loses its connection with the hub and we need to kill the proxy pod to force its recreation.

Expected behaviour:

I expect the application to remain reachable even after a restart of the hub.

Steps to reproduce the issue:

1. helm install
2. kubectl delete pod hub-***
3. You should see 502 Bad Gateway even after the hub pod is shown as running (see the sketch below).

EDIT: rebooting the node seems to be the only way to reproduce the issue consistently.
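For reference, a rough shell sketch of the reproduction above; the release name, chart path, namespace, and hostname are placeholders, not values from this report:

```bash
# Placeholder release/chart/namespace/hostname; adjust to your own deployment.
helm install mlhub ./charts/mlhub --namespace mlhub --create-namespace

# Force a hub restart by deleting its pod (pod name suffix is a placeholder).
kubectl -n mlhub get pods
kubectl -n mlhub delete pod hub-<pod-suffix>

# Once the hub pod is back in Running state, requests through the ingress
# reportedly still return 502 Bad Gateway.
kubectl -n mlhub get pods -w
curl -i https://mlhub.example.com/
```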

Possible Fix:

A quick fix might be to put liveness probes on the proxy pods to ensure the connection still exists, but there might be a better fix.
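A minimal sketch of what such probes could look like on the proxy container; the port and timings here are assumptions, not values taken from this chart:

```yaml
# Hypothetical probes for the proxy container (port and timings are assumptions).
# A tcpSocket probe only verifies the proxy is listening; a deeper check of the
# proxy-to-hub connection would need an HTTP endpoint that exercises that path.
containers:
  - name: proxy
    livenessProbe:
      tcpSocket:
        port: 8000
      initialDelaySeconds: 10
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:
      tcpSocket:
        port: 8000
      periodSeconds: 5
```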

@ClementGautier ClementGautier added the bug Something isn't working as expected label Dec 15, 2021
ClementGautier (Contributor, Author) commented:
So, after spending more time on that issue, I can tell you it's not that easy to reproduce.
I haven't been able to reproduce it yet, and I think it's related to NetworkPolicies; the issue seems very similar to jupyterhub/zero-to-jupyterhub-k8s#1863.
I will try to put together a reproducible case.
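If NetworkPolicies were the cause, the missing piece would be a rule letting the ingress controller reach the proxy pods. A sketch, where every label and the namespace name are assumptions:

```yaml
# Hypothetical NetworkPolicy allowing traffic from the ingress-nginx namespace
# to the proxy pods; pod and namespace labels here are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-proxy
spec:
  podSelector:
    matchLabels:
      component: proxy
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
```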

ClementGautier (Contributor, Author) commented Dec 22, 2021:

I was able to confirm that the issue is between the ingress and the service, most likely a firewall issue.
I'm getting connection resets between the ingress controller and the service: 10.56.2.25 is the proxy-public service endpoint, while .16 is the nginx ingress controller.

[screenshot: packet capture showing TCP connection resets between 10.56.2.25 and .16]
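A few commands that could help pin down where the resets come from (the namespace and label selector below are assumptions; proxy-public is the service named above):

```bash
# Namespace and label selector are assumptions, not taken from this chart.

# Which port/targetPort does the proxy-public service use?
kubectl -n mlhub get svc proxy-public -o jsonpath='{.spec.ports}'

# Is the service backed by any endpoints at all?
kubectl -n mlhub get endpoints proxy-public

# Which containerPorts does the proxy pod declare?
kubectl -n mlhub get pods -l component=proxy \
  -o jsonpath='{.items[*].spec.containers[*].ports}'
```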

ClementGautier (Contributor, Author) commented:
I think I found the issue: the proxy-public service points to port 8080 of the proxy pod, but the pod doesn't listen on that port; it listens on port 8000 instead... Editing the service to use port 8000 fixed the issue. After more digging in the values, it seems I needed to use proxy.https.type: offload in combination with mlhub.env.SSL_ENABLED: true. This configures the service "properly", but then the ingress doesn't work at all, as the target port is hardcoded to servicePort: 80. So I disabled the ingress templating and created the ingress manually as a temporary workaround.
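For reference, the values combination described above would look roughly like this (whether these are the exact keys for this chart version should be double-checked):

```yaml
# Sketch of a values.yaml fragment matching the description above.
proxy:
  https:
    type: offload
mlhub:
  env:
    SSL_ENABLED: true
```

And the temporary workaround of repointing the service at the port the proxy actually listens on could be applied with a patch along these lines (service name from above; the namespace and port index are assumptions):

```bash
# Assumes the proxy-public service has a single port entry; adjust the index otherwise.
kubectl -n mlhub patch svc proxy-public --type=json \
  -p '[{"op":"replace","path":"/spec/ports/0/targetPort","value":8000}]'
```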

I still don't understand how restarting the proxy pod made it work all of a sudden. I think it has to do with the environment variables being set and the behavior of the proxy itself, but I guess if you launch the container with the --port 8000 option, you should probably use that port anyway.

I'll make a pull request in that direction soon.

ClementGautier (Contributor, Author) commented:
The issue is already fixed in the jupyterhub chart, so instead of doing things twice I'd rather use it as a dependency for this chart, as discussed in #25.
