Rolling update for shinyproxy deployment causes orphan pods #169
Hi @ramkumarg1 When ShinyProxy receives a SIGTERM signal (when the deployment is scaled down), it should gracefully terminate by stopping all application pods first. You may have to increase the grace period.
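A minimal sketch of where that grace period goes in the deployment spec, in case it helps. The 60-second value and the image tag are illustrative assumptions, not recommendations from this thread:

```yaml
# Hypothetical excerpt of the shinyproxy Deployment; the only
# relevant change is terminationGracePeriodSeconds.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: shinyproxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: shinyproxy
  template:
    metadata:
      labels:
        app: shinyproxy
    spec:
      # Give ShinyProxy time to stop its application pods before the
      # kubelet sends SIGKILL (the Kubernetes default is 30 seconds).
      terminationGracePeriodSeconds: 60
      containers:
        - name: shinyproxy
          image: openanalytics/shinyproxy:2.3.1
```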
Thanks @dseynaev I changed the deployment spec to include terminationGracePeriodSeconds, but it didn't make a difference; the pod was killed immediately. Perhaps this issue is linked to kubernetes/kubernetes#47576, where Spring Boot needs to handle SIGTERM gracefully?
We observe the same issue with zombie pods, and for us the termination grace period setting also does not resolve this.
I have the same issue and this is what is logged by shiny/containerproxy upon termination:
I found a solution for this issue. This is not actually a problem in ShinyProxy or ContainerProxy, as the Spring Boot app is correctly and gracefully shut down. The problem is that the sidecar container is terminated at the same time, so ShinyProxy loses its connection to the Kubernetes API before it can clean up the application pods. I read that Kubernetes is about to solve these startup and shutdown dependencies in v1.18. Until then there is a simple workaround: put the following lifecycle annotation on the sidecar container:
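The original snippet did not survive here; a minimal sketch of such a preStop hook, assuming a kube-proxy sidecar whose image ships a `sleep` binary (container name and image are placeholders):

```yaml
# Delay SIGTERM to the sidecar so ShinyProxy can still reach the
# Kubernetes API while it shuts down its application pods.
containers:
  - name: kube-proxy-sidecar   # example name
    image: my-kube-proxy:latest # placeholder image
    lifecycle:
      preStop:
        exec:
          # Sleep a few seconds before the sidecar receives SIGTERM;
          # requires a `sleep` binary inside the sidecar image.
          command: ["sleep", "5"]
```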
Allow graceful shutdown by delaying the SIGTERM to the sidecar container by some time, for example 5s. This solves the issue described in openanalytics/shinyproxy#169.
I can confirm @fmannhardt's fix resolves this. Thank you so much!
Hi all With recent versions of ShinyProxy (I'm not sure which version exactly, but at least ShinyProxy 2.3.1) there is no need to use a kube-proxy sidecar: ShinyProxy automatically detects the location and authentication of the Kubernetes API.
Hi, when there is a change in application.yaml and a rolling update is performed (with replicas set to 0 and then back to 1, mainly because the new ShinyProxy image needs to be downloaded from the artifactory), all the pods that were spun up by the previous ShinyProxy instance get left behind as zombies.
To reproduce:
Initial state, with ShinyProxy running and no application pods:

```
NAME                          READY  STATUS   RESTARTS  AGE
pod/shinyproxy-7f76d48c79-8x9hs  2/2  Running  0         41m

NAME                TYPE      CLUSTER-IP     EXTERNAL-IP  PORT(S)         AGE
service/shinyproxy  NodePort  172.30.85.191  <none>       8080:32094/TCP  40m

NAME                        DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
deployment.apps/shinyproxy  1        1        1           1          41m

NAME                                   DESIRED  CURRENT  READY  AGE
replicaset.apps/shinyproxy-7f76d48c79  1        1        1      41m

NAME                                 HOST/PORT                               PATH  SERVICES    PORT  TERMINATION  WILDCARD
route.route.openshift.io/shinyproxy  shinyproxy-aap.apps.cpaas.service.test        shinyproxy        None
```
After launching an application, a ShinyProxy application pod (`sp-pod-...`) appears:

```
NAME                                         READY  STATUS   RESTARTS  AGE
pod/shinyproxy-7f76d48c79-8x9hs              2/2    Running  0         43m
pod/sp-pod-e7603441-03ba-470b-925a-22cfba1716de  1/1  Running  0       12s

NAME                TYPE      CLUSTER-IP     EXTERNAL-IP  PORT(S)         AGE
service/shinyproxy  NodePort  172.30.85.191  <none>       8080:32094/TCP  43m

NAME                        DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
deployment.apps/shinyproxy  1        1        1           1          43m

NAME                                   DESIRED  CURRENT  READY  AGE
replicaset.apps/shinyproxy-7f76d48c79  1        1        1      43m

NAME                                 HOST/PORT                               PATH  SERVICES    PORT  TERMINATION  WILDCARD
route.route.openshift.io/shinyproxy  shinyproxy-aap.apps.cpaas.service.test        shinyproxy        None
```
```
kubectl scale --replicas=0 deployment/shinyproxy
deployment.extensions/shinyproxy scaled
kubectl scale --replicas=1 deployment/shinyproxy
deployment.extensions/shinyproxy scaled
```
The old application pod is left behind while the new ShinyProxy pod starts:

```
NAME                                         READY  STATUS             RESTARTS  AGE
pod/shinyproxy-7f76d48c79-l5fvw              0/2    ContainerCreating  0         4s
pod/sp-pod-e7603441-03ba-470b-925a-22cfba1716de  1/1  Running          0         1m

NAME                TYPE      CLUSTER-IP     EXTERNAL-IP  PORT(S)         AGE
service/shinyproxy  NodePort  172.30.85.191  <none>       8080:32094/TCP  44m

NAME                        DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
deployment.apps/shinyproxy  1        1        1           0          45m

NAME                                   DESIRED  CURRENT  READY  AGE
replicaset.apps/shinyproxy-7f76d48c79  1        1        0      45m

NAME                                 HOST/PORT                               PATH  SERVICES    PORT  TERMINATION  WILDCARD
route.route.openshift.io/shinyproxy  shinyproxy-aap.apps.cpaas.service.test        shinyproxy        None
```
At this stage my web application is unresponsive; the only thing to do is close the tab/window. The pod for the R application continues to run unless it is manually deleted.
The pod, which keeps consuming resources, is not accessible, because the service now points to the updated deployment, and the application can only be accessed through a route over the service.
It is also very difficult to identify which pods are the stale ones and delete them manually.