Allow failure threshold for application health check #3435
Comments
Hi @philippthun
Readiness health checks are always executed. A failing readiness health check means that the app instance is not accessible (i.e. no traffic is routed to it). This can happen at any time during the lifecycle. However, the app process is not restarted, which is the difference from the other health checks.
Hi @philippthun, my original requirement is that the CF runtime give an app instance a chance to recover from a failing health check instead of restarting it. Kubernetes, for example, provides failureThreshold for this situation.
Issue
We observed many application crashes caused by health check timeouts (the health check uses an HTTP request), even though all other HTTP requests were working right before the crashes and all other metrics looked good.
Our health check endpoint is fast and contains no other logic. We also increased the timeout to 20 seconds, but that did not help much.
Expected result
We are not sure why some health checks time out; CPU throttling might be the cause.
Regardless, we expect the runtime to give the application a chance to pass a subsequent health check.
Current result
The application instance is restarted after a single health check failure.
Possible Fix
Add a failure threshold for health checks: only after the health check fails failureThreshold times in a row does the runtime consider the overall check failed and the container not healthy/live.
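To illustrate the proposed semantics, here is a minimal Python sketch of a consecutive-failure counter. It is not Cloud Foundry code; `FailureThresholdTracker`, `restart_decisions`, and the default threshold of 3 are hypothetical names and values chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class FailureThresholdTracker:
    """Counts consecutive health check failures (hypothetical helper)."""

    failure_threshold: int = 3  # assumed default, for illustration only
    consecutive_failures: int = 0

    def record(self, check_passed: bool) -> bool:
        """Record one check result; return True if the instance should be restarted."""
        if check_passed:
            # Any successful check resets the streak.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures >= self.failure_threshold


def restart_decisions(results, failure_threshold=3):
    """Return the restart decision after each check result in `results`."""
    tracker = FailureThresholdTracker(failure_threshold=failure_threshold)
    return [tracker.record(passed) for passed in results]


if __name__ == "__main__":
    # One isolated timeout is tolerated; three failures in a row are not.
    print(restart_decisions([True, False, True, False, False, False]))
```

With a threshold of 1 the sketch reproduces the current behavior described above (restart on the first failure), so a threshold setting could in principle default to 1 and remain backward compatible.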