Skip to content
This repository has been archived by the owner on Jul 27, 2022. It is now read-only.

Question: Is it ok that getReadinessStatus blocks until all startup checks pass? #62

Open
arcesino opened this issue Sep 8, 2020 · 0 comments

Comments

@arcesino
Copy link

arcesino commented Sep 8, 2020

Hello,

First of all, thanks for all the great work on this repo. So, I found out that if I register a startup check with a promise that takes some time to resolve and then call getReadinessStatus() directly or indirectly (through the /ready endpoint) the call blocks until the startup promise resolves. This contradicts what is in the readme's section Using CloudHealth:

If the startup promises are still running, calls to getLivenessStatus(), getReadinessStatus(), and getStatus(), return STARTING

In this case, the startup promise is still running but I don't get STARTING status. What I'm observing is that I can only get either UP if the promises resolves or DOWN if the promise rejects. Looking at these tests:

  • 'Readiness reports UP with a pending startUp check that will resolve'
  • 'Readiness reports DOWN with a pending startUp check that will fail'

It seems that's indeed the current behavior but I think it could be problematic. For instance, let's say that I have an app that requires from 30s to 60s to start. Then I will create a startup check whose promise will take at least 30s to resolve. Finally, I will use /ready endpoint to configure a readiness probe in k8s to run and set the interval to 30s and a 10s timeout. The result will be that the first probe will hang on my service and will be timed out by k8s after 10s and the same will happen with all subsequent calls until one of the probes happen to succeed given that the startup promise resolves within the 10s timeout of the probe.

I think this could lead to heavy resource usage on the service's end due to hanging/long-running connections. I think it would be better if we could simply collect the current status of a check without having to wait for the check's promise to resolve. For reference, this is the code I'm using to test:

const express = require("express");
const health = require("@cloudnative/health-connect");

const app = express();
const port = 3000;

(async () => {
  const appStartCheck = () =>
    new Promise(function (resolve, reject) {
      setTimeout(function () {
        resolve();
      }, 30000);
    });

  const healthChecker = new health.HealthChecker();
  const check1 = new health.StartupCheck("event:app-ready", appStartCheck);
  healthChecker.registerStartupCheck(check1);

  app.use("/ready", health.ReadinessEndpoint(healthChecker));

  app.listen(port, () => {
    console.log(`Example app listening at http://localhost:${port}`);
  });

  const status = await healthChecker.getReadinessStatus();
  console.log(JSON.stringify("I see this message only after 30s");
})();
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant