Set up a /healthcheck endpoint that can be monitored #2156

Open
edeutsch opened this issue Oct 4, 2023 · 1 comment

@edeutsch
Collaborator

edeutsch commented Oct 4, 2023

Based on today's AHM discussion:
It would be nice to have something like a /healthcheck endpoint that could report substantial problems and could be monitored.

So, for example, when /healthcheck is called, it could:

  • Make sure that the KP info cache is less than 30 minutes old
  • Make sure there aren't a bunch of stale active processes
  • Relay any major errors*

It can start small, but ideally be flexible so that we could add more health checks, too.
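
To make the "start small but stay flexible" idea concrete, here is a minimal sketch of a check registry. Everything in it is an assumption for illustration: the names `CheckResult`, `cache_age_check`, `stale_process_check`, and `run_health_checks`, the one-hour stale-process threshold, and the callables that supply the cache timestamp and process start times are hypothetical, not existing code in this project.

```python
import time
from dataclasses import dataclass, asdict
from typing import Callable, Iterable, List, Optional


@dataclass
class CheckResult:
    name: str
    ok: bool
    detail: str = ""


HealthCheck = Callable[[], CheckResult]

# Checks registered here run on every /healthcheck call (hypothetical registry).
REGISTRY: List[HealthCheck] = []


def cache_age_check(get_cache_mtime: Callable[[], float],
                    max_age_s: float = 30 * 60,
                    now: Callable[[], float] = time.time) -> HealthCheck:
    """Build a check that fails when the KP info cache is too old.

    get_cache_mtime is an assumed callable returning the cache's last-refresh
    time as a Unix timestamp.
    """
    def check() -> CheckResult:
        age = now() - get_cache_mtime()
        return CheckResult("kp_info_cache_age", age < max_age_s,
                           f"cache is {age:.0f}s old (limit {max_age_s:.0f}s)")
    return check


def stale_process_check(get_start_times: Callable[[], Iterable[float]],
                        max_runtime_s: float = 3600,  # assumed threshold
                        max_stale: int = 0,
                        now: Callable[[], float] = time.time) -> HealthCheck:
    """Build a check that fails when too many active processes look stale."""
    def check() -> CheckResult:
        current = now()
        stale = [t for t in get_start_times() if current - t > max_runtime_s]
        return CheckResult("stale_active_processes", len(stale) <= max_stale,
                           f"{len(stale)} process(es) older than {max_runtime_s:.0f}s")
    return check


def run_health_checks(checks: Optional[List[HealthCheck]] = None) -> dict:
    """Run every check; a check that raises is reported as a failure."""
    results = []
    for check in (REGISTRY if checks is None else checks):
        try:
            results.append(check())
        except Exception as exc:
            name = getattr(check, "__name__", "unknown")
            results.append(CheckResult(name, False, f"check raised: {exc!r}"))
    healthy = all(r.ok for r in results)
    return {"status": "ok" if healthy else "error",
            "checks": [asdict(r) for r in results]}
```

A /healthcheck handler could then return the dictionary from `run_health_checks()` as JSON, with a non-2xx status code when `status` is `"error"` so that an external monitor can alert on it. Adding another health check later would only mean appending one more callable to `REGISTRY`.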

Footnote* I have often mused about somehow having a response.error() option, something like tell_a_human=True, for cases where not only did the processing end in error, but the condition really ought to be relayed to an administrator rather than buried in a log file that no one is likely to read.
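
One possible shape for the tell_a_human idea, as a rough sketch only: `SketchResponse` below is a hypothetical stand-in, not the project's real response class, and the `notify_admin` callback (email, chat webhook, or similar) is an assumption about how escalation might be wired up.

```python
import logging
from typing import Callable, List, Optional

logger = logging.getLogger("healthcheck_sketch")


class SketchResponse:
    """Hypothetical response object that can escalate selected errors."""

    def __init__(self, notify_admin: Optional[Callable[[str], None]] = None):
        self.status = "OK"
        self.messages: List[str] = []
        # Errors flagged tell_a_human=True; a /healthcheck could relay these.
        self.escalated: List[str] = []
        self._notify_admin = notify_admin

    def error(self, message: str, error_code: str = "UnknownError",
              tell_a_human: bool = False) -> None:
        entry = f"{error_code}: {message}"
        self.status = "ERROR"
        self.messages.append(entry)
        logger.error(entry)  # always logged, as with any ordinary error
        if tell_a_human:
            self.escalated.append(entry)
            if self._notify_admin is not None:
                self._notify_admin(entry)  # assumed admin channel
```

With something like this, `response.error("KP info cache refresh failed", error_code="CacheRefreshError", tell_a_human=True)` would still be logged as usual, but would also land in a place that a monitored endpoint or an administrator actually sees.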

@saramsey
Member

This would be nice for the ITRB endpoints in particular, where we can't even log in to poke around.
