Add a /health endpoint which does not require authentication #147

Open
anguslees opened this issue Mar 28, 2021 · 6 comments

@anguslees

Trying to deploy restic/rest-server:0.10.0 in k8s. I want a URL endpoint that I can use for "livenessProbe" (GET, should return 200).

As far as I can tell from some testing, guided by mux.go, every route returns 401 Unauthorized. At the moment I'm using a naive TCP liveness probe, which is not nearly as expressive or accurate.

It would be nice if there was a /health endpoint (or similar) which did not require authentication, and returned 200 (assuming the server was ok).

@Enrico204
Contributor

I created a WIP pull request with a proposal for a /health endpoint. For now, the endpoint checks for free space (at least 8MB *) and that the repository path is writable. Should we check for more things?

Maybe we can add some other checks when/if external auth backends are implemented (see #111 and #70).

* I picked this value as the size of one pack, but I don't know whether it's correct. Feel free to propose a different value.

@MichaelEischer
Member

Which checks do we really need for a /health endpoint? Checking whether there is enough free space to store at least one pack file is mostly of academic interest. After all, it only takes a very short time to fill up the remaining free space completely, at which point the server will stop accepting new uploads. And I don't think there is a right answer to what the limit should be; probably every admin will want to use a different limit.

If I understand liveness probes correctly, they are used to restart stuck containers. That is, a failed health probe would cause a container restart. However, restarting the rest-server container once a disk has run full is highly problematic, as that would prevent (read) access to the backup repositories.

And I guess a similar reasoning applies to whether the repository path is writeable (although that might be less of a problem).

@anguslees
Author

Agreed, the focus should be on whether killing+restarting this container would help. This isn't a substitute for more comprehensive monitoring.

A quite reasonable first step is to add no extra logic and just respond with 200 OK immediately from your main HTTP event handler. Even that trivial check still confirms that the program is running, is listening on the correct port, has completed any startup steps, isn't deadlocked or thrashing under memory pressure, etc.

Just for completeness, don't make 'healthiness' depend on reachability/health of some other remote service. This is a common error and leads to cascading failures.

@wojas
Contributor

wojas commented Oct 18, 2021

It would make sense to just add a handler for /health that always returns 200.

I can think of the following additional things to check:

  • Check if the .htpasswd file is readable if htpasswd auth is enabled.
  • Check if the repo root directory exists.

Restarting rest-server will not resolve any of these, but I can imagine that the failure state of the container/pod is useful to administrators. But I think that rest-server will actually fail to start in those cases anyway, in which case adding these checks is not that useful.

As discussed, free disk space is something that can and should be monitored outside of rest-server.

Perhaps we could add a Prometheus metric for write errors?

@wojas
Contributor

wojas commented Oct 18, 2021

As a workaround, you could set the -prometheus-no-auth flag to disable auth on the /metrics endpoint, if you do not mind exposing the metrics or have a reverse proxy that can restrict access to that path in front of the service.

@queeup

queeup commented Feb 21, 2024

I would love to have this for checking whether my rest-server is up before triggering a remote backup with systemd.

Right now I am using this as a check:

curl --silent --fail --head -L http://192.168.1.100:8000/myrestic-backup-repo-name/config
