health-checks: zero route availability improvements #5111

Merged
wasaga merged 2 commits into main from wasaga/route-availability-check-improvement on May 17, 2024

@wasaga wasaga commented May 10, 2024

Summary

This PR makes several improvements to the Zero Route Availability health check:

  • it ignores certificate errors, since the point of this health check is not to validate the certificate but to confirm that the expected Pomerium instance is available at a given hostname
  • if the check fails, it is retried with exponential back-off
  • previously it ran only on a schedule (every 30 minutes); it is now also triggered on configuration changes
  • it adds an "internal error" reason for cases where a health check cannot complete because of an external condition unrelated to the health check itself
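
For readers unfamiliar with the pattern, the first two bullets can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `newProbeClient`, `probe`, `backoffDelays` and `retry` are hypothetical names, the delays are arbitrary, and treating any `200 OK` as success is an assumption (the real check verifies that the responder is the expected Pomerium instance, which is omitted here).

```go
package main

import (
	"context"
	"crypto/tls"
	"errors"
	"fmt"
	"net/http"
	"time"
)

// newProbeClient returns an HTTP client that skips certificate verification.
// That is only reasonable because the probe asks "is something answering
// here?", not "is the certificate valid?".
func newProbeClient(timeout time.Duration) *http.Client {
	return &http.Client{
		Timeout: timeout,
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
		},
	}
}

// probe performs a single availability check against url.
// Assumption: any 200 response counts as "available".
func probe(ctx context.Context, client *http.Client, url string) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return fmt.Errorf("building request: %w", err)
	}
	resp, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("route unreachable: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	return nil
}

// backoffDelays returns n delays that double from initial and are capped at maxDelay.
func backoffDelays(initial, maxDelay time.Duration, n int) []time.Duration {
	if n <= 0 {
		return nil
	}
	delays := make([]time.Duration, 0, n)
	d := initial
	for i := 0; i < n; i++ {
		delays = append(delays, d)
		if d *= 2; d > maxDelay {
			d = maxDelay
		}
	}
	return delays
}

// retry runs check once, then re-runs it after each delay until it succeeds,
// the delays are exhausted, or ctx is cancelled.
func retry(ctx context.Context, delays []time.Duration, check func(context.Context) error) error {
	err := check(ctx)
	for _, d := range delays {
		if err == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(d):
		}
		err = check(ctx)
	}
	return err
}

func main() {
	// A stub check that fails twice, so the example runs without a network.
	attempts := 0
	flaky := func(context.Context) error {
		attempts++
		if attempts < 3 {
			return errors.New("route not ready")
		}
		return nil
	}
	err := retry(context.Background(), backoffDelays(10*time.Millisecond, 40*time.Millisecond, 4), flaky)
	fmt.Println("attempts:", attempts, "err:", err)
}
```

To check a real route, the stub would be replaced by a closure such as `func(ctx context.Context) error { return probe(ctx, client, url) }`, with `client` built by `newProbeClient`.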

Related issues

Fixes: https://github.com/pomerium/pomerium-zero/issues/2360

User Explanation

Checklist

  • reference any related issues
  • updated docs
  • updated unit tests
  • updated UPGRADING.md
  • add appropriate tag (improvement / bug / etc)
  • ready for review

@wasaga wasaga requested a review from a team as a code owner May 10, 2024 14:02
internal/zero/healthcheck/check_routes.go Dismissed
@coveralls

Coverage Status

coverage: 56.37% (-0.1%) from 56.512%
when pulling 7d8af09 on wasaga/route-availability-check-improvement
into 568e99f on main.

"github.com/pomerium/pomerium/pkg/protoutil"
)

func (c *checker) ConfigSyncer(ctx context.Context) error {

Why should the health checks sync config separately from the rest of the app?

What is the difference between this approach and registering an OnConfigChange() listener on the main config source?

@wasaga wasaga May 10, 2024

That's a fair point; however, here we only have access to the lowest layer of config.
By design, higher layers are notified when lower-layer configs change, but not the other way around.

We would have to move the checker loop initialization to an entirely different place to be able to attach to a regular config.Source. There are benefits to doing that; I can make a separate PR for it.
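
To make the layering point concrete, here is a rough, self-contained sketch of one-directional change propagation. It is not Pomerium's actual `config.Source` implementation: `Config`, `ChangeListener`, `layer` and its methods are hypothetical stand-ins chosen for illustration only.

```go
package main

import (
	"fmt"
	"sync"
)

// Config is a hypothetical stand-in for a real configuration object.
type Config struct{ Routes []string }

// ChangeListener is called with the new config after a layer changes.
type ChangeListener func(Config)

// layer is one level in a stack of config sources. A layer built on top of
// another subscribes to it, so changes flow upward only.
type layer struct {
	mu        sync.Mutex
	cfg       Config
	listeners []ChangeListener
	transform func(Config) Config // optional enrichment applied by this layer
}

// newLayer creates a layer; if lower is non-nil, the new layer follows it.
func newLayer(lower *layer, transform func(Config) Config) *layer {
	l := &layer{transform: transform}
	if lower != nil {
		l.set(lower.get())
		lower.onChange(l.set)
	}
	return l
}

func (l *layer) get() Config {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.cfg
}

func (l *layer) onChange(fn ChangeListener) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.listeners = append(l.listeners, fn)
}

// set stores the (transformed) config and notifies this layer's listeners.
// It never reaches down into the layer below.
func (l *layer) set(in Config) {
	out := in
	if l.transform != nil {
		out = l.transform(in)
	}
	l.mu.Lock()
	l.cfg = out
	listeners := append([]ChangeListener(nil), l.listeners...)
	l.mu.Unlock()
	for _, fn := range listeners {
		fn(out)
	}
}

func main() {
	bottom := newLayer(nil, nil)
	top := newLayer(bottom, func(c Config) Config {
		return Config{Routes: append([]string{"added-by-top"}, c.Routes...)}
	})

	// A checker attached to the bottom layer only ever sees the raw config.
	bottom.onChange(func(c Config) { fmt.Println("bottom listener:", c.Routes) })
	top.onChange(func(c Config) { fmt.Println("top listener:", c.Routes) })

	bottom.set(Config{Routes: []string{"a.example.com"}}) // both listeners fire
	top.set(Config{})                                     // only the top listener fires
}
```

In this sketch a listener registered on `bottom` is unaware of anything `top` adds, which is the limitation described above for a checker that can only attach to the lowest layer.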

@wasaga wasaga merged commit 8269a72 into main May 17, 2024
16 checks passed
@wasaga wasaga deleted the wasaga/route-availability-check-improvement branch May 17, 2024 20:47

3 participants