Alerting: added time based restrictions to alert rules #33075
Closed
anyone else sick of these +1 emails that go on for years and years and years....??
well, i had some extra time this week and felt like giving this a go ;)
here's what the alert tab looks like with my changes
features:
tried to minimize touch points as i'm pretty sure this entire alerting framework will change sooner rather than later. actually had a merge conflict even though i pulled a few days ago.
it feels like the alerts started as a super simple, straightforward query check (1 condition). then as different needs came up, it slowly got augmented with all the other bells and whistles we see today, like thresholds, frequency, emails, dashboards, etc. however, under the covers, there is still a strong dependency on checking specific 'conditions'. based on my interaction with it so far, it doesn't really seem like this is a sustainable design moving forward.
also see this ticket comment: #7832 (comment)
i really had to shoe-horn the idea of time into the backend evaluators, especially the idea of a 'parent' time for the whole panel. is it the cleanest code i've written? no. i'm sure there's a much cleaner way to integrate with the existing AlertEvaluators, but like i said, there's a lot of churn in these files and i'm trying to balance my time with minimizing touch points.
some gotchas:
-- there is a context-level 'starttime' parameter that gets set when the alerting check is kicked off (driven by the top-level 'evaluate every' input)
-- that time parameter is what drives the alert range time, and there is no guarantee of when the check actually runs, so be careful with super specific time requirements; it's probably not precise enough for something like 12:00:01
this fork is off of latest master as of 4/15/21, but i'm thinking this could be backported to whatever version first had an alerting framework. the files in the framework have changed slightly, so i don't think the patch will necessarily apply cleanly; however, it should be obvious where the changes go.
i have a pretty good handle on the alerting framework now, so if you guys want anything else lemme know.
unless it's javascript... i battled that render maze long enough today.
ps grafana hmu, my fees are reasonable