Alerting support for queries using template variables #6557

Closed
calind opened this issue Nov 12, 2016 · 142 comments
Labels
area/alerting/evaluation Issues when evaluating alerts area/alerting Grafana Alerting type/feature-request

Comments

@calind

calind commented Nov 12, 2016

It would be pretty useful if Grafana supported alerting for queries that use template variables. The way I see it working is as follows:

  1. Generate queries for each template variable combination (discarding the All value of each template variable)
  2. When generating queries, use the frozen list if the template variable is set to never refresh; otherwise update the template variable list
  3. Allow filtering (through a regex or by providing a static value) for each template variable

The current workaround is to use an invisible wildcard metric, but the problem I see with this approach is that it loses context.
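The three steps proposed above can be sketched roughly as follows. This is only an illustration of the idea, not how Grafana works: `expand_queries`, the `$host`/`$core` variables, and the Graphite-style query string are all hypothetical, and the caller is assumed to supply each variable's value list (frozen or freshly refreshed) with the All entry already removed.

```python
import itertools
import re
from string import Template


def expand_queries(query, variables, filters=None):
    """Yield (values, rendered_query) for every combination of variable values.

    variables: dict of variable name -> list of candidate values
    filters:   optional dict of variable name -> regex; values that do not
               fully match the regex are skipped (step 3 of the proposal)
    """
    filters = filters or {}
    names = sorted(variables)
    candidate_lists = []
    for name in names:
        pattern = filters.get(name)
        candidate_lists.append([
            value for value in variables[name]
            if pattern is None or re.fullmatch(pattern, value)
        ])

    template = Template(query)
    # One query per element of the Cartesian product of the filtered lists.
    for combo in itertools.product(*candidate_lists):
        values = dict(zip(names, combo))
        yield values, template.substitute(values)


queries = list(expand_queries(
    "cpu.$host.$core.idle",
    {"host": ["web1", "web2", "db1"], "core": ["0", "1"]},
    filters={"host": r"web\d+"},
))
for values, rendered in queries:
    print(values, rendered)
```

Because each rendered query keeps the variable values that produced it, an alert fired from it could carry that context along, which is what the wildcard workaround loses.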

@oiooj
Contributor

oiooj commented Nov 13, 2016

+1

@bergquist
Contributor

  1. What would be the difference compared to just using all?

@bergquist bergquist added area/alerting Grafana Alerting area/alerting/evaluation Issues when evaluating alerts type/feature-request labels Nov 14, 2016
@antoinerrr

+1
It would be nice to be able to add alerting on servers with a short lifetime (AWS auto scaling). Auto-registering the servers in Grafana is easy with templating, but it's sad not to be able to put alerting on them.

@calind
Author

calind commented Nov 15, 2016

@bergquist using all is impractical, for example, when you have more than a dozen hosts.


If, for example, only a few of them are failing (let's say 5), it is very useful to receive an email for each failing alert. This also makes it much easier to integrate with other tools, which in general expect one alert per metric.

The current approach (using all) is pretty neat though when there are fewer instances or when you are alerting at service level (eg. # of jobs in queue).

@Deshke

Deshke commented Nov 15, 2016

What @calind said. I've got multiple $host variables which work fine with InfluxDB but not with the alerts.

@NotSoCleverLogin

+1 as well.

Just a thought, since you are able to query with a template variable, wouldn't you just be able to do the same query with the alerting metrics and maybe iterate through the results to see which meet the alert criteria?
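As a rough sketch of that idea, the alert check could walk over every series the query returns and report each breaching series separately. The function name, the input shape (series name mapped to samples), and the "average above threshold" condition are all assumptions made for illustration.

```python
from statistics import mean


def evaluate_per_series(series, threshold):
    """Return one alert dict per series whose average exceeds the threshold.

    series: dict of series name -> list of numeric samples
    """
    alerts = []
    for name, samples in sorted(series.items()):
        if not samples:
            # No data: skipped here; a real rule would need a no-data policy.
            continue
        average = mean(samples)
        if average > threshold:
            alerts.append({"series": name, "value": average, "state": "alerting"})
    return alerts


result = evaluate_per_series(
    {"web1": [91, 95, 93], "web2": [20, 25, 30], "db1": [88, 97, 100]},
    threshold=90,
)
for alert in result:
    print(alert)
```

Each entry in the result names a single series, so a notifier could send one message per failing host instead of one message for the whole panel.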

@bergquist
Contributor

@NotSoCleverLogin It would be possible. But would you want to change the behavior of the alert rule based on which template values are selected?

Using the all option for the template is the only way that makes sense for me.

@mstaalesen

+1

I have a setup of X environments with the same components in each environment. We are currently using prometheus to alert on e.g cpu usage/disk usage etc. There we specify an alert for a query, and when the alert is triggered it will just state which environment the alert was triggered from.

If we would do this with the All variable, that would work to some extent. But, using @calind's example, the screenshot would be filled with the trend of all cpus from all of my environments, and not just the environment where I would want to be informed about said problem. The graph will (or can) be obscured with information from other environments. In some scenarios it could be interesting to compare cpu in other environments, but there are no guarantees that what is happening in a test environment is happening in our production environment, etc.

We are also looking into creating dashboards that can be used by operations, showing annotations for alerts in the "standard" overview dashboard. Given that we use 'env' template variables for these kinds of dashboards, it's not really possible for us to do that with the current implementation. I would have to manually (at least to some extent) generate a "shadow" dashboard where the alerts are triggered (which makes me lose the annotations in the overview dashboard).

Another thing I think template variables can help you do is to route the alerts (should you choose to implement such a feature) to different sources (some to operations if in production, to qa/developers if in test environments etc).

@StianOvrevage

+1 for supporting alerts on templated queries.

@calind
Author

calind commented Nov 24, 2016

@bergquist, some dashboards don't have an All option. For example, system metrics by collectd (https://grafana.net/dashboards/24). Having an All option would certainly not be practical for, let's say, 10 or more servers. That's why we need to iterate through template variables.

@StianOvrevage

Allowing use of All is a good and welcomed start.

In Prometheus, queries need to be written in a different way to allow All:

some.metric{hostname=~"$Hostname"}

Notice the extra tilde there, allowing for regular expression searching (and the wildcard in All).

I have not benchmarked the possible performance impact of going from a straight query to a regex search query but at least for now it would apparently solve our problems.
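To illustrate why the tilde matters, here is a small sketch of how a multi-value selection could be rendered into that query. `render_multi_value` is a hypothetical helper, not Grafana's interpolation code, and it assumes the selected values contain no regex metacharacters. All is modelled here simply as the pattern `.*`.

```python
import re


def render_multi_value(query, name, selected):
    """Replace $name in the query with a regex alternation of the selected values."""
    alternation = "|".join(selected)
    # \b keeps $Hostname from also matching a longer name such as $HostnameGroup.
    pattern = r"\$" + re.escape(name) + r"\b"
    # A function replacement inserts the text literally, with no backslash handling.
    return re.sub(pattern, lambda _match: alternation, query)


query = 'some.metric{hostname=~"$Hostname"}'
single = render_multi_value(query, "Hostname", ["web1"])
multi = render_multi_value(query, "Hostname", ["web1", "web2", "db1"])
everything = render_multi_value(query, "Hostname", [".*"])
print(single)
print(multi)
print(everything)
```

With a plain `=` matcher the rendered string `web1|web2|db1` would be compared literally and match nothing, whereas `=~` treats it as an alternation. Prometheus anchors regex matchers, so `web1` on its own still matches only that exact hostname.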

@max3163

max3163 commented Nov 29, 2016

+1

@jordandev

+1

@steverweber

steverweber commented Dec 2, 2016

not sure how it should be implemented, just know it's needed..

@Krylon360

+1
We use Prometheus as the data source to monitor our Kubernetes infrastructure for both our on-prem K8S clusters and our AWS K8S clusters.
All of our dashboards use templated variables for the data source ($Environment), $Instance/Node, $Namespace, and $Pod.
Because of how Prometheus queries are structured, all of our queries contain templated variables, which prevents the alert rules from being saved.
I would love to see templated variable queries added to the alerting.

@andrewawagner

+1

@shervinkh

+1
We use templated dashboards for a multi-server environment, which is the logical way (and what many people do), so we can't use alerting with Grafana right now. The only options are to maintain a separate non-templated dashboard or to set up alerting in Prometheus itself, which is not easy.

@steverweber

steverweber commented Dec 8, 2016

Perhaps if there were an option or a simple way to save/export a dashboard with the template variables baked/pre-rendered into all the fields... this would perhaps be a good halfway point until another solution is found.
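A minimal sketch of such a pre-rendering step, run outside Grafana on an exported dashboard, might look like this. It assumes the export is a dict with a `panels` list whose entries hold `targets` with an `expr` string (the shape commonly seen for Prometheus panels); other data sources keep their query text under different keys, and `bake_variables` is a hypothetical name.

```python
import copy
import re

# Matches both $name and ${name} variable references.
VARIABLE = re.compile(r"\$(\w+)|\$\{(\w+)\}")


def bake_variables(dashboard, values):
    """Return a copy of the dashboard with variables replaced in target expressions."""

    def substitute(match):
        name = match.group(1) or match.group(2)
        # Unknown variables are left untouched rather than guessed.
        return values.get(name, match.group(0))

    baked = copy.deepcopy(dashboard)
    for panel in baked.get("panels", []):
        for target in panel.get("targets", []):
            if "expr" in target:
                target["expr"] = VARIABLE.sub(substitute, target["expr"])
    return baked


dashboard = {
    "title": "Node overview",
    "panels": [
        {"title": "Load", "targets": [
            {"expr": 'node_load1{instance="$host", env="${env}"}'},
        ]},
    ],
}
baked = bake_variables(dashboard, {"host": "web1:9100", "env": "prod"})
print(baked["panels"][0]["targets"][0]["expr"])
```

The original dict is left unchanged, so the templated dashboard and the baked copy used for alerting can be kept side by side.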

@daraeburn

+1 for supporting alerts on templated queries. We currently use templating on all our dashboards so can't take advantage of this really cool feature.

@tsn77130

tsn77130 commented Dec 12, 2016

+1, we have a lot of templated dashboards and can't use alerting for now. We have to duplicate dashboards to get alerts, and so we lose the power of templating.

@drewboswell

drewboswell commented Dec 12, 2016

+1, Almost all of our dashboards use template variables (and nested template variables).

We would like to be able to set alerts on repeat panels to get individual alerts per template-variable group if needed. Plus this means that the alerting is dynamic and not super manual as it is now.

DANGER: Variables in theory will be good to have, but we need to keep in mind that if some guy goes into your dashboard and changes the value and saves, the resulting alerting will be affected. Don't know if that's ok behaviour or not, will be complicated.

@ebirukov

+1

@erSitzt

erSitzt commented Dec 12, 2016

When working with grafana it feels like templating is encouraged everywhere and it feels wrong to create an extra set of graphs not using variables just to use the alerting feature...

@kanwangzjm

+1 for supporting alerts on templated queries.
also, we found that when we use Chinese ruleName or Chinese title, we received abnormal email with rule triggered. For example, we expected “个股分时线接口请求时间(getTimeTrend) alert” but received "个è�¡å��æ�¶çº¿æ�¥å�£è¯·æ±�æ�¶é�´(getTimeTrend) alert", maybe the charset is not correct.

@jessover9000
Contributor

jessover9000 commented Jan 14, 2021

Hello Grafana community, the Grafana team has picked up the work on Alerting and we're in the process of redesigning it to make the best possible alerting experience happen 🔥 🚀 We would love to find out more about your needs as our beloved users. So if any of you are willing to have a 30-minute interview with me, please just send me an empty e-mail and I will get in touch.

Update: I got so many e-mails in such a short time, you all rock! I'll be reaching out to everyone who sent e-mails, we have enough interviewees now, thank you <3

@kylebrandt
Contributor

The new beta version of alerting in Grafana 8 (opt-in with the "ngalert" feature toggle) has moved alerts out of dashboards, so alert rule queries based on template variables no longer fit into the model.

The new version does support "multi-dimensional" alerting based on labels which seems to be close to the core functionality underlying this issue. So one can have multiple alert instances from a single rule. More at #7832 (comment) .

@grafana grafana unlocked this conversation Jun 8, 2021
@fkaleo

fkaleo commented Jun 9, 2021

The new beta version of alerting in Grafana 8 (opt-in with the "ngalert" feature toggle) has moved alerts out of dashboards, so alert rule queries based on template variables no longer fit into the model.

The new version does support "multi-dimensional" alerting based on labels which seems to be close to the core functionality underlying this issue. So one can have multiple alert instances from a single rule. More at #7832 (comment) .

Is there a way to cover this use case:
As a user, if I see a metric in a dashboard with template variables that I would like to create an alert for, is it possible to easily create that alert (in the ngalert model) that is based on that query (with the template variables replaced by their current values)

@kylebrandt
Contributor

@fkaleo It doesn't seem to currently do this. Resolving those variables when clicking "create alert" from the dashboard would happen on the frontend side, which I know less about. But it sounds doable and like a good idea to me. Can you create a new enhancement request for that?

@UAnton

UAnton commented Jun 10, 2021

Can't find ngalert feature toggle

@kylebrandt
Contributor

Can't find ngalert feature toggle

If using .ini files for your configuration, it would be adding the following to the config:

[feature_toggles]
enable = ngalert

@anybodysguest

@torkelo I've been keeping up to date with the latest Grafana releases hoping that eventually I'd be able to use variables in alert queries. I tried ngalerts and while I didn't spend a lot of time with them I had the following thoughts:

  1. When this feature is turned on, existing alerts become invalid. Non-ngalerts don't co-exist with the newer ngalerts.
  2. I found that ngalerts was overly complicated, at least compared to the non-ngalerts version.
  3. While ngalerts may bring the ability to use template variables I didn't test it due to the fact I was turned off by the fact that all of my existing alerts would effectively have to be re-written.
  4. I use my alerts with a webhook to xMatters. Previously I was able to add tags to my alerts and those tags would be sent to xMatters. My tags specify the type of alert being fired (for instance CANARY) and the severity (for example High). When xMatters receives that webhook, I can "unpack" those tags to determine the flow my incident workflow will take. With ngalerts this no longer works because the format of the payload being sent to xMatters has changed. There is no longer a Tags array in the payload.

So for net-new dashboards and alerting, ngalerts is probably a great thing. But what happens to the many of us who are not using ngalerts and simply want to use a constant template variable in the alert query or in the alarm message itself?

I still don't understand why template variables can't be used with alert queries. I use template variables heavily, whenever I can so that I have precise control over my dashboard queries. In that respect there is no problem. It is only when I have an alert attached that things go sideways. That is an "impedance-mismatch" I don't understand.
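For the webhook problem in point 4, one possible workaround on the receiving side is a small adapter that accepts either payload format. The sketch below is not taken from this thread: it assumes the legacy payload carries a top-level `tags` mapping and the newer payload carries an `alerts` list whose entries have a `labels` mapping, so check both shapes against real payloads before relying on it. `extract_tags` is a hypothetical name.

```python
def extract_tags(payload):
    """Return a flat dict of tags from either assumed payload shape.

    Assumed shapes (not confirmed by this thread):
      legacy: {"tags": {...}, ...}
      newer:  {"alerts": [{"labels": {...}}, ...], ...}
    """
    if isinstance(payload.get("tags"), dict):
        return dict(payload["tags"])
    merged = {}
    for alert in payload.get("alerts", []):
        # Later alerts overwrite earlier ones when label names collide.
        merged.update(alert.get("labels", {}))
    return merged


legacy = {"ruleName": "Canary check", "tags": {"type": "CANARY", "severity": "High"}}
newer = {"alerts": [{"labels": {"alertname": "Canary check", "type": "CANARY", "severity": "High"}}]}
print(extract_tags(legacy))
print(extract_tags(newer))
```

With an adapter like this, the routing logic downstream can keep reading `type` and `severity` regardless of which alerting system produced the webhook, provided the same names are set as labels on the new rules.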

@Guanpeng520

Hi all, had a question about adding alerts.
See the documentation: https://grafana.com/docs/grafana/next/alerting/old-alerting/add-notification-template/

it’s suggested that the query triggering the alert can have labels from the query templated into the alert message. As per the documentation, this is how:

Refer to the alert query labels in the alert rule name and/or alert notification message field by using the ${Label} syntax.


Who knows where the ${Label} was created? Thanks so much.

@lujinke

lujinke commented Mar 22, 2022

Yes, I don't understand why the Grafana dev team does not want to support this when so many people are eager for it.

@DEvil0000

DEvil0000 commented Mar 22, 2022 via email

@gvidasja

gvidasja commented Apr 4, 2022

hi, not sure if such idea was mentioned before, but maybe it's possible to have two sets of variable values:

  • the current ones that are used for displaying the dashboard
  • ones used when performing alert queries. They could be set somewhere in the alert settings

this way alerts would not be affected by someone changing variable values and saving the dashboard.

Something like this:
image
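In code terms, the two-value-set idea could look roughly like this. The names are hypothetical and this is only a sketch of the suggestion: the alert values live in their own mapping and never fall back to the dashboard's current selection.

```python
from string import Template


def render(query, dashboard_values, alert_values, for_alert=False):
    """Render a query with either the dashboard's or the alert's variable values.

    Alert evaluation uses alert_values only, so a missing alert value raises
    KeyError instead of silently picking up the dashboard's selection.
    """
    values = alert_values if for_alert else dashboard_values
    return Template(query).substitute(values)


query = 'cpu_usage{env="$env"}'
dashboard_values = {"env": "test"}   # whatever a viewer last selected and saved
alert_values = {"env": "prod"}       # set once in the alert settings

print(render(query, dashboard_values, alert_values))
print(render(query, dashboard_values, alert_values, for_alert=True))
```

Saving the dashboard with a different `env` selected changes only the first result; the alert query keeps evaluating against `prod`.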

@Tomasz-Kluczkowski

Any progress on this, Grafana? Sorry to be a nuisance, but this pretty much invalidates template variables for me.
If I use a variable to define graphs but I cannot use it for alert and need to re-do the query without a variable, what is the point?

@amohamedhey

Hey, Any update on this?

@svet-b

svet-b commented Sep 21, 2022

It's not quite clear what sort of further updates people are expecting on this. This feature has been implemented (per comment above) and the issue is correspondingly closed.

Additional comments (and I'm conscious I'm right now a culprit here...) simply generate spam for the 800+ people still subscribed to the issue. Follow-up bug reports or enhancement requests ought to be captured in separate issues, while questions about the functionality can be addressed in the community forum.

@anybodysguest

@amohamedhey @svet-b @Tomasz-Kluczkowski My understanding is that this ability was added to Grafana for Unified Alerting (introduced in Grafana 8). I don't think a lot of new feature work will be going into the legacy alerts as Unified Alerting is the "new normal". That is my understanding. @svet-b please correct me if I'm wrong.
So, that said, if you are using legacy alerting, you still won't be able to use template variables in the alert queries. If you do, you will be warned, and if you proceed to use them, the alert will be disabled. If you have the ability to easily switch to Unified Alerting, I'd do that. I don't think Legacy and Unified Alerting can be used at the same time on a Grafana installation; I haven't checked to see if that's possible.

@dylan-tao

Hey, Any update on this?
