alert not working with templating #6230

Closed
qingweiqm opened this issue Oct 11, 2016 · 43 comments

@qingweiqm

  • I'm submitting a bug report.

When a graph uses a host template variable, alerting behaves abnormally. If I clear the templating setting in the graph, the alert works fine.

Please include this information:

  • What Grafana version are you using?
    The newest version
  • What datasource are you using?
    InfluxDB
  • What OS are you running grafana on?
    CentOS 7
  • What did you do?
    Used templating together with alerting
  • What was the expected result?
    The two should work together
  • What happened instead?
    The alert behaves abnormally

IMPORTANT If it relates to metric data viz:

  • An image or text representation of your metric query
    [three screenshots, 2016-10-11]
  • The raw query and response for the network request (check this in chrome dev tools network tab, here you can see metric requests and other request, please include the request body and request response)
@bergquist
Contributor

Template variables are not supported in alerting yet. It seems the warning about this is not shown for InfluxDB queries. I'll fix it.

@bergquist bergquist self-assigned this Oct 11, 2016
@qingweiqm
Author

When will it be supported?

@bergquist
Contributor

bergquist commented Oct 11, 2016

Not sure when templating will be supported.

How would you expect templating to work in this case?

@bergquist bergquist added the area/alerting Grafana Alerting label Oct 11, 2016
@zihaoyu
Contributor

zihaoyu commented Oct 25, 2016

Any estimate on when this will support Graphite?

@antoinerrr

Template variables are not supported in alerting yet. It seems the warning about this is not shown for InfluxDB queries. I'll fix it.

Would be very nice to have this

@davidlmaldonado

Having the same issue with Graphite as well.

@nmartinez-nimeops

Same issue with Graphite too.
Support for this would be great!

@jlutzhbo

jlutzhbo commented Dec 7, 2016

Is there an open feature request? Looking for this as well. Thanks.

@bergquist
Contributor

bergquist commented Dec 7, 2016

@jlutzhbo ref #6557

@AjeetK

AjeetK commented Feb 15, 2017

Same issue with Prometheus.

@dugajean

Since I'm not sure what the alternatives would be without using variables, I'd like to see this feature implemented as well.

@torkelo
Member

torkelo commented Apr 27, 2017

The alternative is to write alert queries that are wider in scope and use wildcards or regexes so that they target multiple series. No need for variables.

Variables are for exploration, dynamic dashboards, and dynamic filtering.

@dugajean

dugajean commented Apr 28, 2017

@torkelo This is my first time using Grafana (or any other metrics-related program), so I'm not very familiar with the concepts. If I have a query like netdata.*.system.cpu.system (where the wildcard is the server name), and I alert when CPU usage is higher than 30%, will it alert if the CPU usage of any server instance exceeds 30%?

@torkelo
Member

torkelo commented Apr 28, 2017

Yes, it will check all series. But currently it does not keep state per series. So if one server has high CPU, the alert rule will trigger (with the info on which server) and send out a notification. If a minute later another server also has high CPU, it will not trigger a new notification, because the alert rule is still in the firing state.
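
The behaviour described here can be sketched in a few lines of Python. This is a toy model, not Grafana's implementation: the `evaluate` function, the `web1`/`web2` server names and the sample values are made up for illustration, and only the query pattern and the 30% threshold come from the question above. It assumes a notification goes out only when the state of the whole rule changes.

```python
from fnmatch import fnmatch

THRESHOLD = 30.0  # percent CPU, from the question above


def evaluate(rule_state, pattern, samples):
    """Evaluate one alert rule over every series matching `pattern`.

    Returns (new_state, notify, offenders). The rule keeps a single state
    for the whole query, so `notify` is True only when that state changes.
    """
    # Note: fnmatch's `*` also matches dots, unlike a Graphite wildcard,
    # which stays within one path node. That is close enough for this sketch.
    offenders = sorted(
        name for name, value in samples.items()
        if fnmatch(name, pattern) and value > THRESHOLD
    )
    new_state = "alerting" if offenders else "ok"
    notify = new_state != rule_state  # a state change, not a new offender
    return new_state, notify, offenders


pattern = "netdata.*.system.cpu.system"
state = "ok"

# Minute 1: only web1 is above the threshold.
minute_1 = {
    "netdata.web1.system.cpu.system": 45.0,
    "netdata.web2.system.cpu.system": 12.0,
}
# Minute 2: web2 is now above the threshold as well.
minute_2 = {
    "netdata.web1.system.cpu.system": 47.0,
    "netdata.web2.system.cpu.system": 60.0,
}

state, notify_1, offenders_1 = evaluate(state, pattern, minute_1)
state, notify_2, offenders_2 = evaluate(state, pattern, minute_2)
print(notify_1, offenders_1)
print(notify_2, offenders_2)
```

In the second evaluation both servers are over the threshold, but `notify_2` is false because the rule was already alerting. Keeping one state per series name instead of one per rule would be the change needed to get a notification for the second server.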

@MrMMorris

So if I have a dashboard that creates dynamic graphs using template variables, should I create a separate dashboard without template variables to use for alerting?

@torkelo
Member

torkelo commented Jun 19, 2017

@MrMMorris yes

@nilapshah

nilapshah commented Sep 22, 2017

I want an alert, sent by email, if replica set lag is more than 3600 seconds. Is it possible to do this in Grafana?
If yes, can anyone explain the steps?

@mikob

mikob commented Nov 11, 2017

@MrMMorris you don't need an entirely new dashboard; you can just duplicate the query, fill in the variable with what the alert should handle, and hide the query from the graph.

@7yl4r

7yl4r commented Dec 4, 2017

Templates define a view and thus it doesn't make sense to use them with alerting. But alerting can be set up using the template variable's underlying query, and I think in most cases this is what people are looking for. I think @bergquist's comment over on #6557 sums it up well:

reusing the selected template variable for alerting would be dangerous since people can choose to view just one option and then forget to change back to All or something wider.
...
One solution for this problem would be to have two values for each template variable. One for visualization in the dashboard and one for alerting.

Something similar to the solution he describes is easy to set up once you know what to do (keep your "view" query using templates, and set up a hidden query using wildcards for the alert):

  1. Duplicate the query with the template variables you wish to alert on.
  2. Replace the template variables with wildcards (use the same ones as in your template variable query).
  3. Define an aggregation method for the set of series (e.g. avg, max, sum).
  4. Hide the new query from the view.
  5. Set up alerting using your new summary query.
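
Steps 1 to 4 can be sketched in Python for a Graphite-style query. This is a minimal illustration under stated assumptions, not a Grafana API: `to_alert_query`, the `servers.$host.disk...` metric path and the variable names `host` and `mount` are hypothetical, and the target dictionary only mimics the general shape of a panel query (`refId`, `target`, `hide`). `maxSeries` is the Graphite function used here for the aggregation step.

```python
import re

# Matches the variable spellings Grafana accepts in queries:
# ${name} or ${name:format}, $name, and [[name]] or [[name:format]].
VARIABLE = re.compile(r"\$\{(\w+)(?::\w+)?\}|\$(\w+)|\[\[(\w+)(?::\w+)?\]\]")


def to_alert_query(query, wildcards):
    """Swap each template variable for the wildcard supplied for it."""
    def substitute(match):
        name = match.group(1) or match.group(2) or match.group(3)
        # Variables without a supplied wildcard are left untouched.
        return wildcards.get(name, match.group(0))
    return VARIABLE.sub(substitute, query)


# Step 1: start from the "display" query (query A), which uses variables.
display_query = "servers.$host.disk.${mount}.used_percent"

# Step 2: replace the variables with wildcards.
inner = to_alert_query(display_query, {"host": "*", "mount": "*"})

# Step 3: aggregate the matched series into one summary series.
alert_query = f"maxSeries({inner})"

# Step 4: the summary query (query B) is hidden from the graph.
summary_target = {"refId": "B", "target": alert_query, "hide": True}
print(summary_target["target"])
```

The alert condition in step 5 would then be defined on query B rather than on query A.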

Here is an example set up with query "A" as my "display query" and query "B" as the "summary query" I use for alerting (aggregated with max over the series & tested w/ max in the alert tab as well).

image

image

I'm posting this and suggesting a change to the docs/message because 30min ago when I came across this thread I thought I was going to have to create a graph for each of my servers in order to make alerts work. I suspect a lot of the angst above comes from similar misunderstandings.

@fchiorascu

fchiorascu commented Jan 3, 2018

Hi,

Sorry to reopen this topic.
I tried to set up an alert in a Grafana dashboard, but it does not work when the dashboard uses templating.
I noticed there are open topics on GitHub about this: #6230, #6557.

–=Solution=–
Is the solution to have one dashboard per server in order to have alerting working?
Or is it to have one dashboard per service?

Kind Regards,

Example of templating:
image
image

As I understand it, it is not possible to have an alert per server when the servers are shown in the same dashboard.

If it is possible, kindly give an example or your best recommendation.

@7yl4r

7yl4r commented Jan 3, 2018

@fchiorascu have you tried my suggestion above? (Set up two queries: one for the "view" using templates, and a second hidden query using wildcards for alerting).

@MrMMorris

MrMMorris commented Jan 3, 2018

@7yl4r I have tried that technique and I can get it to alert, but it doesn't tell me which server it is alerting on. Is your solution supposed to provide that? I couldn't find a way to pass along the specific server that is alerting.

@7yl4r

7yl4r commented Jan 3, 2018

@MrMMorris : My solution is not able to tell you which series is alerting. For mine I can tell which series is alerting by looking at the graph included with the alert.

@fchiorascu

I'll come back with feedback on your proposal by the end of the week.
Thank you.

@ashuw018

@7yl4r Hi, there are two problems with the solution you provided.

  1. It will not tell you which series is causing the alert.
  2. It will only alert when the graph is not already in the alerting state. Once any series crosses the threshold, it alerts and the graph goes into the alerting state; if another series then also crosses the threshold, it will not alert again because the graph is already in the alerting state.

Because of this behavior, alerting within Grafana is currently difficult to use.

@7yl4r

7yl4r commented Feb 16, 2018

You are absolutely correct on point (2).

As for (1), you are technically correct, but here is a screenshot of one of my alert messages:

image

It doesn't explicitly tell me which series is causing the alert, but I can clearly see it is the mbon.... disk. I am unable to imagine a case where an alert is triggered, but the problem series cannot be identified from the plot. If you can't tell when something is wrong from the plot, what is the point of the plot?

@MrMMorris

MrMMorris commented Feb 16, 2018

I am unable to imagine a case where an alert is triggered, but the problem series cannot be identified from the plot. If you can't tell when something is wrong from the plot, what is the point of the plot?

I don't understand, are you arguing against making this feature available because you can't imagine why someone would need it? I can understand if this is what's available for the time being, but your wording makes it sound like you are pushing back against this being possible.

The fact that I have to go to my monitoring frontend, login, find the dashboard, find the graph, and decipher which is alerting is many more steps than seeing an email alert and connecting to a server.

@7yl4r

7yl4r commented Feb 16, 2018

Sorry if that came off as dismissive. I was trying to say that (2) should probably be prioritized over (1) unless a user story can be defined to demonstrate the issue.

The fact that I have to go to my monitoring frontend, login, find the dashboard, find the graph, and decipher which is alerting is many more steps than seeing an email alert and connecting to a server.

Does this happen to you? That is a screenshot of the email I received; only one step. I guess there are probably other alerting configurations for which (2) would be more important.

@MrMMorris

MrMMorris commented Feb 16, 2018

No worries, I didn't think that was your intention 😄 I will provide a rebuttal to being able to easily identify the server from the plot:

  1. I believe that to get the legend in the email screenshot, the legend has to be turned on for the graph on the dashboard? If so, that really mucks up dashboards with unnecessary information and space usage.

  2. Can you easily tell which server(s) go above 15GB in this graph? 😅

[screenshot, 2018-02-17]

Hint: There are 3 servers that go above 15GB, but two of them overlap almost perfectly and one is barely visible on the black background and just barely pops above it ¯\_(ツ)_/¯

@sparr

sparr commented Feb 28, 2018

@MrMMorris our solution to the "mucks up the dashboard" is that we just have two different versions of each dashboard. One for people to look at in normal operations, one for alerts that only gets looked at when there's an alert.

@MrMMorris

@sparr yeah, that has been suggested before, but IMO it solves the least pertinent problem.

@WafflesMcDuff

Hi,

Sorry to be a thread excavator, but I was wondering if anything has changed in this regard?
Is there now a better way to set up alerting for multiple servers for the same metric without

  • creating a dashboard per-server or
  • getting an alert with many servers in it like the above examples or
  • alerts not triggering when one server in the query is already in the alerting state?

Kind regards,

Josh/Waffles

@sparr

sparr commented Oct 18, 2018

@WafflesMcDuff to be clear, you just need one panel per server, not a whole dashboard.

My solution to this class of problem has been to make my own templating system with a script that uses the API to download a dashboard then reproduce panels/rows with the correct names in them. I use it for environments instead of hosts, but the principle is the same.
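
A rough sketch of that kind of script is below. It is not the actual script: `expand_panels`, the `$env` placeholder and the sample dashboard are hypothetical, and it assumes a simplified dashboard dictionary with a flat `panels` list and Graphite-style `target` strings. Fetching and saving the dashboard through the HTTP API is left out so that only the panel duplication is shown.

```python
import copy


def expand_panels(dashboard, template_title, names, placeholder="$env"):
    """Return a copy of `dashboard` with the template panel repeated per name."""
    result = copy.deepcopy(dashboard)
    panels = []
    for panel in result["panels"]:
        if panel["title"] != template_title:
            panels.append(panel)  # other panels are kept as they are
            continue
        for name in names:
            clone = copy.deepcopy(panel)
            clone["title"] = f"{panel['title']} ({name})"
            for target in clone.get("targets", []):
                if "target" in target:
                    # Hard-code the name where the placeholder used to be.
                    target["target"] = target["target"].replace(placeholder, name)
            panels.append(clone)
    # Renumber so that every panel id is unique within the dashboard.
    for new_id, panel in enumerate(panels, start=1):
        panel["id"] = new_id
    result["panels"] = panels
    return result


dashboard = {
    "title": "CPU",
    "panels": [
        {
            "id": 1,
            "title": "CPU usage",
            "targets": [{"refId": "A", "target": "netdata.$env.system.cpu.system"}],
        }
    ],
}
expanded = expand_panels(dashboard, "CPU usage", ["staging", "prod"])
for panel in expanded["panels"]:
    print(panel["id"], panel["title"], panel["targets"][0]["target"])
```

Because each generated panel contains a fixed name instead of a variable, an alert rule can be attached to it. Renumbering the ids is the simplest way to keep them unique here; a real script would probably want to keep existing ids stable.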

@nmartinez-nimeops

I really don't understand why it's not implemented...
It is the only reason for not using Grafana for some projects.

@toddams

toddams commented Oct 19, 2018

Guys, everyone is waiting for this to be implemented, and for this reason we are all subscribed to this thread to get an email once it is done. But instead we receive silly "+1" messages and questions about milestones. Have some patience please, or otherwise take the situation into your own hands and send a pull request. Thank you.

@nmartinez-nimeops

Lol, patience, of course: this feature was suggested 2 years ago... and the issue has been closed :D

@marefr
Member

marefr commented Oct 19, 2018

@nimeops #6557

@maklaut

maklaut commented Feb 8, 2019

+1

@eegiimgl

Has this feature been added in a newer version?
Several situations are blocked by the lack of template variables in alerts.
For example, suppose you only have a Constant-type variable, which has no alternative values. If you use that constant in your metrics and want alerts, you are blocked. What can I use in this situation instead of a constant variable?

@marrotte

ping

@marefr
Member

marefr commented Jun 25, 2019

Duplicate of #6557

@marefr marefr marked this as a duplicate of #6557 Jun 25, 2019
@marefr
Member

marefr commented Jun 25, 2019

Please continue any discussions regarding support for variables in alerting in #6557.

@oussamaHJM

Templates define a view and thus it doesn't make sense to use them with alerting. But alerting can be set up using the template variable's underlying query, and I think in most cases this is what people are looking for. @bergquist 's comment over on #6557 sums it up well I think:

reusing the selected template variable for alerting would be dangerous since people can choose to view just one option and then forget to change back to All or something wider.
...
One solution for this problem would be to have two values for each template variable. One for visualization in the dashboard and one for alerting.

Something similar to the solution he describes is easy to set up once you know what to do (keep your "view" query using templates, and set up a hidden query using wildcards for the alert):

1. duplicate the query with template variables you wish to alert on

2. replace template variables with wildcards (use the same from your template var query)

3. define an aggregation method for the set of series (eg avg, max, sum)

4. hide the new query from the view

5. set up alerting using your new summary query

Here is an example set up with query "A" as my "display query" and query "B" as the "summary query" I use for alerting (aggregated with max over the series & tested w/ max in the alert tab as well).

image

image

I'm posting this and suggesting a change to the docs/message because 30min ago when I came across this thread I thought I was going to have to create a graph for each of my servers in order to make alerts work. I suspect a lot of the angst above comes from similar misunderstandings.

I'm new to Grafana and having difficulty creating the second query. Any help?
