Logs enricher when use prometheus does not show the graph for node utilisation #1168

antikilahdjs · 2023-11-14T20:27:58Z

Describe the bug

Hey, I started using Robusta a few days ago with Prometheus and Alertmanager, but the Prometheus enricher graphs (memory or CPU) come up empty when the alert is sent to Teams. Only the memory usage graph appears; the node utilisation graph is not shown.

The image below will demonstrate it

To Reproduce
Steps to reproduce the behavior:
1 - Install using the official Helm charts
2 - Configure Prometheus and Alertmanager
3 - Configure the sink to use Teams
4 - My globalConfig is set as below

globalConfig:
  grafana_url: ""
  grafana_api_key: ""
  grafana_dashboard_uid: ""
  alertmanager_url: "http://alertmanager-operated.thanos:9093"
  prometheus_url: "http://thanos-query-frontend.thanos:9090"
  signing_key: ""
  account_id: 695c3053-0e56-xxxxxxxxxxxxxxxxxxxxxx
  custom_annotations: []

5 - My trigger and actions are:

- triggers:
  - on_pod_oom_killed:
      rate_limit: 3600
  actions:
  - pod_oom_killer_enricher: {}
  - logs_enricher: {}
  - pod_node_graph_enricher:
      resource_type: Memory
      display_limits: true
  - oomkilled_container_graph_enricher:
      resource_type: Memory
      display_limits: true
  stop: true

Expected behavior

The graph works for both node utilisation and pod utilisation.

Screenshots
It was added above

Desktop (please complete the following information):

OS: RedHat 8.5 and Ubuntu 20.04 LTS
Browser: Chrome
Version: 119


saireddyb · 2023-11-24T04:23:22Z

Yes, I too receive an empty node graph.

wrbbz · 2024-05-16T08:49:19Z

Same here on Robusta 0.12.0 without UI integration

Bobses · 2024-05-16T12:24:32Z

> Same here on Robusta 0.12.0 without UI integration

Same here.

arikalon1 · 2024-05-16T15:53:59Z

@Bobses @wrbbz do you see any exceptions in the robusta-runner pod logs?

aantn · 2024-05-17T11:21:32Z

Hi all, I believe this is because robusta is using the recording rule instance:node_memory_utilisation:ratio which isn't present in your environment.

If that is the case, we should be able to fix this by replacing instance:node_memory_utilisation:ratio with its definition, or possibly just with
container_memory_working_set_bytes{node="${node_name}", container!=""}

aantn · 2024-05-17T11:22:45Z

To help us get to the bottom of this, can each of you please verify that the metric instance:node_memory_utilisation:ratio is in fact missing from your environment.
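One way to run that check from a script is to send an instant query to the Prometheus HTTP API (`/api/v1/query`) and see whether any series come back. The following is a minimal sketch, not part of Robusta; the helper names `build_query_url`, `has_series`, and `metric_exists` are made up for illustration.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(prometheus_url: str, query: str) -> str:
    """Build the instant-query URL for a Prometheus-compatible endpoint."""
    params = urllib.parse.urlencode({"query": query})
    return prometheus_url.rstrip("/") + "/api/v1/query?" + params


def has_series(response: dict) -> bool:
    """Return True if a decoded /api/v1/query response contains any series."""
    if response.get("status") != "success":
        return False
    return len(response.get("data", {}).get("result", [])) > 0


def metric_exists(prometheus_url: str, metric: str, timeout: float = 10.0) -> bool:
    """Query the endpoint and report whether the metric currently has data."""
    url = build_query_url(prometheus_url, metric)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return has_series(json.load(resp))
```

For example, with the `prometheus_url` from the original report, `metric_exists("http://thanos-query-frontend.thanos:9090", "instance:node_memory_utilisation:ratio")` returning `False` would mean the recording rule is missing. Typing the metric name into the Prometheus or Thanos query UI gives the same answer.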

wrbbz · 2024-05-17T15:13:11Z

Yeah. I can confirm that we do not have instance:node_memory_utilisation:ratio. Only container_memory_working_set_bytes

Bobses · 2024-05-20T09:02:55Z

I confirm that we don't have that record.

So, I'll add the following record:

record: instance:node_memory_utilisation:ratio
expr: 1 - ((node_memory_MemAvailable_bytes{job="node-exporter"} or (node_memory_Buffers_bytes{job="node-exporter"} + node_memory_Cached_bytes{job="node-exporter"} + node_memory_MemFree_bytes{job="node-exporter"} + node_memory_Slab_bytes{job="node-exporter"})) / node_memory_MemTotal_bytes{job="node-exporter"})

Thank you!
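For anyone checking the numbers, here is a rough sketch of the arithmetic this rule is meant to perform for a single node: one minus the ratio of available memory to total memory. It assumes the `or` fallback is evaluated before the division (in PromQL, `/` binds tighter than `or`, so that grouping needs explicit parentheses). The function name and arguments are illustrative only.

```python
from typing import Optional


def node_memory_utilisation(
    mem_total: float,
    mem_available: Optional[float] = None,
    buffers: float = 0.0,
    cached: float = 0.0,
    mem_free: float = 0.0,
    slab: float = 0.0,
) -> float:
    """Mirror the recording rule for one node, with all values in bytes."""
    # PromQL `or`: use MemAvailable when that series exists,
    # otherwise fall back to Buffers + Cached + MemFree + Slab.
    if mem_available is not None:
        available = mem_available
    else:
        available = buffers + cached + mem_free + slab
    # Utilisation is the share of total memory that is NOT available.
    return 1 - available / mem_total
```

A node with 16 GiB total and 4 GiB available gives `node_memory_utilisation(16 * 2**30, 4 * 2**30)`, which is `0.75`.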

aantn · 2024-05-20T09:04:14Z

Yep, that will fix the problem. (Please confirm!)

I think we should also change this on our side to query the expr instead and not rely on that recording rule.
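As a rough illustration of what querying the expression directly could look like, the sketch below builds the inline PromQL string from a label selector. It is not taken from the Robusta code base: the function name, the default `job="node-exporter"` selector, and the way a node would be selected are all assumptions. The `or` fallback is grouped in parentheses so that the division applies to whichever operand is present.

```python
def node_memory_utilisation_expr(selector: str = 'job="node-exporter"') -> str:
    """Return the recording rule's definition as an inline PromQL expression.

    `selector` is the label matcher list placed inside the braces of every
    node-exporter metric (hypothetical parameter, for illustration only).
    """
    s = "{" + selector + "}"
    # Fallback for hosts that do not expose node_memory_MemAvailable_bytes.
    fallback = " + ".join(
        f"node_memory_{name}_bytes{s}"
        for name in ("Buffers", "Cached", "MemFree", "Slab")
    )
    return (
        f"1 - ((node_memory_MemAvailable_bytes{s} or ({fallback}))"
        f" / node_memory_MemTotal_bytes{s})"
    )
```

Calling it with an extra matcher, for example `node_memory_utilisation_expr('job="node-exporter", instance="10.0.0.1:9100"')`, would restrict the query to one node. Which label actually identifies the node depends on the scrape configuration.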

wrbbz · 2024-05-21T10:37:11Z

I've created a PR that uses the rule definitions instead of the recording rules themselves.

Also, adding the record to the Prometheus instance solved the No Data error.

pavangudiwada added the needs-triage label Nov 15, 2023

wrbbz mentioned this issue May 21, 2024

Usage of records definition instead of records itself #1432

Open
