Exclude devices/VMs via a config context setting #167

Open
willfurnell opened this issue Mar 4, 2024 · 7 comments

Comments

@willfurnell

willfurnell commented Mar 4, 2024

Hello,
Would it be possible to exclude devices/VMs from being shown in the api/plugins/prometheus-sd/devices/ API endpoint, please?
For context - we're looking at making a low weight config context with something like

{
    "prometheus-plugin-prometheus-sd": {
        "monitor": "false"
    }
}

in it. Then a higher weight one that we can apply to certain roles, sites etc. that has more useful information:

{
    "prometheus-plugin-prometheus-sd": {
        "monitor": "true",
        "metrics_path": "/metrics",
        "port": 9100,
        "scheme": "http"
    }
}

This would then allow us to get just the devices we'd like to monitor, without having to filter by roles in the API and setting the correct config context for roles inside Netbox for instance. Effectively, by only changing config contexts in Netbox your list of devices to monitor is automatically updated without you needing to change any Prometheus config at all!

e.g. you wouldn't need to do api/plugins/prometheus-sd/devices/?role=x&role=y any more because you can do it all via Netbox.
What do you think?
Thanks!
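
A minimal Python sketch of the weighting idea above, for illustration only. It is not the plugin's code: `deep_merge`, `merge_contexts` and `is_monitored` are hypothetical helpers, the weights 100 and 1000 are made up, and the merge is modelled as a deep merge in ascending weight order (so higher-weight contexts win), which is an assumption about how the rendered config context is built.

```python
from typing import Any, Dict, List, Tuple

PLUGIN_KEY = "prometheus-plugin-prometheus-sd"


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which values from `override` win over `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_contexts(contexts: List[Tuple[int, Dict[str, Any]]]) -> Dict[str, Any]:
    """Merge (weight, data) pairs so that higher weights override lower ones."""
    result: Dict[str, Any] = {}
    for _weight, data in sorted(contexts, key=lambda pair: pair[0]):
        result = deep_merge(result, data)
    return result


def is_monitored(rendered: Dict[str, Any]) -> bool:
    """True when the rendered context sets monitor to the string "true", as in the issue."""
    return rendered.get(PLUGIN_KEY, {}).get("monitor") == "true"


# Low-weight default applied to everything (hypothetical weight).
default_ctx = (100, {PLUGIN_KEY: {"monitor": "false"}})
# Higher-weight context applied only to selected roles or sites (hypothetical weight).
server_ctx = (
    1000,
    {PLUGIN_KEY: {"monitor": "true", "metrics_path": "/metrics", "port": 9100, "scheme": "http"}},
)

patch_panel = merge_contexts([default_ctx])
server = merge_contexts([server_ctx, default_ctx])
```

With these inputs, only `server` would pass an `is_monitored` check, so an endpoint that filtered on this flag would list it and skip `patch_panel`.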

@willfurnell
Author

Or alternatively, what about a config option (to retain backwards compatibility) to only show devices at the API endpoint if they have something configured in the prometheus-plugin-prometheus-sd config context (as otherwise we don't want to monitor them)?
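
A rough Python sketch of this opt-in behaviour. The `Device` dataclass, the `visible_devices` function and the `require_config_context` flag are all made up for illustration; nothing in this thread says the plugin has such an option.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

PLUGIN_KEY = "prometheus-plugin-prometheus-sd"


@dataclass
class Device:
    """Stand-in for a device or VM with its rendered config context."""
    name: str
    config_context: Dict[str, Any] = field(default_factory=dict)


def visible_devices(devices: List[Device], require_config_context: bool = False) -> List[Device]:
    """Return every device by default; with the flag set, only those with a non-empty plugin key."""
    if not require_config_context:
        # Backwards-compatible default: behave exactly as before.
        return list(devices)
    return [d for d in devices if d.config_context.get(PLUGIN_KEY)]


devices = [
    Device("pdu-01"),
    Device("web-01", {PLUGIN_KEY: {"port": 9100, "scheme": "http"}}),
]
```

Leaving the flag off by default keeps existing setups unchanged, which is the backwards-compatibility point made above.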

@k0ste
Contributor

k0ste commented Mar 4, 2024

It's very unclear what exactly you want to filter.
The most generic way is to set up Prometheus once, and then SD works for you. For us it looks like this:

http_sd_configs:
  - authorization:
      type: Token
      credentials: <secret>
    refresh_interval: 10m
    url: https://<netbox>/api/plugins/prometheus-sd/devices/?status=active&has_primary_ip=true&tenant__n=null&cf_prometheus_server={{ inventory_hostname }}

@willfurnell
Author

willfurnell commented Mar 5, 2024

Sorry I'll try and be more clear!

So by default, the devices and virtual machines API endpoints return literally all devices/VMs. We do not plan to install node_exporter on every single device and VM - and in fact we can't, because lots of devices are things like appliances, patch panels, switches and power distribution units.

Currently, we could use the Netbox filters as part of the API URL like you show, e.g. https://<netbox>/api/plugins/prometheus-sd/devices/?status=active&has_primary_ip=true&tenant=sometenant&role=somerole which is fine, but it means whenever we want to add a new role or device category to monitor, we'd need to update our http_sd_configs configuration in Prometheus (as we're slowly rolling it out across the estate).
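
To illustrate that maintenance cost, here is a small Python sketch that builds the discovery URL from a list of roles. The helper `sd_url` and the host `netbox.example.com` are hypothetical; the query parameters are the ones used in the URL above.

```python
from typing import List
from urllib.parse import urlencode


def sd_url(base: str, tenant: str, roles: List[str]) -> str:
    """Build the devices discovery URL with one `role` parameter per monitored role."""
    params = [("status", "active"), ("has_primary_ip", "true"), ("tenant", tenant)]
    params += [("role", role) for role in roles]
    return f"{base}/api/plugins/prometheus-sd/devices/?{urlencode(params)}"


before = sd_url("https://netbox.example.com", "sometenant", ["somerole"])
# Monitoring one more role changes the URL, and so the Prometheus configuration.
after = sd_url("https://netbox.example.com", "sometenant", ["somerole", "otherrole"])
```

Every new role produces a different URL, so each rollout step means editing `http_sd_configs` again.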

We already need to use the config context in Netbox to define the port, scheme and metrics URL - like so:
Screenshot 2024-03-05 at 08 10 08
and the cool thing about config contexts is that we can conditionally apply them to things based on a load of factors, such as Roles as seen there, but also sites, tenants etc. - so we can make multiple config contexts with really specific (or not so specific) apply rules.

What would be really useful is if we could configure the API endpoint to return only the devices and VMs that have the config context applied to them (with no other query parameters, i.e. just https://<netbox>/api/plugins/prometheus-sd/devices/). The data returned would then be just the devices we want to monitor. If we want to change which devices are monitored, we only need to change the config context(s) in Netbox, without touching the Prometheus configuration at all, because the URL used in Prometheus never changes. Effectively, we do all the filtering on the Netbox side!

@k0ste
Contributor

k0ste commented Mar 5, 2024

For filtering like you describe we use service templates.
Screenshot 2024-03-05 at 11 20 50
So an administrator who wants to scrape, for example, an SSL certificate or an S3 bucket configures a service on the device or virtual machine, plus a custom field that tells the exporter what to probe (here, the bucket). This is, so to speak, the classic way to configure Prometheus per service:

- job_name: s3_exporter
  metrics_path: /probe
  relabel_configs:
  - separator: ;
    regex: __address__
    replacement: nb_fqdn
    action: labelmap
  - source_labels: [__meta_netbox_services]
    separator: ;
    regex: (.*)(s3_exporter)(.*)
    replacement: $2
    action: keep              #<------------ keep targets where this service is assigned
  - source_labels: [__meta_netbox_custom_field_s3_exporter_bucket]
    separator: ;
    regex: (.+)
    replacement: $1
    action: keep              #<------------ keep targets where bucket is defined (non-empty)
  - source_labels: [__meta_netbox_custom_field_s3_exporter_bucket]
    separator: ;
    regex: (.*)
    target_label: __param_bucket
    replacement: $1
    action: replace
  - source_labels: [__meta_netbox_primary_ip]
    separator: ;
    regex: (.*)
    target_label: __address__
    replacement: $1:9340
    action: replace
  - source_labels: [__meta_netbox_tenant_slug]
    separator: ;
    regex: (.*)
    target_label: tenant
    replacement: $1
    action: replace
  - source_labels: [__meta_netbox_role_slug]
    separator: ;
    regex: (.*)
    target_label: role
    replacement: $1
    action: replace
  - source_labels: [__meta_netbox_custom_field_environment]
    separator: ;
    regex: (.*)
    target_label: environment
    replacement: $1
    action: replace
  http_sd_configs: #<---------------- support devices & virtual-machines
  - authorization:
      type: Token
      credentials: <secret>
    refresh_interval: 10m
    url: https://<netbox>/api/plugins/prometheus-sd/devices/?status=active&has_primary_ip=true&tenant__n=null&cf_prometheus_server={{ inventory_hostname }}
  - authorization:
      type: Token
      credentials: <secret>
    refresh_interval: 10m
    url: https://<netbox>/api/plugins/prometheus-sd/virtual-machines/?status=active&has_primary_ip=true&tenant__n=null&cf_prometheus_server={{ inventory_hostname }}

As I understand it, you want to move the exporter settings, such as port and metrics_path, to the NetBox side. It seems to me that in this case the flexibility of the keep/drop actions and of filling the __param_* labels would be lost.
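
The `keep` steps in the job above can be approximated in Python as follows. This is a simplified model, not Prometheus itself: it assumes relabel regexes are fully anchored and that a missing label reads as an empty string, and it uses Python's `re` in place of RE2. The target names and label values are invented.

```python
import re
from typing import Dict, List

Target = Dict[str, str]


def keep(targets: List[Target], source_labels: List[str], regex: str, separator: str = ";") -> List[Target]:
    """Mimic a relabel `keep`: retain targets whose joined source label values fully match `regex`."""
    pattern = re.compile(regex)
    kept = []
    for labels in targets:
        # A missing label is treated as an empty string.
        value = separator.join(labels.get(name, "") for name in source_labels)
        if pattern.fullmatch(value):
            kept.append(labels)
    return kept


targets = [
    {"name": "a", "__meta_netbox_services": "node_exporter,s3_exporter",
     "__meta_netbox_custom_field_s3_exporter_bucket": "backups"},
    {"name": "b", "__meta_netbox_services": "node_exporter"},
    {"name": "c", "__meta_netbox_services": "s3_exporter"},
]

bucket_label = "__meta_netbox_custom_field_s3_exporter_bucket"
# Step 1: keep targets that have the s3_exporter service assigned.
with_service = keep(targets, ["__meta_netbox_services"], r"(.*)(s3_exporter)(.*)")
# Step 2: keep targets with a non-empty bucket field.
with_bucket = keep(with_service, [bucket_label], r"(.+)")
# For comparison: an anchored (.*) also matches the empty string.
any_bucket = keep(with_service, [bucket_label], r"(.*)")
```

In this model `(.*)` keeps target `c` even though it has no bucket, whereas `(.+)` filters it out, so a rule meant to require a defined field needs the latter.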

@willfurnell
Author

Does this stop the devices from being scraped in the first place? That's what I'm mainly trying to avoid, as at the moment my scrape target stats look like this, and that's just for one tenant!
Screenshot 2024-03-05 at 09 56 40

@k0ste
Contributor

k0ste commented Mar 5, 2024

The keep/drop actions include or exclude a target from the scrape targets for a given service. This means that if your default service is node_exporter (a generic exporter) and you want to set up an additional exporter such as ssl_exporter, the generic exporter will always scrape all your targets, while the additional exporter will only be added to targets where it is set up properly.

@willfurnell
Author

Okay this may be a bit of a hack - but I've figured out what I want to do...
I've added this in my Prometheus config under relabel_configs

  - source_labels:
    - __address__
    regex: "(.*):999999999"
    action: drop

I've then added an additional config context in Netbox with the following content, applied to everything and with priority 1.
image
This means that by default all devices get port 999999999 assigned.
I then have a config context with a higher priority that is applied to only the roles etc. that I actually want to monitor:
image

Then my Prometheus config means that all roles that I don't want to monitor will not be scraped. Importantly, in the future if I want to change the roles that I scrape, I don't have to change my prometheus configuration at all - I can do it all within Netbox - because the http_sd_configs URL never changes from https://netbox/api/plugins/prometheus-sd/devices/?tenant=xyz&has_primary_ip=true!
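
The effect of the `drop` rule can be sketched in Python. This assumes, as the comment above implies, that the port from the config context ends up in `__address__` before the rule runs; the hostnames are invented, and Python's `re.fullmatch` stands in for Prometheus's anchored regex matching.

```python
import re
from typing import Dict, List

# Same pattern as in the relabel rule: any host followed by the sentinel port.
SENTINEL_REGEX = r"(.*):999999999"


def drop(targets: List[Dict[str, str]], regex: str, label: str = "__address__") -> List[Dict[str, str]]:
    """Mimic a relabel `drop`: remove targets whose label value fully matches `regex`."""
    pattern = re.compile(regex)
    return [t for t in targets if not pattern.fullmatch(t.get(label, ""))]


targets = [
    {"__address__": "patch-panel-01:999999999"},  # only the default context applies
    {"__address__": "web-01:9100"},               # overridden by the higher-priority context
]

scraped = drop(targets, SENTINEL_REGEX)
```

Only targets whose port was overridden by the higher-priority context survive, so changing which roles receive that context changes what is scraped without touching the Prometheus configuration.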

2 participants