Prometheus metrics support #375

Open
mschirrmeister opened this issue Apr 14, 2024 · 7 comments · May be fixed by #386

Comments

@mschirrmeister

Hello,

I was looking at #84 which is for the metrics implementation.

Is there an example somewhere of how to pull this into Grafana? Typically, I have Prometheus scraping the metrics endpoints, and Grafana pulls its data from Prometheus.
But because of the format, Prometheus cannot read the RouteDNS metrics. What was the original idea for how to consume the metrics?

Any inputs?

There is a json_exporter for Prometheus, but I am not sure it will be of any help given the way the metrics and their JSON format are exposed.

@folbricht
Owner

I'm not actually collecting the expvar metrics myself so can't tell you for sure how to do it, but I found https://github.com/albertito/prometheus-expvar-exporter which may help.

@mschirrmeister
Author

It looked promising, but unfortunately it does not work. After adding a few lines of code so that it can connect over TLS, I got the following response.

2024/04/14 19:50:51 Collecting "https://192.168.2.199:8443/routedns/vars"
2024/04/14 19:50:51 Listening on ":8000"
panic: "routedns.listener.local-tcp.maxqueue" is not a valid metric name

goroutine 70 [running]:
github.com/prometheus/client_golang/prometheus.MustNewConstMetric(...)
	/Users/marco/go/pkg/mod/github.com/prometheus/client_golang@v1.14.0/prometheus/value.go:130
main.(*Collector).Collect(0xc0002645d0, 0xc000252e40)
	/Users/marco/MyData/git/public/prometheus-expvar-exporter/main.go:151 +0xab1
github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func1()
	/Users/marco/go/pkg/mod/github.com/prometheus/client_golang@v1.14.0/prometheus/registry.go:458 +0xe5
created by github.com/prometheus/client_golang/prometheus.(*Registry).Gather in goroutine 66
	/Users/marco/go/pkg/mod/github.com/prometheus/client_golang@v1.14.0/prometheus/registry.go:548 +0xbab

The name reported in the "is not a valid metric name" error varied depending on what I enabled or disabled in the config. I will try to open an issue on the project. I hope it is not abandoned.

@mschirrmeister
Author

I got the prometheus-expvar-exporter somewhat working. But since Prometheus itself has a data model that only allows certain characters in metric names, you cannot have a dash (-) in your config names in RouteDNS.

For example, you have to write [listeners.local_udp] instead of [listeners.local-udp]. That also means you cannot have a [bootstrap-resolver] in your config at all, since that section has to have exactly this name.
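
To illustrate why the exporter rejects these names, here is a minimal sketch that checks a name against the classic Prometheus metric-name pattern `[a-zA-Z_:][a-zA-Z0-9_:]*` and rewrites disallowed characters. The helper names `is_valid_metric_name` and `sanitize_metric_name` are hypothetical and are not part of RouteDNS or the exporter; replacing every disallowed character with `_` is just one possible policy.

```python
import re

# Classic Prometheus metric-name pattern from the data model documentation.
VALID_NAME = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def is_valid_metric_name(name):
    """Return True if the whole name matches the classic pattern."""
    return VALID_NAME.fullmatch(name) is not None


def sanitize_metric_name(name):
    """Hypothetical helper: replace every disallowed character with '_'."""
    cleaned = re.sub(r"[^a-zA-Z0-9_:]", "_", name)
    # A metric name must not start with a digit.
    if cleaned and cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


if __name__ == "__main__":
    raw = "routedns.listener.local-tcp.maxqueue"  # name from the panic above
    print(is_valid_metric_name(raw))
    print(sanitize_metric_name(raw))
```

Note that the dots in the expvar names are just as invalid as the dashes, which is why each metric needs an explicit mapping to an underscore-only name.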

For everything else, such as resolvers and listeners, you need to add lines like these to the exporter config.

m.routedns_cache_cache_entries.expvar = "routedns.cache.cache.entries"
m.routedns_cache_cache_hit.expvar = "routedns.cache.cache.hit"
m.routedns_cache_cache_miss.expvar = "routedns.cache.cache.miss"

m.routedns_client_cloudflare_1111_doh_drop.expvar = "routedns.client.cloudflare_1111_doh.drop"
m.routedns_client_cloudflare_1111_doh_maxqueue.expvar = "routedns.client.cloudflare_1111_doh.maxqueue"
m.routedns_client_cloudflare_1111_doh_query.expvar = "routedns.client.cloudflare_1111_doh.query"

m.routedns_listener_local_tcp_drop.expvar = "routedns.listener.local_tcp.drop"
m.routedns_listener_local_tcp_maxqueue.expvar = "routedns.listener.local_tcp.maxqueue"
m.routedns_listener_local_tcp_query.expvar = "routedns.listener.local_tcp.query"
m.routedns_listener_local_udp_drop.expvar = "routedns.listener.local_udp.drop"
m.routedns_listener_local_udp_maxqueue.expvar = "routedns.listener.local_udp.maxqueue"
m.routedns_listener_local_udp_query.expvar = "routedns.listener.local_udp.query"

But,

  • It only supports boolean and float values
  • You could also define a HELP value
  • There is no way to define a TYPE, so it will show up as unknown in Prometheus. Maybe not a big problem
  • RouteDNS config names should only include characters that Prometheus supports
  • With every RouteDNS config change you would have to modify the exporter config, which is very tedious and, I think, almost unmanageable
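
One way to reduce the manual work might be to generate the mapping lines from the expvar output instead of writing them by hand. The sketch below assumes the JSON from the vars endpoint has already been fetched and parsed into a dict; the function name `exporter_mappings` and the sample values are made up for illustration, while the `m.<name>.expvar = "<expvar>"` line format is the one shown above. Only scalar values are handled, matching the boolean/float limitation.

```python
import re


def sanitize(name):
    """Replace characters Prometheus does not allow in metric names."""
    return re.sub(r"[^a-zA-Z0-9_:]", "_", name)


def exporter_mappings(expvars, prefix="routedns."):
    """Hypothetical helper: build one mapping line per scalar expvar.

    expvars is the parsed JSON object from the vars endpoint.
    Map-valued and non-RouteDNS entries are skipped.
    """
    lines = []
    for name in sorted(expvars):
        if not name.startswith(prefix):
            continue  # e.g. Go's default "cmdline" and "memstats" entries
        if isinstance(expvars[name], (bool, int, float)):
            lines.append('m.{}.expvar = "{}"'.format(sanitize(name), name))
    return lines


if __name__ == "__main__":
    # Illustrative sample data, not real RouteDNS output.
    sample = {
        "cmdline": ["routedns", "config.toml"],
        "routedns.cache.cache.hit": 10,
        "routedns.listener.local_udp.query": 5,
        "routedns.listener.local_udp.response": {"NOERROR": 3},
    }
    for line in exporter_mappings(sample):
        print(line)
```

The output would still have to be regenerated after every RouteDNS config change, so this only eases the tedium rather than removing it.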

Native Prometheus support would be nice, where the metrics could look like this, for example.

routedns_cache_entry_count 846
routedns_cache_hit_count 238020
routedns_cache_miss_count 42801

routedns_query_total{resolver="cloudflare-1111-doh",type="A"} 3
routedns_query_total{resolver="cloudflare-1111-doh",type="AAAA"} 3

routedns_response_total{reason="RESOLVED (cloudflare-1111-doh)",response_code="NOERROR",response_type="RESOLVED"} 3
routedns_response_total{reason="RESOLVED (cloudflare-1111-doh)",response_code="NXDOMAIN",response_type="RESOLVED"} 1

I will shelve RouteDNS monitoring for now, since I have not found anything else related to expvar that suits my needs.
Feel free to close the issue.

Do you have plans to add native Prometheus support?

@folbricht
Owner

While I don't have any concrete plans to add Prometheus support, it shouldn't be too hard to do. Could replace the current expvar bits with something more generic which then allows the config to specify which format should be used. The - vs _ naming issue is easy to deal with.
Let's keep this issue open for now, perhaps I'll find the time to add this in since it could be useful to a lot of people.

@folbricht changed the title from "Metrics format" to "Prometheus metrics support" on Apr 18, 2024
@mschirrmeister
Author

I am pretty sure it would be useful, especially if RouteDNS could expose Prometheus metrics directly.
Right now RouteDNS has a lot of good metrics, but I have no insight because I cannot graph them.

In my case I have a bunch of DoH, DoH3, DoQ and DoT resolvers and a group with type fastest that references all of them.

With this config, and with a metric for response time available, it would be possible to see how each server performs over a longer period of time.

@mschirrmeister
Author

Alberto from https://github.com/albertito/prometheus-expvar-exporter has implemented two fixes: one that converts - to _, and an option to skip TLS certificate verification.
Since the RouteDNS admin listener runs on a secure channel, this option comes in handy when RouteDNS runs in a dev environment without an official certificate.

Here is an example of a working config. Every mapping commented out with a # is converted automatically. You must specify the expvar metrics that are maps.
You can of course define the others as well if you want different names in Prometheus.

# Address to listen on. Prometheus should be told to scrape this.
listen_addr = ":8000"

# Expvar target
# RouteDNS running on Kubernetes
[server]
url = "routedns.routedns.svc.cluster.local/routedns/vars"
insecure = true


## Notes
# Everything with a "#" gets automatically converted.
#
# You must specify the expvar metrics that are a map.
# For expvar lines like this:
# "routedns.listener.local-udp.response": {"NOERROR": 776, "NXDOMAIN": 19}
#
# The result will look like this:
# HELP routedns_listener_local_udp_response Various DNS response codes
# TYPE routedns_listener_local_udp_response untyped
# routedns_listener_local_udp_response{RCODE="NOERROR"} 776
# routedns_listener_local_udp_response{RCODE="NXDOMAIN"} 19

## Metrics mappings

# m.routedns_client_bootstrap_resolver_drop.expvar = "routedns.client.bootstrap-resolver.drop"
m.routedns_client_bootstrap_resolver_error.expvar = "routedns.client.bootstrap-resolver.error"
m.routedns_client_bootstrap_resolver_error.help = "Amount of dropped requests"
m.routedns_client_bootstrap_resolver_error.label_name = "REASON"
# m.routedns_client_bootstrap_resolver_maxqueue.expvar = "routedns.client.bootstrap-resolver.maxqueue"
# m.routedns_client_bootstrap_resolver_query.expvar = "routedns.client.bootstrap-resolver.query"
m.routedns_client_bootstrap_resolver_response.expvar = "routedns.client.bootstrap-resolver.response"
m.routedns_client_bootstrap_resolver_response.label_name = "RCODE"

# m.routedns_cache_cache_entries.expvar = "routedns.cache.cache.entries"
# m.routedns_cache_cache_entries.help = "routedns cache entries"
# m.routedns_cache_cache_hit.expvar = "routedns.cache.cache.hit"
# m.routedns_cache_cache_miss.expvar = "routedns.cache.cache.miss"

# m.routedns_listener_local_udp_drop.expvar = "routedns.listener.local-udp.drop"
m.routedns_listener_local_udp_error.expvar = "routedns.listener.local-udp.error"
m.routedns_listener_local_udp_error.label_name = "REASON"
# m.routedns_listener_local_udp_maxqueue.expvar = "routedns.listener.local-udp.maxqueue"
# m.routedns_listener_local_udp_query.expvar = "routedns.listener.local-udp.query"
m.routedns_listener_local_udp_response.expvar = "routedns.listener.local-udp.response"
m.routedns_listener_local_udp_response.help = "Various DNS response codes"
m.routedns_listener_local_udp_response.label_name = "RCODE"

A full example is also available here for reference. https://github.com/mschirrmeister/prometheus-expvar-exporter-docker/tree/main/examples
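
To make the map handling described in the config notes more concrete, the following sketch renders a map-valued expvar entry as labeled lines in the Prometheus text format, using the same example values as the notes. It is not the exporter's actual implementation; `render_map_metric` and `escape_label_value` are hypothetical names, and sorting the label values is an arbitrary choice made here to keep the output stable.

```python
import re


def sanitize(name):
    """Replace characters Prometheus does not allow in metric names."""
    return re.sub(r"[^a-zA-Z0-9_:]", "_", name)


def escape_label_value(value):
    """Escape backslash, double quote and newline for the text format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_map_metric(name, values, label_name, help_text=""):
    """Hypothetical helper: one sample line per key of a map-valued expvar."""
    metric = sanitize(name)
    lines = []
    if help_text:
        lines.append("# HELP {} {}".format(metric, help_text))
    lines.append("# TYPE {} untyped".format(metric))
    for key in sorted(values):
        label = escape_label_value(key)
        lines.append('{}{{{}="{}"}} {}'.format(metric, label_name, label, values[key]))
    return lines


if __name__ == "__main__":
    rendered = render_map_metric(
        "routedns.listener.local-udp.response",
        {"NOERROR": 776, "NXDOMAIN": 19},
        label_name="RCODE",
        help_text="Various DNS response codes",
    )
    print("\n".join(rendered))
```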

@folbricht linked a pull request on May 4, 2024 that will close this issue
@folbricht
Owner

Used your example and updated docs in #386
