Incorporate tokio-metrics. #57

Open
zamu-flowerpot opened this issue Mar 29, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@zamu-flowerpot

Feature Request: Incorporate tokio-metrics into the exported metrics.

Specifically, exposing RuntimeMetrics automatically seems like low-hanging fruit that most people monitoring their stack would want.

Other runtimes already do this: most Go Prometheus exporters export runtime metrics, and Python exporters at least export memory usage.

@emschwartz
Contributor

Hi @zamu-flowerpot, thanks for the feature request!

This is an interesting idea. I agree that automatically exporting some runtime metrics, or at least having the option to do so with a feature flag, sounds useful.

The main challenge I see is figuring out the naming of the metrics, since there doesn't seem to be a standard way of translating those metrics to Prometheus metrics. Are there any specific metrics you're especially interested in? We could potentially start by adding some subset that you and maybe others would find useful.

@emschwartz emschwartz added the enhancement New feature or request label Mar 29, 2023
@zamu-flowerpot
Author

Hi @emschwartz!

Looking at my Prometheus instance, the Go exporters expose over 30 runtime metrics and simply namespace them, for example go_gc_duration_seconds and go_sched_latencies_seconds_bucket.

I could see something similar for the Tokio runtime metrics, since they are already in a similar format. We would just need to prefix them with tokio_ or tokio_rt_. If you take a look at tokio_metrics::RuntimeMetrics and expand the struct's members, you can see that all of them are already 90% of the way there. Each name seems to translate pretty well to Prometheus types (for the most part, the _duration fields are counters and everything else is a gauge).

I haven't spent much time delving into autometrics beyond using it, so I can't really comment on how best to integrate it.

As for which of the RuntimeMetrics fields to export, all of them would be nice 🤣, but feel free to select whichever you think would be the best/easiest to test. I don't have a pressing need for this in dev, as I can just spin up tokio-console, but it would be nice to collect them when things are deployed beyond dev.
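The naming rule described above could be sketched roughly as follows. This is only an illustration: `MetricKind` and `prometheus_metric` are hypothetical names, not part of autometrics or tokio-metrics, and the field names are taken from `tokio_metrics::RuntimeMetrics` as examples.

```rust
/// Prometheus metric kinds used in this sketch.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum MetricKind {
    Counter,
    Gauge,
}

/// Hypothetical helper: derive a Prometheus name and kind from a
/// `RuntimeMetrics` field name, using the rule of thumb from this comment.
fn prometheus_metric(prefix: &str, field: &str) -> (String, MetricKind) {
    // Fields ending in `_duration` are treated as counters;
    // everything else is treated as a gauge.
    let kind = if field.ends_with("_duration") {
        MetricKind::Counter
    } else {
        MetricKind::Gauge
    };
    (format!("{prefix}{field}"), kind)
}

fn main() {
    for field in ["workers_count", "total_park_count", "total_busy_duration"] {
        let (name, kind) = prometheus_metric("tokio_rt_", field);
        println!("{name}: {kind:?}");
    }
}
```

A real integration would probably also need to decide on units (Prometheus naming conventions favor a suffix such as _seconds), which this sketch leaves out.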

@emschwartz
Contributor

Makes sense.

I'm thinking we could:

  • Add a feature flag tokio-metrics that would enable this behavior
  • The metrics are automatically tracked and exported along with the function-level metrics autometrics produces, using whichever underlying metrics crate you're using
  • To get a bit of the autometrics magic where we write queries for you, we could potentially add a dummy macro or function that you could hover over that would include links to queries for the tokio runtime metrics. (Not entirely sure what the best approach here would be but we'd want to pick up the PROMETHEUS_URL environment variable to construct the right links to the graphs)
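As a rough sketch of the last point, a link to a runtime-metric graph could be built from PROMETHEUS_URL along these lines. The helper names `percent_encode` and `graph_link`, the fallback address, and the example query are all assumptions for illustration; the `/graph?g0.expr=...` path is assumed to match the Prometheus web UI's graph page.

```rust
use std::env;

/// Percent-encode everything except RFC 3986 unreserved characters.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{byte:02X}")),
        }
    }
    out
}

/// Hypothetical helper: build a link to the Prometheus graph page for `query`.
fn graph_link(base_url: &str, query: &str) -> String {
    format!(
        "{}/graph?g0.expr={}&g0.tab=0",
        base_url.trim_end_matches('/'),
        percent_encode(query)
    )
}

fn main() {
    // Assumed fallback: a local Prometheus on its default port.
    let base = env::var("PROMETHEUS_URL")
        .unwrap_or_else(|_| "http://localhost:9090".to_string());
    // Example query over a hypothetical `tokio_rt_`-prefixed metric.
    println!("{}", graph_link(&base, "rate(tokio_rt_total_park_count[5m])"));
}
```

Links like these could then be embedded in the doc comment of whatever macro or function ends up carrying the tooltip.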

@zamu-flowerpot
Author

> To get a bit of the autometrics magic where we write queries for you, we could potentially add a dummy macro or function that you could hover over that would include links to queries for the tokio runtime metrics. (Not entirely sure what the best approach here would be but we'd want to pick up the PROMETHEUS_URL environment variable to construct the right links to the graphs)

Would it be possible to have it rewrite the autometrics::global_metrics_exporter tooltip? As far as I am aware, it's only used with the prometheus-exporter feature flag, so it seems like a good place to reference the tokio-metrics. Alternatively, there could be a function that customizes the tokio-metrics prefix and/or the labels attached to the metrics, and the tooltip could live there instead.

@gagbo
Member

gagbo commented Mar 31, 2023

The first part (accessing the Tokio metrics in Prometheus) could be implemented as an extension to the rust-prometheus client library we use. To avoid adding the maintenance burden of unstable features to that project, I made a small collector library here: https://github.com/gagbo/rust-prometheus-tokio. I haven't fully tested it yet, but I'm optimistic about being able to create a small standalone example in that crate and fix the remaining issues over the next week. At the very least, a pass over the names is going to be necessary.

Similar integrations could be added for OTel metrics by us, the OTel project, or the Tokio project. Let's focus on Prometheus for this phase.

The second part (the autometrics integration) is a little trickier. I haven't had time to look too much into the Rust implementation, but the first thing is to find where "global" metric info, such as the Tokio runtime metrics, should be written.

The second trick is detecting when there is a Tokio runtime to attach to, so that the metrics collector can be added to the registry autometrics uses.

The third trick is finding a way to communicate to users that (for now) the tokio_unstable rustflag is necessary. This would probably be a compile_error macro that fires when the autometrics/tokio feature is enabled without the tokio_unstable cfg, though I'm not 100% sure yet.
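A minimal sketch of that compile-time check, assuming the feature ends up being called `tokio-metrics` (the name is not settled in this thread) and that `tokio_unstable` is set through RUSTFLAGS; `tokio_unstable_enabled` is a hypothetical helper added only to make the example runnable:

```rust
// Hypothetical feature name `tokio-metrics`. The error fires only when the
// feature is enabled and the build was not given `--cfg tokio_unstable`.
#[cfg(all(feature = "tokio-metrics", not(tokio_unstable)))]
compile_error!(
    "the `tokio-metrics` feature requires building with RUSTFLAGS=\"--cfg tokio_unstable\""
);

/// Reports whether this build was compiled with `--cfg tokio_unstable`.
fn tokio_unstable_enabled() -> bool {
    cfg!(tokio_unstable)
}

fn main() {
    println!("tokio_unstable enabled: {}", tokio_unstable_enabled());
}
```

Compiled on its own, with neither the feature nor the cfg set, this builds without error; recent compilers may warn about the unknown cfg names.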
