Any examples of using CMAK with Prometheus + Grafana to monitor consumer lag? #894

Open
rja1 opened this issue Nov 3, 2022 · 8 comments


@rja1

rja1 commented Nov 3, 2022

I'd like to feed this data into Grafana: https://some_server.com/api/status/some_cluster/consumersSummary.

Anyone have experience/examples of doing this with CMAK?

Thanks

@one-two-my-gad

You could try estimating it from the message backlog.
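
As a rough illustration of that backlog estimate: lag for a partition is usually taken as the log-end offset minus the group's committed offset, summed over partitions. The sketch below assumes you already have both sets of offsets as dictionaries keyed by partition; the function names and sample numbers are made up for illustration.

```python
from typing import Dict


def partition_lag(log_end_offset: int, committed_offset: int) -> int:
    """Messages written to the partition but not yet committed by the group."""
    # Clamp at zero: the two offsets are read at slightly different times,
    # so the committed offset can briefly appear ahead of the log end.
    return max(log_end_offset - committed_offset, 0)


def group_lag(log_end: Dict[int, int], committed: Dict[int, int]) -> int:
    """Total backlog for one consumer group on one topic."""
    # Assumption: a partition with no committed offset is counted from 0.
    # A group configured to start from the latest offset would differ.
    return sum(
        partition_lag(end, committed.get(partition, 0))
        for partition, end in log_end.items()
    )


# Partitions 0, 1, 2 are behind by 20, 0 and 50 messages respectively.
print(group_lag({0: 120, 1: 95, 2: 300}, {0: 100, 1: 95, 2: 250}))  # 70
```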

@janengelmohr

Hey @rja1, I may be a little late here, but why don't you just use the JMX Exporter directly on Kafka and feed that into your Prometheus/Grafana?

@rja1
Author

rja1 commented Dec 21, 2022

Good question @engelmohr, but jmx-exporter doesn't provide consumer offset data. I would have just used Kafka-exporter, but it doesn't support SASL_PLAINTEXT, which is what our clusters use for auth. I could have hit the kafka-ui api. It's not ideal though, because you have to make an api call for each consumer. Additionally, it can't connect to one of our legacy 0.10.2.0 clusters (unsupported version).

In the end, I wrote a Python script that hits the CMAK API in a single call and stores the offset data in a MySQL database, which can then be mapped as a data source in Grafana. Works great, but feels a little bit like a hack.
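
For readers who want to try something similar, here is a minimal sketch of that kind of script, not the actual one. Two things are assumptions: the payload shape (`{group: {topic: {"totalLag": N}}}`), which should be checked against what your CMAK version really returns, and the table layout, which is hypothetical. It also swaps in the standard-library `sqlite3` for MySQL so the example is self-contained; a MySQL driver would use the same flow with its own connection call and `%s` placeholders.

```python
import json
import sqlite3
import urllib.request

# Placeholder URL from the post above; replace with a real CMAK host and cluster.
CMAK_URL = "https://some_server.com/api/status/some_cluster/consumersSummary"


def fetch_summary(url: str = CMAK_URL) -> dict:
    """One HTTP call that returns the summary for every consumer group."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def flatten(summary: dict, ts: int) -> list:
    """Turn the (assumed) nested payload into (ts, group, topic, lag) rows."""
    rows = []
    for group, topics in summary.items():
        for topic, stats in topics.items():
            # "totalLag" is an assumed field name; adjust to the real payload.
            rows.append((ts, group, topic, int(stats["totalLag"])))
    return rows


def store(conn: sqlite3.Connection, rows: list) -> None:
    """Append one snapshot of lag rows; Grafana can chart total_lag over ts."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS consumer_lag ("
        "ts INTEGER, consumer_group TEXT, topic TEXT, total_lag INTEGER)"
    )
    conn.executemany("INSERT INTO consumer_lag VALUES (?, ?, ?, ?)", rows)
    conn.commit()
```

Run from cron or a loop, one polling step would look like `store(conn, flatten(fetch_summary(), int(time.time())))` with `conn = sqlite3.connect("consumer_lag.db")`.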

@OneCricketeer

OneCricketeer commented Mar 22, 2023

> Kafka-exporter, but it doesn't support SASL_PLAINTEXT

JMX Exporter on the consumer clients is what you would need for lag. This is independent of any auth settings.

You could also use Burrow to monitor lag for non-JVM clients.

@rja1
Author

rja1 commented Mar 22, 2023

Thanks @OneCricketeer. As I recall, JMX doesn't expose lag data, and Burrow doesn't support SASL_PLAINTEXT.

@OneCricketeer

OneCricketeer commented Mar 22, 2023

Consumer JMX does have lag; not the broker/producer JMX.

I don't use Burrow, but I'd be very surprised if it did not support SASL_PLAINTEXT. It uses Sarama, which does support it, to read directly from the offsets topic.

@rja1
Author

rja1 commented Mar 22, 2023

Gotcha. I actually ended up just writing a Python hack to hit the CMAK API, pull the lag data by group, and persist it to a MySQL backend. I then tied it into Grafana. Works great, but it's a little kludgy. Anyway, thanks for your replies. I'll check out Burrow again for fun.
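
For the Prometheus route the issue title asks about, the same per-group lag could be served in the Prometheus text exposition format instead of going through a database. This is only a sketch under stated assumptions: the metric name `kafka_consumer_group_lag`, the label names, and the `get_rows` callable (anything returning `(group, topic, lag)` tuples, for example from the CMAK API) are all hypothetical.

```python
from http.server import BaseHTTPRequestHandler


def escape_label(value: str) -> str:
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(rows) -> str:
    """Render (group, topic, lag) tuples as one gauge in text exposition format."""
    lines = [
        # Hypothetical metric name, chosen for this example only.
        "# HELP kafka_consumer_group_lag Total lag per consumer group and topic.",
        "# TYPE kafka_consumer_group_lag gauge",
    ]
    for group, topic, lag in rows:
        lines.append(
            'kafka_consumer_group_lag{group="%s",topic="%s"} %d'
            % (escape_label(group), escape_label(topic), lag)
        )
    return "\n".join(lines) + "\n"


def make_handler(get_rows):
    """Build a request handler that re-reads the lag rows on every scrape."""

    class MetricsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = render_metrics(get_rows()).encode("utf-8")
            self.send_response(200)
            self.send_header(
                "Content-Type", "text/plain; version=0.0.4; charset=utf-8"
            )
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return MetricsHandler
```

Starting it would be something like `HTTPServer(("", 9308), make_handler(get_rows)).serve_forever()` (with `HTTPServer` imported from `http.server`; the port is arbitrary), after which Prometheus can scrape the endpoint and Grafana can chart the gauge.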

@OneCricketeer

I took a look at Burrow myself, and it seems there is an open PR for SASL_PLAINTEXT, so you were right.
