Align connects metrics to prevent no data scenarios #670
Comments
I think adding attributes technically changes things? I don't know how monitoring otel works. Does this same problem exist for write_errs and read_errs? I think rather than change anything by default, another option could be introduced:
// WithMergedConnectsMeter merges the `messaging.kafka.connect_errors.count`
// meter into the `messaging.kafka.connects.count` meter, adding an attribute
// "outcome" with the values "success" or "failure". This option may be used because
// ...
func WithMergedConnectsMeter() MeterOpt
I'm not sure I understand, but let me try to clarify: a metric can have multiple attributes and would still be considered a single metric (source: OpenTelemetry Protocol data model).
I would say no because:
That would be fine for me, and I can submit a PR so we can discuss the implementation. Let me know how you would prefer to move forward, and thank you for looking into this.
I'm not a big fan of v2, especially for only one small change. +1 to a PR
Hello, while using this library and the `kotel` plugin to monitor some consumers of Kafka clusters, we noticed a pattern in the connection metrics that creates some minor issues we'd like to address.
We have an alert that monitors the value of `messaging.kafka.connect_errors.count` to inform us when the number is out of the ordinary. We also want to alert when this metric stops being sent by the consumers, as that means we have a blind spot in our monitoring that we need to address.
At the moment this metric is emitted only when connection errors happen, while a sibling metric is emitted only when the connection succeeds.
franz-go/plugin/kotel/meter.go, lines 233 to 249 in a2d69ce
Having two metrics means that each of them can stop being emitted for an undefined period of time. This is mostly not an issue for `connects.count`, which `kotel` emits most of the time for working clusters, but it leaves wide gaps for `connect_errors.count`. From the monitoring tool, it is impossible to establish during such a gap whether the metric stopped arriving because there are no errors or because the instrumentation stopped working.
A possible solution that maintains the semantics is to rely on a single metric with different attributes. For example, using `messaging.kafka.connects.count` with an `outcome` attribute that takes two values, `success` or `failure`.
Would you be open to accepting a PR that changes the monitor to rely on `messaging.kafka.connects.count` as a single metric with different attributes?
Thank you for reading this issue and for this library!
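To illustrate the gap described above, here is a toy model of what a monitoring backend might see during one evaluation window. The `series` and `window` types and their methods are hypothetical helpers written for this example; they are not OpenTelemetry or kotel types, and the model ignores any other attributes the real counters may carry.

```go
package main

import "fmt"

// series identifies one time series: a metric name plus its "outcome"
// attribute (empty when the metric carries no such attribute).
type series struct {
	name    string
	outcome string
}

// window is a toy stand-in for the data points a backend receives during
// one evaluation window.
type window map[series]int64

// add increments the counter for one series.
func (w window) add(name, outcome string) {
	w[series{name, outcome}]++
}

// reported tells whether any series of the named metric arrived.
func (w window) reported(name string) bool {
	for s := range w {
		if s.name == name {
			return true
		}
	}
	return false
}

// failures returns the outcome=failure count of the named metric,
// which is zero when that series never arrived.
func (w window) failures(name string) int64 {
	return w[series{name, "failure"}]
}

func main() {
	const connects = "messaging.kafka.connects.count"
	const connectErrs = "messaging.kafka.connect_errors.count"

	// A healthy window: three successful connects and no errors.
	split, merged := window{}, window{}
	for i := 0; i < 3; i++ {
		split.add(connects, "")         // current layout: success-only metric
		merged.add(connects, "success") // proposed layout: one metric, outcome attribute
	}

	// Current layout: the error metric is absent, which looks the same
	// as broken instrumentation.
	fmt.Println(split.reported(connectErrs)) // false

	// Proposed layout: the metric is present and its failure count reads 0.
	fmt.Println(merged.reported(connects), merged.failures(connects)) // true 0
}
```

With the merged layout, a "no data" alert can key on the metric name alone, while the error alert filters on `outcome=failure`. A window with no connect attempts at all would still report nothing in either layout.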