Add queue percentage to libbeat metrics #39205
Pinging @elastic/elastic-agent (Team:Elastic-Agent) |
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
|
LGTM
Now that we can know the queue size in bytes, I'm just wondering if we should make it more clear that the percentage is calculated from the number of events.
I would update the integration as part of this work. If you can't for some reason, create a separate issue to do it so it isn't lost. |
The queue size limit is still configured in units of events and the metric here should match the units of the maximum size limit. |
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane) |
Huh, it's possible for
This happens occasionally, in the metrics, and it's always max_queue+1. I assume it's just a result of the queue itself not being completely synced up with the metrics reporter. That, or an off-by-one. @faec you seen this before? |
Yeah, I've seen this, it's just an artifact of the metrics receiver getting notified of acks slightly after the queue unblocks so it sees the new event in flight before decrementing the old ones. I've never seen the |
Alright, added rounding. I decided to keep the percent calculations "raw" (0.0-1.0, not 0.0-100.0), since that's what we do for the system metrics percentages, and I figure we might as well keep it consistent? |
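For anyone following the thread, here is a minimal sketch of the kind of calculation being discussed. It is not the code from this PR: the helper name `fillRatio`, the four-decimal rounding precision, and the sample numbers are all assumptions made for illustration. It keeps the value "raw" (0.0-1.0) and shows how an event count of max_queue+1 yields a value slightly above 1.0.

```go
package main

import (
	"fmt"
	"math"
)

// fillRatio is a hypothetical helper (not the PR's actual code).
// It returns the queue usage as a raw ratio (0.0-1.0 under normal
// conditions), computed from event counts because the queue size
// limit is configured in units of events.
func fillRatio(events, maxEvents int) float64 {
	if maxEvents <= 0 {
		// Guard against division by zero for an unconfigured limit.
		return 0
	}
	ratio := float64(events) / float64(maxEvents)
	// Round to four decimal places; the precision is an assumption.
	return math.Round(ratio*10000) / 10000
}

func main() {
	// Half-full queue (sample numbers, not taken from the PR).
	fmt.Println(fillRatio(1600, 3200))
	// The reporter can briefly observe max_queue+1 events in flight,
	// so the ratio can land just above 1.0.
	fmt.Println(fillRatio(3201, 3200))
}
```

Whether to clamp the result at 1.0 is a separate design choice; the sketch leaves the value unclamped so the transient overshoot stays visible.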
Proposed commit message
Part of #38708
This adds a `queue.full` metric that reports the percentage of queue usage in libbeat.

I'm not sure if this is the correct way to measure queue usage as a percentage, but this was such a small change that I figured it would be faster to just put in a PR and ask, rather than ask and wait.
Testing
You can test this by starting metricbeat with `--httpprof localhost:9898` and then checking the metrics with `curl localhost:9898/debug/vars | grep libbeat`.
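The same check can be scripted. The sketch below is only an illustration: it assumes the endpoint returns a JSON object whose top-level keys are metric names, and the helper names `fetchVars` and `libbeatKeys` and the sample payload are made up for this example rather than real Beats output.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"sort"
	"strings"
)

// fetchVars downloads the raw JSON document from the given URL,
// for example "http://localhost:9898/debug/vars".
func fetchVars(url string) ([]byte, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

// libbeatKeys returns the sorted top-level keys that contain "libbeat",
// roughly what `grep libbeat` does on the curl output.
// Assumption: metric names appear as top-level JSON keys.
func libbeatKeys(raw []byte) ([]string, error) {
	var vars map[string]json.RawMessage
	if err := json.Unmarshal(raw, &vars); err != nil {
		return nil, err
	}
	var keys []string
	for k := range vars {
		if strings.Contains(k, "libbeat") {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	// Made-up payload so the example runs without a Beat;
	// swap in fetchVars("http://localhost:9898/debug/vars") to query a live one.
	sample := []byte(`{"libbeat.example.a": 1, "system.example": 2, "libbeat.example.b": 0.5}`)
	keys, err := libbeatKeys(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(keys)
}
```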
The metric will also appear in the `last 30s` metrics.
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.