How to shorten the collecting interval (resolution)? #1483

Open
Alex-Kil opened this issue May 2, 2024 · 3 comments
Labels
kind/support Categorizes issue or PR as a support question. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@Alex-Kil

Alex-Kil commented May 2, 2024

Hi,

FAQ.md says that the minimum metric-resolution is 15s, the resolution of metrics calculated by Kubelet.
And the Metrics Server source code looks like this:

func (o Options) validate() []error {
	errors := []error{}
	// Reject resolutions shorter than 10s.
	if o.MetricResolution < 10*time.Second {
		errors = append(errors, fmt.Errorf("metric-resolution should be a time duration at least 10s, but value %v provided", o.MetricResolution))
	}
	// 90% of the resolution must not be shorter than the kubelet request timeout.
	if o.MetricResolution*9/10 < o.KubeletClient.KubeletRequestTimeout {
		errors = append(errors, fmt.Errorf("metric-resolution should be larger than kubelet-request-timeout, but metric-resolution value %v kubelet-request-timeout value %v provided", o.MetricResolution, o.KubeletClient.KubeletRequestTimeout))
	}
	return errors
}

I want to shorten the resolution interval so that I can catch the min/max CPU/memory usage per pod. Resource usage fluctuates very quickly, so a 15s resolution may miss the peaks.
Do I have to collect /metrics/resource directly from the endpoint? Or is there another solution?

Thanks,
Alex

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label May 2, 2024
@logicalhan
Contributor

/kind support
/triage accepted
/assign @dgrisonnet

@k8s-ci-robot k8s-ci-robot added kind/support Categorizes issue or PR as a support question. triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 2, 2024
@Alex-Kil
Author

Alex-Kil commented May 7, 2024

Hi Team,
Any update on this?

@dgrisonnet
Member

It is not recommended to go below 15s, as this would put too much pressure on the kubelet, whose metrics collection doesn't scale well. There is a longstanding issue about that in Kubernetes, kubernetes/kubernetes#104459, but we haven't made much progress on it and it doesn't look like there will be any in the short term.

One project that was written to work around that is https://github.com/kubernetes-sigs/usage-metrics-collector/, but it doesn't support cgroups v2 today.
