docker stats API response not matching with actual usage in container #8640

Open
3 tasks
srik65 opened this issue Jul 2, 2020 · 7 comments
@srik65

srik65 commented Jul 2, 2020

Summary

I'm trying to calculate CPU usage from the docker stats response JSON. We observed that "system_cpu_usage" in "cpu_stats" and in "precpu_stats" is always the same in every response from the VCH, so the CPU calculation yields invalid CPU usage figures.

vmstat/top shows 100%, while the stats API response works out to 0.36%; Admiral showed 0.36% utilization at the same time. We would like to know whether this is a known issue or whether a misconfiguration on the VCH is causing it. Could you give us some pointers for debugging this further?

FYI, we used the CPU calculation from here:
https://github.com/docker/docker/blob/eb131c5383db8cac633919f82abad86c99bffbe5/cli/command/container/stats_helpers.go#L175
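
For readers who do not want to follow the link, here is a minimal sketch of the delta-based calculation as I understand the Docker CLI to perform it on Linux. The `cpuSample` struct, the `cpuPercent` name, and the sample values are assumptions made for illustration; only the arithmetic is meant to mirror the linked helper. It also shows why identical "system_cpu_usage" values in "cpu_stats" and "precpu_stats" cannot produce a meaningful result with this approach.

```go
package main

import "fmt"

// cpuSample holds the fields of one docker stats sample that the calculation
// needs. The struct and its field names are hypothetical; they stand in for
// cpu_usage.total_usage, system_cpu_usage, and the length of
// cpu_usage.percpu_usage in the stats JSON.
type cpuSample struct {
	TotalUsage  uint64 // container CPU time
	SystemUsage uint64 // host CPU time
	NumCPUs     int
}

// cpuPercent computes (container delta / system delta) * CPUs * 100 and
// returns 0 when either delta is not positive.
func cpuPercent(prev, cur cpuSample) float64 {
	cpuDelta := float64(cur.TotalUsage) - float64(prev.TotalUsage)
	systemDelta := float64(cur.SystemUsage) - float64(prev.SystemUsage)
	if systemDelta > 0.0 && cpuDelta > 0.0 {
		return (cpuDelta / systemDelta) * float64(cur.NumCPUs) * 100.0
	}
	return 0.0
}

func main() {
	// Made-up values, for illustration only.
	prev := cpuSample{TotalUsage: 1000, SystemUsage: 10000, NumCPUs: 2}
	cur := cpuSample{TotalUsage: 1500, SystemUsage: 12000, NumCPUs: 2}
	fmt.Printf("normal sample: %.2f%%\n", cpuPercent(prev, cur))

	// Same system_cpu_usage in both samples: the system delta is zero,
	// so the guard above returns 0 regardless of container activity.
	stuck := cpuSample{TotalUsage: 1500, SystemUsage: 10000, NumCPUs: 2}
	fmt.Printf("stuck sample:  %.2f%%\n", cpuPercent(prev, stuck))
}
```

With this formula, a stats source that does not advance the system counter between samples needs its own percentage calculation, which is presumably why the VCH values cannot be interpreted the same way as those from a regular Docker host.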

Environment information

VMware Product: VMware vCenter Server
VMware OS: linux-x64
VMware OS version: 6.7.0
Server: 1.5.4

See also

Troubleshooting attempted

  • Searched GitHub for existing issues. (Mention any similar issues under "See also", above.)
  • Searched the documentation for relevant troubleshooting guidance.
  • Searched for a relevant VMware KB article.
@renmaosheng
Contributor

Hi Wending,
Could you please take a look? Thanks.

@wjun
Contributor

wjun commented Aug 14, 2020

VIC reads VM samples from the vCenter performance APIs:

func (vmc *VMCollector) sample(op trace.Operation, mos []types.ManagedObjectReference) {

and converts them to container stats:

currentMetric *performance.VMMetrics

This design is correct, since each container is a VM. @ading007, please check whether you can reproduce this locally. If so, check

func (cs *ContainerStats) ToContainerStats(current *performance.VMMetrics) (*types.StatsJSON, error) {

to see whether any formula used there is incorrect.

@srik65
Author

srik65 commented Sep 1, 2020

To add to the issue summary: we were able to see accurate metrics for the containers in the LPAR2RRD dashboard.

@aviratna

aviratna commented Sep 3, 2020

@wjun

Can you please let us know which API returns the above stats for a container? We tried using the docker stats API to get container CPU and memory metrics, but the stats don't match the actual container usage.

@cmrajiv

@arslanabbasi

@aviratna As containers are VMs in VIC, the behavior for CPU percentages is different. Can you please post the results you are getting from docker stats alongside the results you expect? An example would help, and elaborating on the use case will give the team more context. This will help bridge the gap in understanding of the responses and decide whether they need to be changed in the future.

@ading007

ading007 commented Sep 10, 2020

This is by design: each VCH is a virtual Docker host, and each container VM's CPU usage is reported as a percentage of the total CPU MHz allocated to the corresponding VCH resource pool. The formula for the CPU percentage displayed by docker stats is: float64(container vm cpu usage)/float64(VCH CPU limit) * (container vm vcpu number).
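
To make the formula concrete, here is a minimal sketch that applies it literally. The function name, the parameter names, the MHz units, and the final multiplication by 100 are all assumptions made for illustration; this is not the VIC source code.

```go
package main

import "fmt"

// vicCPUPercent applies the quoted formula as written:
// usage / VCH CPU limit * vCPU count.
// Assumption: usage and limit are both in MHz, and the result is scaled
// by 100 here to express it as a percentage.
func vicCPUPercent(usageMHz, vchLimitMHz uint64, vcpus int) float64 {
	if vchLimitMHz == 0 {
		return 0.0 // no limit known; avoid dividing by zero
	}
	return float64(usageMHz) / float64(vchLimitMHz) * float64(vcpus) * 100.0
}

func main() {
	// Made-up values: a 2-vCPU container VM using 2500 MHz inside a VCH
	// resource pool limited to 10000 MHz.
	fmt.Printf("%.2f%%\n", vicCPUPercent(2500, 10000, 2))
}
```

The point of the sketch is that the denominator is the VCH resource pool limit, so the same container VM reports a lower percentage as the pool limit grows.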

How to view VCH CPU limit?

image

@aviratna

@ading007 @arslanabbasi @wjun: Thanks for the response.

I think the above formula, which measures CPU% against the total CPU resources of the boundary, is more applicable to a nested container scenario where containers run inside a VM. In VIC, every container is a VM in the backend, so CPU% should be the actual CPU utilization of that container/VM.

If we apply the traditional Docker formula to VIC, the container CPU utilization value will not be correct.
E.g., if container CPU utilization reaches 100%, docker stats will not show 100%, because the total is the VCH resource pool limit.
Business rules written to capture these stats and send an alert or decide to scale the container will fail.
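
A small worked example of the concern, with made-up numbers. Everything here is hypothetical: the function names, the MHz figures, the 80% alert threshold, and the idea of measuring against the VM's own capacity are assumptions used only to illustrate the gap, not VIC or Docker code.

```go
package main

import "fmt"

// percentOfPool measures usage against the VCH resource pool limit,
// scaled by the vCPU count (the behavior described in this thread).
func percentOfPool(usageMHz, poolLimitMHz float64, vcpus int) float64 {
	return usageMHz / poolLimitMHz * float64(vcpus) * 100.0
}

// percentOfVM measures usage against the container VM's own capacity
// (one reading of "actual utilization of that container/VM").
func percentOfVM(usageMHz, vmCapacityMHz float64) float64 {
	return usageMHz / vmCapacityMHz * 100.0
}

func main() {
	const alertThreshold = 80.0 // hypothetical scaling rule

	// A 1-vCPU container VM fully using its 2500 MHz, inside a pool
	// limited to 20000 MHz.
	pool := percentOfPool(2500, 20000, 1)
	own := percentOfVM(2500, 2500)

	fmt.Printf("against pool limit: %.1f%% (alert: %v)\n", pool, pool >= alertThreshold)
	fmt.Printf("against VM capacity: %.1f%% (alert: %v)\n", own, own >= alertThreshold)
}
```

Under these assumed numbers the container VM is saturated, yet a rule that watches the pool-relative value never crosses the threshold.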

Based on the code snippet from @wjun, it looks like the VCH calls the vCenter API to get the actual stats for the VM; however, when we call the stats API, it doesn't give the actual utilization of that container (VM).
