Agent and analyzer taking lots of cpu cycles and memory #2321

Open
ashishth09 opened this issue Jan 7, 2021 · 7 comments

Comments

@ashishth09

ashishth09 commented Jan 7, 2021

  insight                    netflow-collector-skydive-agent-b26tm                     100m (2%)     2 (51%)      512Mi (3%)       8Gi (61%)      27d
  insight                    netflow-collector-skydive-analyzer-7497cd8b79-5gz8v       100m (2%)     2 (51%)      512Mi (3%)       8Gi (61%)      27d
  insight                    skydive-operator-67c958f454-4rxl6                         0 (0%)        0 (0%)       0 (0%)           0 (0%)         27d

We can see from the above snapshot that the agent and the analyzer are each taking 2 cores (exceeding the limit we put on them), against requests of 100m. What is the reason behind such high CPU usage? Memory consumption is also very high: 61% of the assigned 8 GB limit. We used https://github.com/skydive-project/skydive-operator to install the operator on the IBM Cloud classic infrastructure Kubernetes service.

Kubernetes version is v1.16.14

@eranra
Contributor

eranra commented Jan 7, 2021

Hi @ashishth09, can you explain more about the environment: how many hosts and pods are there, and what workload is running in that environment?

Also:
(1) I see 27 days. Is the high CPU new (appearing after a long time), or are things OK after a reboot?
(2) 100m, to the best of my knowledge, is 0.1 of a core (10% of one core). Can you paste the command that you used to get that output?
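
For reference, Kubernetes CPU quantities use the `m` suffix for millicores, so 1000m equals one core. The following is a minimal sketch, not part of Skydive or kubectl, that converts such quantities to cores. The function name `cpu_quantity_to_cores` is hypothetical, and the sketch only handles plain numbers and the `m` suffix.

```python
from decimal import Decimal


def cpu_quantity_to_cores(quantity: str) -> Decimal:
    """Convert a Kubernetes CPU quantity such as '100m' or '2' to cores.

    Hypothetical helper for illustration; only plain numbers and the
    'm' (millicore) suffix are handled.
    """
    quantity = quantity.strip()
    if quantity.endswith("m"):
        # 'm' means millicores: 1000m == 1 core
        return Decimal(quantity[:-1]) / Decimal(1000)
    return Decimal(quantity)


# The request (100m) and limit (2) values from the snapshot above
for q in ("100m", "1000m", "2"):
    cores = cpu_quantity_to_cores(q)
    print(f"{q} = {cores} cores")
```

Under this reading, the 100m in the snapshot is a tenth of a core, while the 2 in the next column is two full cores.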

@ajaysikdar

@eranra
We are gathering the information requested above and will share it once we have it.
We used the kubectl top command.

@ajaysikdar

@eranra

Skydive was installed on 20+ clusters. The high CPU usage appeared right after installation, and rebooting did not help. We had to uninstall it completely.

@eranra
Contributor

eranra commented Jan 8, 2021

@ashishth09

  • How many nodes did you have in each cluster, how many pods on each node, and how many flows per second? I am trying to understand the amount of traffic in your systems to see whether it is connected. On typical machines, captures are not very expensive, but if you configure multiple captures (on many interfaces) and have a lot of traffic, it might cause the CPU usage to go high.

There are several alternatives: one is to reduce the number of parallel captures, another is to move to more efficient captures such as eBPF (http://skydive.network/blog/skydive-with-ebpf.html). Before going there, though, you should first check the 100m figure: is Skydive really consuming 2 CPUs, or much less than that?

@ajaysikdar

@eranra,

Do you know how we can verify whether Skydive is really consuming 2 CPUs, or much less than that?

@eranra
Contributor

eranra commented Jan 11, 2021

@ashishth09 the easiest way, if you can connect to the k8s node (the VM), is to execute something like `ps aux --sort=-pcpu | head -n 10`. This will show the top 10 CPU-consuming processes on the node. Make sure you connect to the correct node, the one running the Skydive analyzer and agents. Another option is to increase the limit in the Deployment and DaemonSet and see if Skydive works better. For example, change it to 1000m (== 1 virtual CPU), wait for the new pods to be deployed, and check that the limit really changed.
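
If it is easier to capture the `ps aux` output once and inspect it elsewhere, the same ranking can be done offline. The following is a rough sketch, assuming the usual `ps aux` column header (`USER PID %CPU ... COMMAND`). The function name `top_cpu_processes` and the sample rows are hypothetical and are not real Skydive output.

```python
def top_cpu_processes(ps_aux_output: str, n: int = 10):
    """Return the n rows with the highest %CPU as (cpu, pid, command) tuples."""
    lines = ps_aux_output.strip().splitlines()
    header = lines[0].split()
    cpu_idx = header.index("%CPU")
    pid_idx = header.index("PID")
    cmd_idx = header.index("COMMAND")

    parsed = []
    for row in lines[1:]:
        # COMMAND is the last column and may contain spaces, so cap the splits
        fields = row.split(None, len(header) - 1)
        if len(fields) < len(header):
            continue  # skip malformed or truncated rows
        parsed.append((float(fields[cpu_idx]), fields[pid_idx], fields[cmd_idx]))

    # Highest CPU first, mirroring --sort=-pcpu
    parsed.sort(key=lambda item: item[0], reverse=True)
    return parsed[:n]


# Made-up sample in `ps aux` layout, for illustration only
SAMPLE = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 101 3.5 0.4 1000 500 ? Ssl Jan07 1:00 /usr/bin/example-a --flag x
root 202 97.0 12.1 2000 900 ? Sl Jan07 9:00 /usr/bin/example-b
root 303 0.1 0.0 100 50 ? S Jan07 0:00 sleep 10
"""

for cpu, pid, command in top_cpu_processes(SAMPLE, n=2):
    print(f"{cpu:6.1f}%  pid={pid}  {command}")
```

Note that `ps` reports %CPU per process relative to a single core, so a value near 200 would correspond to roughly 2 cores.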

@eranra
Contributor

eranra commented Feb 1, 2021

@ashishth09 is this still relevant? If not, please close.
