docker-proxy generates extremely high system load #1186

Closed
exa-mk opened this issue Jul 5, 2021 · 2 comments
Labels: Backlog (Added to internal Jira backlog), enhancement

exa-mk commented Jul 5, 2021

Describe the bug
With a growing monitoring host, a steadily rising system load is to be expected. Lately, however, our load climbed to a permanent value of around 120, even though the machine was still normally responsive.
Following a hint from your consultant pointing at the graphing Docker container, we suspected our CephFS export for the perfdata files. That turned out not to be the cause, but it steered our investigation in the right direction.

Suggested solution
The cause was the docker-proxy process, which appears to be Docker's userland proxy for container networking.
Disabling this process via a new file /etc/docker/daemon.json with the content

{
    "userland-proxy": false
}

prevents Docker (after a restart) from using this process, while networking still works properly. The load now stays at around 8, which is fine.
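If a daemon.json already exists on the host, the setting can be merged into it instead of replacing the whole file. Below is a minimal Python sketch of that idea. The function name `disable_userland_proxy` is hypothetical (it is not part of Docker or openITCOCKPIT), and the sketch assumes the file, if present, contains a single JSON object.

```python
import json
from pathlib import Path


def disable_userland_proxy(config_path):
    """Set "userland-proxy": false in a Docker daemon.json, keeping other keys.

    Hypothetical helper for illustration only.
    """
    path = Path(config_path)
    settings = {}
    if path.exists():
        text = path.read_text(encoding="utf-8")
        # An empty file is treated like a missing one.
        if text.strip():
            settings = json.loads(text)
    # Only this key is changed; every other setting is preserved.
    settings["userland-proxy"] = False
    path.write_text(json.dumps(settings, indent=4) + "\n", encoding="utf-8")
    return settings
```

Calling `disable_userland_proxy("/etc/docker/daemon.json")` as root and then restarting Docker (on a systemd host such as Ubuntu 20.04, typically `systemctl restart docker`) should have the same effect as creating the file by hand.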

To Reproduce
Steps to reproduce the behavior:
I'm not sure how to reproduce the error, but we have a system with ~450 hosts and ~12000 services. We raised the number of gearman workers to 50 so the "Jobs waiting" get a chance to be handled - maybe this caused the problem at some point.
The openITC VM has 40 cores and 128 GB of RAM, so it should be capable of handling this kind of load.

Expected behavior
"Normal" load values.

Screenshots
n/a

Versions

  • openITCOCKPIT Server Version: 4.2.1
  • Operating system: Ubuntu 20.04 LTS

nook24 added the Backlog and enhancement labels Jul 6, 2021

nook24 (Member) commented Jul 9, 2021

Adding this option to the config file generator shouldn't be that hard. Unfortunately, it looks like it could cause issues on Linux systems running an older kernel version:
moby/moby#5618 (comment)

So this definitely requires a lot of testing on all supported platforms and architectures.
I created an internal ticket, ITC-2577, for this.

At the moment, the file /etc/docker/daemon.json will not be overwritten, so you are good to go with future updates as well.

> We raised the number of gearman workers to 50 so the "Jobs waiting" get a chance to be handled - maybe this caused the problem at some point.

For which queue? 50 workers is really a lot! This may indicate a general issue with your current setup.

nook24 added a commit that referenced this issue Nov 3, 2021
ITC-2577 #1186 Disable docker userland proxy by default

nook24 (Member) commented Nov 3, 2021

With openITCOCKPIT 4.3.1, the Docker userland proxy will be disabled by default :)

nook24 closed this as completed Nov 3, 2021