Describe the bug
As the number of monitored hosts grows, the permanent system load rises, which is of course somewhat normal. Lately, however, our load settled at a permanent value of around ~120 while the machine was still responsive as usual.
After a hint from your consultant towards the graphing Docker container, we first suspected our CephFS export for the perfdata files. That turned out to be wrong, but it steered our investigation in the right direction.
Suggested solution
The cause was the docker-proxy process, which appears to be a userland overlay for Docker's networking.
Turning off this process via a new file /etc/docker/daemon.json with the content
{
"userland-proxy": false
}
prevents Docker (after a restart) from using this process while networking keeps working properly. The load now stays at around ~8, which is quite fine.
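For reference, here is a minimal sketch of applying the workaround, assuming a stock systemd-managed Docker installation as shipped on Ubuntu 20.04 (the tee/systemctl/pgrep commands are illustrative, not taken from the report above):

```sh
# Disable the userland proxy; with this off, Docker publishes ports via
# iptables/hairpin NAT instead of one docker-proxy process per port mapping.
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "userland-proxy": false
}
EOF

# Restart Docker so the setting takes effect (this also restarts containers).
sudo systemctl restart docker

# Verify that no docker-proxy processes are spawned anymore.
pgrep -a docker-proxy || echo "no docker-proxy processes running"
```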
To Reproduce
Steps to reproduce the behavior:
I'm not sure how to reproduce the error, but we have a system with ~450 hosts and ~12000 services. We raised the number of gearman workers to 50 so the "Jobs waiting" get a chance to be handled - maybe this caused the problem at some point.
The openITC VM has 40 cores and 128 GB RAM, so it should be capable enough for this kind of load.
Expected behavior
"Normal" load values.
Screenshots
n/a
Versions
openITCOCKPIT Server Version: 4.2.1
Operating system: Ubuntu 20.04 LTS
Adding this option to the config file generator shouldn't be that hard. Unfortunately, it looks like it could cause issues on Linux systems running an older kernel version: moby/moby#5618 (comment)
So this definitely requires a lot of testing on all supported platforms and architectures.
I created an internal Ticket ITC-2577 for this.
At the moment the file /etc/docker/daemon.json will not be overwritten, so you are also good to go with future updates.
We raised the number of gearman workers to 50 so the "Jobs waiting" get a chance to be handled - maybe this caused the problem at some point.
For which queue? 50 workers is really a lot! This may indicate a general issue with your current setup.
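To find out, the queue lengths can be checked directly on the Gearman job server. A minimal sketch, assuming the stock gearmand admin tool and the default port 4730 (your setup may differ):

```sh
# Print one line per registered function/queue:
# function name, total jobs in queue, jobs running, available workers.
gearadmin --status --host 127.0.0.1 --port 4730
```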