OOM error when lots of information is pulled by context aware policies #716
Comments
Kubewarden 1.13.0-RC1 is out, which ships with this bug fix. We have reduced the memory spike that happens when resources are fetched from the cluster for the first time. Also, memory usage at rest has been drastically reduced. The key point, however, is that there's no silver bullet. When operating inside of a big cluster with lots of resources (like the numbers inside of the issue's description), there's not much we can do to reduce the initial spike. When deploying a Policy Server inside of such a cluster, the administrator must take this spike into account when calculating the resource limits (just the memory) of the Pods. According to our load tests, after the initial spike the memory usage goes down and remains stable. This will prevent the
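For administrators sizing the Pods as described above, the memory limit has to accommodate the initial fetch spike, not just the steady state. A minimal sketch of the relevant container settings (the numbers are placeholders for illustration, not a recommendation):

```yaml
# Illustrative only: actual values depend on cluster size and on how many
# resources the context-aware policies pull.
resources:
  requests:
    memory: "256Mi"   # steady-state usage after the spike settles
  limits:
    memory: "1Gi"     # headroom for the initial fetch spike
```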
@fabriziosestito can you share some numbers over there? This is going to be useful when doing the blog post.
Benchmark data: 10000 RoleBindings, k6 load testing (
kube-rs memory burst reduction, watching 10000 RoleBindings:
Also see kube-rs/kube#1494 (comment)
The merged PR into kube should help reduce the spike by a good factor (thank you!), as should Kubernetes InitialStreamingLists once it stabilises (probably a few years before you can use that in public distribution stuff, though). In the meantime, here's a drive-by comment: you might have another optimisation path available to you now, depending on how you structure things. If you use
Closing as fixed; we've seen positive results from the tests done by us and by some users with 1.13.0-RC1.
Is there an existing issue for this?
Current Behavior
Currently we create a kube-rs `reflector` object for each type of context-aware resource requested by a policy. This reflector keeps a copy of the Kubernetes results in memory and keeps it in sync with Kubernetes' internal data. The more resources are pulled, the more memory is consumed by the Policy Server process. We had reports of the Policy Server being killed by the kernel OOM killer because it was consuming too much memory.
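The per-resource-type caching described above is why memory grows with the number of watched objects. A simplified, dependency-free sketch of what a reflector does (the type and method names here are illustrative, not the actual kube-rs API):

```rust
use std::collections::HashMap;

// Illustrative stand-in for a Kubernetes watch event; in kube-rs the
// reflector consumes a typed watch stream instead.
enum WatchEvent {
    Applied(String, String), // (object name, serialized object)
    Deleted(String),
}

// A reflector keeps an in-memory copy of cluster resources and updates it
// from watch events. The store grows with every object being watched,
// which is where the Policy Server's memory usage comes from.
struct Reflector {
    store: HashMap<String, String>,
}

impl Reflector {
    fn new() -> Self {
        Reflector { store: HashMap::new() }
    }

    fn apply(&mut self, ev: WatchEvent) {
        match ev {
            WatchEvent::Applied(name, obj) => {
                self.store.insert(name, obj);
            }
            WatchEvent::Deleted(name) => {
                self.store.remove(&name);
            }
        }
    }
}

fn main() {
    let mut r = Reflector::new();
    r.apply(WatchEvent::Applied("rb-1".to_string(), "rolebinding".to_string()));
    r.apply(WatchEvent::Applied("rb-2".to_string(), "rolebinding".to_string()));
    r.apply(WatchEvent::Deleted("rb-1".to_string()));
    // One object remains cached after the delete event.
    println!("{}", r.store.len()); // prints 1
}
```

With 10000 RoleBindings cached this way, the resident size scales with both the object count and the size of each serialized object, which matches the behaviour reported in this issue.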
That happened with the following information being pulled from Kubernetes:
Expected Behavior
The Policy Server should not be killed by the OOM killer. There should be no need to tune the memory limits of the Pod.
Steps To Reproduce
No response
Environment
* Kubewarden 1.11
Anything else?
No response