Kubernetes cache optimization #2736
base: develop
Conversation
Signed-off-by: r2k1 <yokree@gmail.com>
@opencost/opencost-maintainers any thoughts on it? Especially @mbolt35 and @AjayTripathy
Thanks @r2k1 -- at a glance, this looks great. Totally supportive. If Bolt and Ajay aren't able to review this week, I'll loop back and give an official review ASAP.
pkg/clustercache/clustercache.go
Outdated
pvWatch:           NewCachingWatcher(coreRestClient, "persistentvolumes", &v1.PersistentVolume{}, "", fields.Everything()),
pvcWatch:          NewCachingWatcher(coreRestClient, "persistentvolumeclaims", &v1.PersistentVolumeClaim{}, "", fields.Everything()),
storageClassWatch: NewCachingWatcher(storageRestClient, "storageclasses", &stv1.StorageClass{}, "", fields.Everything()),
jobsWatch:         NewCachingWatcher(batchClient, "jobs", &batchv1.Job{}, "", fields.Everything()),
I'm pretty sure that everything being monitored here is monitored for one reason or another, not limited to just the opencost scope. If we're going to remove these, we would really want to make sure that everything downstream is updated.
Either way, I'm not able to prepare a reduced structure if I don't know which fields are used.
I checked the usage and fixed the compilation errors; in cases like this I would just create empty structs.
It's unfortunate that kubecost requirements add extra footprint for opencost users.
I guess it will require someone from kubecost to mark all the used fields for all k8s objects.
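For illustration, here is a minimal sketch of what such a reduced model and its transform might look like. The type, its fields, and the transformPod helper are hypothetical stand-ins, not the PR's actual code:

package clustercache

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// Pod is a hypothetical slim replacement for v1.Pod, keeping only the
// fields the cost model actually reads. Types whose downstream usage is
// unknown could start as empty structs and grow as usage is confirmed.
type Pod struct {
	Name      string
	Namespace string
	Labels    map[string]string
	NodeName  string
	// Requests aggregates container resource requests by resource name.
	Requests map[v1.ResourceName]resource.Quantity
}

// transformPod copies only the needed fields out of the full API object,
// so the original *v1.Pod can be garbage-collected instead of cached.
func transformPod(src *v1.Pod) *Pod {
	out := &Pod{
		Name:      src.Name,
		Namespace: src.Namespace,
		Labels:    src.Labels,
		NodeName:  src.Spec.NodeName,
		Requests:  map[v1.ResourceName]resource.Quantity{},
	}
	for _, c := range src.Spec.Containers {
		for name, q := range c.Resources.Requests {
			total := out.Requests[name]
			total.Add(q)
			out.Requests[name] = total
		}
	}
	return out
}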
pkg/cmd/agent/agent.go
Outdated
// Initialize cluster exporting if it's enabled
if env.IsExportClusterCacheEnabled() {
	cacheLocation := confManager.ConfigFileAt(path.Join(configPrefix, "cluster-cache.json"))
	clusterExporter = clustercache.NewClusterExporter(clusterCache, cacheLocation, ClusterExportInterval)
	clusterExporter.Run()
}
Why remove this?
I'm not sure what to do with this part yet. For now, I've just turned it off. This way, I can try other things without having to change too much at the start.
This change will affect the format in which data is saved and loaded, which might cause problems. I need to understand better how it's used before I can decide what to do next.
Could this kind of feature removal be behind a flag at all?
I'll put it back in the final version.
But it will likely fail to load any existing files.
Can you provide any extra information on how it's used?
I think it's used by a few upstream dependencies that have a multicluster implementation. We can check if it's necessary, but we should probably issue a deprecation notice even if we can't find anything that uses it.
I put it back.
pkg/env/kubemetricsenv.go
Outdated
func IsExportClusterCacheEnabled() bool {
	return env.GetBool(ExportClusterCacheEnabledEnvVar, false)
}
Why are we removing this option?
Restored.
I think there was a watcher package:
import (
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
	"k8s.io/klog"
)

// ResourceHandler is a function alias for a receiver of an update or remove.
type ResourceHandler[T any] func(*T)

// ResourceWatcher is a controller which constructs a lightweight store implementation for add/update/remove
// passthroughs via a kubernetes Reflector.
type ResourceWatcher[T any] struct {
	listWatch *cache.ListWatch
	reflector *cache.Reflector
}

// KubernetesResource is a generic constraint that ensures the T passed into NewResourceWatcher implements
// kubernetes runtime.Object via a pointer receiver on the T instance.
type KubernetesResource[T any] interface {
	runtime.Object
	*T
}

// NewResourceWatcher creates a new ResourceWatcher[T] instance which calls the provided handlers on add,
// update, and removal.
func NewResourceWatcher[T any, U KubernetesResource[T]](
	restClient rest.Interface,
	resource string,
	namespace string,
	fieldSelector fields.Selector,
	updateHandler ResourceHandler[T],
	removeHandler ResourceHandler[T],
) *ResourceWatcher[T] {
	// we construct a lightweight store proxy which doesn't actually store
	// resources. instead, it passes lists, adds, updates, and removals
	// through to the handlers
	proxy := &storeProxy[T]{
		onUpdate:  updateHandler,
		onRemoved: removeHandler,
	}
	resourceType := new(T)
	listWatch := cache.NewListWatchFromClient(restClient, resource, namespace, fieldSelector)
	reflector := cache.NewReflector(listWatch, resourceType, proxy, 0)

	return &ResourceWatcher[T]{
		listWatch: listWatch,
		reflector: reflector,
	}
}

// Run starts the underlying kubernetes reflector and begins receiving watch updates
func (w *ResourceWatcher[T]) Run(stopCh <-chan struct{}) {
	w.reflector.Run(stopCh)
}

//--------------------------------------------------------------------------
// store proxy
//--------------------------------------------------------------------------
// The kubernetes Reflector implementation only leverages 4 methods on the
// Store contract: Add, Update, Delete, and Replace. Since the point of this
// optimization is to completely avoid caching the full resource objects, we
// provide no-ops for the remaining methods: List, ListKeys, Get, GetByKey,
// and Resync.
//
// Unfortunately, without recreating the Reflector using a lighter interface,
// there's not another way to avoid this. The side-effect is that at any time
// the Reflector implementation could change and begin using the no-op methods.
// It would be unlikely, but possible and worth noting here.
//--------------------------------------------------------------------------

// storeProxy is a write-only storage mechanism used to inject passthrough behavior
// into the kubernetes Reflector utility for resource watching.
type storeProxy[T any] struct {
	onUpdate  ResourceHandler[T]
	onRemoved ResourceHandler[T]
}

// Add checks that the object is the correct type and passes it to the update handler.
func (ps *storeProxy[T]) Add(obj interface{}) error {
	if resource, ok := obj.(*T); ok {
		ps.onUpdate(resource)
	}
	return nil
}

// Update checks that the object is the correct type and passes it to the update handler.
func (ps *storeProxy[T]) Update(obj interface{}) error {
	if resource, ok := obj.(*T); ok {
		ps.onUpdate(resource)
	}
	return nil
}

// Delete checks that the object is the correct type and passes it to the remove handler.
func (ps *storeProxy[T]) Delete(obj interface{}) error {
	if resource, ok := obj.(*T); ok {
		ps.onRemoved(resource)
	}
	return nil
}

// Replace occurs on the initial listing of the resource, so we simply iterate the
// replacement items and call Add.
func (ps *storeProxy[T]) Replace(items []interface{}, resourceVersion string) error {
	for _, item := range items {
		ps.Add(item)
	}
	return nil
}

// No-Op
func (ps *storeProxy[T]) List() []interface{} {
	klog.Warning("Unexpected call: List() on resource store proxy")
	return nil
}

// No-Op
func (ps *storeProxy[T]) ListKeys() []string {
	klog.Warning("Unexpected call: ListKeys() on resource store proxy")
	return nil
}

// No-Op
func (ps *storeProxy[T]) Get(obj interface{}) (item interface{}, exists bool, err error) {
	klog.Warning("Unexpected call: Get() on resource store proxy")
	return nil, false, nil
}

// No-Op
func (ps *storeProxy[T]) GetByKey(key string) (item interface{}, exists bool, err error) {
	klog.Warning("Unexpected call: GetByKey() on resource store proxy")
	return nil, false, nil
}

// No-Op
func (ps *storeProxy[T]) Resync() error {
	klog.Warning("Unexpected call: Resync() on resource store proxy")
	return nil
}
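For context, a rough sketch of how the watcher above could be wired up for a single resource type. The watchPods function, the store map, and the transformPod helper (sketched earlier in this thread) are illustrative placeholders, not code from this PR:

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/rest"
)

// watchPods wires the ResourceWatcher up for pods: the handlers receive the
// full API object but retain only a slim, transformed copy. The reflector
// invokes the handlers from a single goroutine; a real store would need
// locking if it is read concurrently (e.g. by /metrics handlers).
func watchPods(client rest.Interface, store map[string]*Pod, stopCh <-chan struct{}) {
	watcher := NewResourceWatcher[v1.Pod, *v1.Pod](
		client,
		"pods",
		"", // all namespaces
		fields.Everything(),
		func(p *v1.Pod) { store[p.Namespace+"/"+p.Name] = transformPod(p) }, // add/update
		func(p *v1.Pod) { delete(store, p.Namespace+"/"+p.Name) },           // remove
	)
	go watcher.Run(stopCh) // Run blocks until stopCh is closed
}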
Overall, removing so much functionality concerns me, as opencost is a public package that has dependents. Since we're now also providing our own flyweight model for the resources, any dependents are limited to opencost's requirements, not Kubernetes', and we're not providing any way to publicly extend the watcher list or supply custom model mapping. That said, I totally support the notion of reducing the footprint here, as the watchers and the resource objects themselves can be quite taxing on CPU and memory. Thanks for drafting this up -- would love to work with you to figure out how we can make this work. Give me some time to survey some of our dependent code's use-cases and see if I can't propose something that would push this through without being as severe a breaking change. Thanks again :)
Is there any way to switch between the two implementations, possibly with an interface? That way we wouldn't have to worry about deprecating potentially used fields. Won't block this PR on that though, given we can also review dependencies on our side.
I don't see a reasonable implementation hidden behind an interface. However, depending on your use of interface{}, I would expect most potential issues to be caught at compile time (excluding serialization/deserialization).
Signed-off-by: r2k1 <yokree@gmail.com>
I ran some tests on a synthetic workload of 10 nodes × 250 pods each, comparing memory consumption before and after the change, and again after subtracting the 40MiB of static load (the part that doesn't scale with cluster size). This is a synthetic workload with bare-minimum object fields populated. Memory consumption is reduced by half.
I'll need more help to find out whether any additional fields are being used outside of opencost.
Summary
Currently, most Kubernetes objects are stored in memory with all their fields. Copies are also made on some requests, which creates multiple in-memory duplicates of etcd storage. This is particularly troublesome for large clusters, which require a large amount of memory.
This update aims to optimize the memory footprint by retaining only the fields actively used in the codebase. This is achieved by replacing the Informer with the Reflector and caching only the transformed data instead of the entire original object.
The core changes are located in pkg/clustercache.
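As a sketch of the transform-and-cache pattern this describes (illustrative names, assuming a slim Pod model and transformPod helper like the ones sketched earlier in this thread, not the PR's exact types):

import (
	"sync"

	v1 "k8s.io/api/core/v1"
)

// podCache stores only the slim models produced by the transform; the full
// v1.Pod objects delivered by the watcher are never retained.
type podCache struct {
	mu   sync.RWMutex
	pods map[string]*Pod // keyed by namespace/name
}

// Update is called from the watcher's add/update handler.
func (c *podCache) Update(src *v1.Pod) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.pods[src.Namespace+"/"+src.Name] = transformPod(src)
}

// Remove is called from the watcher's remove handler.
func (c *podCache) Remove(src *v1.Pod) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.pods, src.Namespace+"/"+src.Name)
}

// List returns a snapshot for consumers such as the /metrics handlers.
func (c *podCache) List() []*Pod {
	c.mu.RLock()
	defer c.mu.RUnlock()
	out := make([]*Pod, 0, len(c.pods))
	for _, p := range c.pods {
		out = append(out, p)
	}
	return out
}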
Testing and Results
Estimating memory usage can be challenging. To measure the impact of this change, I ran opencost locally while connected to a remote Kubernetes cluster, triggered requests to the /metrics endpoint, and recorded memory consumption. (Temporary changes in cmd/costmodel/main.go were used for testing purposes.)
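The PR doesn't show the temporary measurement code, so the following is only an assumption about how figures like lastHeapAlloc/maxHeapAlloc could be sampled, not the actual change to cmd/costmodel/main.go:

import (
	"log"
	"runtime"
	"time"
)

// trackHeap periodically samples the Go heap and logs the last and maximum
// observed HeapAlloc, mirroring the lastHeapAlloc/maxHeapAlloc figures below.
func trackHeap(interval time.Duration, stop <-chan struct{}) {
	var maxHeap uint64
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			var m runtime.MemStats
			runtime.ReadMemStats(&m)
			if m.HeapAlloc > maxHeap {
				maxHeap = m.HeapAlloc
			}
			log.Printf("lastHeapAlloc=%.1fMiB maxHeapAlloc=%.1fMiB",
				float64(m.HeapAlloc)/(1<<20), float64(maxHeap)/(1<<20))
		}
	}
}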
Test results from a simple cluster with a basic synthetic workload:
Baseline (develop branch): lastHeapAlloc=87.7MiB, maxHeapAlloc=122.1MiB
With optimization: lastHeapAlloc=56.0MiB, maxHeapAlloc=82.7MiB
Analysis, after subtracting the ~43MiB of static overhead that doesn't scale with cluster size:
maxHeapAlloc: 83-43=40MiB vs. 122-43=79MiB
lastHeapAlloc: 56-43=13MiB vs. 88-43=45MiB
Implications: the cluster-size-dependent portion of memory is roughly halved at peak and reduced to about a third at steady state.
Additional Testing
A diff comparison between the original and optimized /metrics responses did not reveal any significant discrepancies, but further testing is required.
Breaking changes
Here are the changes that might break compatibility: