
High memory usage. #2637

Open
r2k1 opened this issue Mar 14, 2024 · 10 comments
Labels
E3 (Estimated level of Effort, 1 is easiest, 4 is hardest), kubecost (Relevant to Kubecost's downstream project), needs-follow-up, needs-triage, opencost (OpenCost issues vs. external/downstream), P2 (Estimated Priority, P0 is highest, P4 is lowest)

Comments

@r2k1 (Contributor) commented Mar 14, 2024

I'm focused on reducing the memory use of OpenCost+Prometheus.

My setup for OpenCost and Prometheus is fine-tuned: it keeps data for a short time, scrapes only the essential metrics and labels needed by OpenCost, and runs on many different clusters. I only query the past 24 hours of data and don't use caching.

I found that this optimized setup of OpenCost+Prometheus uses about 200 MB of memory, plus roughly an extra 0.5 MB for each container.

I did a test on a cluster with 100 nodes and 25,000 pods, with some pods being replaced. Here are my results, along with some links to the pprof data:

[four profiling screenshots attached: img, img_1, img_2, img_3]

pprof links:

I found the OpenCost heap profile showing "Memory In-Use Bytes" to be the most insightful. It's tricky to catch the heap at peak consumption, and "Allocated Bytes Total" can be noisy and misleading, though I may also be reading it incorrectly.
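
To make short-lived peaks easier to catch between snapshots, one option (an illustrative sketch only, not something OpenCost ships; shown as a standalone program, but in practice it would be a goroutine inside the profiled process) is to log the Go runtime's heap statistics on a timer:

package main

import (
    "log"
    "runtime"
    "time"
)

func main() {
    var m runtime.MemStats
    for {
        // HeapInuse roughly matches the in-use numbers pprof reports;
        // Sys is the total memory the runtime has obtained from the OS.
        runtime.ReadMemStats(&m)
        log.Printf("heap in-use: %d MiB, sys: %d MiB", m.HeapInuse>>20, m.Sys>>20)
        time.Sleep(5 * time.Second)
    }
}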

In this test, memory use was about 7 GB, with peaks up to 11 GB, mostly when fetching data from OpenCost. It was split roughly half-and-half between OpenCost and Prometheus.

My observations:

  • It's hard to tell from the Prometheus TSDB report, but I think most of the memory usage comes from storing every pod label. Different label values are aggregated into different buckets, so they're usually invisible in the top 10. This also influences OpenCost's memory usage when querying.
  • OpenCost seems to keep almost every Kubernetes object fully in memory, even though only a few fields are used. The watcher and cache took up 56% of the memory, while the metrics emitter used only 4% (see the sketch below).
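
One direction that could follow from this (a hypothetical sketch, not OpenCost's current code, and assuming the watchers are built on client-go shared informers; the package and function names are illustrative): register a transform on the informer so that only the fields the cost model reads are kept before each object enters the cache.

package clustercache

import (
    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/informers"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/cache"
)

// newSlimPodInformer builds a pod informer whose cache stores trimmed-down
// copies of each Pod instead of the full objects.
func newSlimPodInformer(client kubernetes.Interface) (cache.SharedIndexInformer, error) {
    factory := informers.NewSharedInformerFactory(client, 0)
    informer := factory.Core().V1().Pods().Informer()

    // SetTransform must be called before the informer starts; it rewrites every
    // object on its way into the store, dropping managed fields, env vars,
    // volumes, and everything else the cost model never reads.
    err := informer.SetTransform(func(obj interface{}) (interface{}, error) {
        pod, ok := obj.(*corev1.Pod)
        if !ok {
            return obj, nil // e.g. DeletedFinalStateUnknown tombstones pass through
        }
        slim := &corev1.Pod{
            ObjectMeta: metav1.ObjectMeta{
                Name:      pod.Name,
                Namespace: pod.Namespace,
                UID:       pod.UID,
                Labels:    pod.Labels,
            },
            Spec:   corev1.PodSpec{NodeName: pod.Spec.NodeName},
            Status: corev1.PodStatus{Phase: pod.Status.Phase},
        }
        for _, c := range pod.Spec.Containers {
            slim.Spec.Containers = append(slim.Spec.Containers, corev1.Container{
                Name:      c.Name,
                Resources: c.Resources,
            })
        }
        return slim, nil
    })
    return informer, err
}

The same idea would apply to the other watched resource types (nodes, deployments, and so on).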

I'd appreciate any ideas to lower the memory usage.

@mattray (Collaborator) commented Mar 14, 2024

Great data! Tagging the @opencost/opencost-helm-chart-maintainers to see if anyone wants to add in anything.

@mattray added the opencost, P2, kubecost, and E3 labels on Mar 14, 2024
@AjayTripathy (Contributor)

Artur, thank you so much for putting this together.

> OpenCost seems to keep almost every Kubernetes object fully in memory, even though only a few fields are used. The watcher and cache took up 56% of the memory, while the metrics emitter used only 4%.

This seems like the simplest thing to carry forward. These paths should be well tested, so if we can find a way to avoid storing all of this data in the watcher, that would be ideal.

@AjayTripathy (Contributor)

#2641
#2642

These track the two major insights and the work we'd consider doing to reduce the memory profile based on these findings.

@AjayTripathy (Contributor)

Also, @r2k1, is there any way we could open-source how to spin up your memory testing framework, so we can test any PRs for #2641 and #2642 against a consistent benchmark?

@r2k1 (Contributor, Author) commented Mar 15, 2024

It would be nice to have a more automated way, but here is what I've done.

From what I've seen, the usage is more or less linear with cluster size: a 100-node cluster full of pods consumed roughly 10 times more than a 10-node cluster.

az aks create --name artur-perf-test --resource-group artur --node-vm-size Standard_E2ps_v5 --node-count 10 --enable-managed-identity --tier standard --enable-cost-analysis  --max-pods 250 
az aks get-credentials --resource-group artur --name artur-perf-test --overwrite-existing

I think this cluster costs roughly $1/hour.

I used kube-burner to generate some load.
There is another tool to generate load, clusterloader2, but I found it more difficult to use.

Here is the kube-burner config. It fills the cluster with identical pods and churns a portion of them. You can also run it multiple times.

cat << EOF > kubelet-density.yaml
# Config for a 10-node cluster (250 pods); proportionally adjust "jobIterations" or "replicas" for other cluster sizes
---
jobs:
  - name: churning
    preLoadImages: false
    jobIterations: 50
    namespacedIterations: true
    namespace: churning
    waitWhenFinished: true
    podWait: false

    churn: true
    churnPercent: 50
    churnDuration: 1m

    objects:
      - objectTemplate: deployment.yaml
        replicas: 20
        inputVars:
          containerImage: registry.k8s.io/pause:3.1
EOF

cat <<EOF > deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kubelet-density-deployment-v3-{{.Iteration}}-{{.Replica}}
spec:
  replicas: 3
  selector:
    matchLabels:
      app: kubelet-density-{{.Iteration}}-{{.Replica}}
  template:
    metadata:
      labels:
        app: kubelet-density-{{.Iteration}}-{{.Replica}}
        label1: label-{{.Iteration}}-{{.Replica}}
        label2: label-{{.Iteration}}-{{.Replica}}
        label3: label-{{.Iteration}}-{{.Replica}}
        label4: label-{{.Iteration}}-{{.Replica}}
        label5: label-{{.Iteration}}-{{.Replica}}
        label6: label-{{.Iteration}}-{{.Replica}}
        label7: label-{{.Iteration}}-{{.Replica}}
        label8: label-{{.Iteration}}-{{.Replica}}
        label9: label-{{.Iteration}}-{{.Replica}}
        label10: label-{{.Iteration}}-{{.Replica}}
        label11: label-{{.Iteration}}-{{.Replica}}
        label12: label-{{.Iteration}}-{{.Replica}}
        label13: label-{{.Iteration}}-{{.Replica}}
        label14: label-{{.Iteration}}-{{.Replica}}
        label15: label-{{.Iteration}}-{{.Replica}}
        label16: label-{{.Iteration}}-{{.Replica}}
        label17: label-{{.Iteration}}-{{.Replica}}
        label18: label-{{.Iteration}}-{{.Replica}}
        label19: label-{{.Iteration}}-{{.Replica}}
        label20: label-{{.Iteration}}-{{.Replica}}
        label21: label-{{.Iteration}}-{{.Replica}}
        label22: label-{{.Iteration}}-{{.Replica}}
        label23: label-{{.Iteration}}-{{.Replica}}
        label24: label-{{.Iteration}}-{{.Replica}}
        label25: label-{{.Iteration}}-{{.Replica}}
        label26: label-{{.Iteration}}-{{.Replica}}
        label27: label-{{.Iteration}}-{{.Replica}}
        label28: label-{{.Iteration}}-{{.Replica}}
        label29: label-{{.Iteration}}-{{.Replica}}
        label30: label-{{.Iteration}}-{{.Replica}}
    spec:
      containers:
      - name: kubelet-density-1
        image: {{.containerImage}}
        ports:
        - containerPort: 8080
          protocol: TCP
        imagePullPolicy: IfNotPresent
        securityContext:
          privileged: false
        resources:
          requests:
            cpu: "1m"
            memory: "1Ki"
      - name: kubelet-density-2
        image: {{.containerImage}}
        ports:
          - containerPort: 8080
            protocol: TCP
        imagePullPolicy: IfNotPresent
        securityContext:
          privileged: false
        resources:
          requests:
            cpu: "1m"
            memory: "1Ki"

EOF
kube-burner init -c kubelet-density.yaml

Analyse the load in Prometheus:

kubectl port-forward -n kube-system deployment/cost-analysis-agent 9092 9094
open http://localhost:9092/

Here are some useful Prometheus queries:

# Memory usage
container_memory_working_set_bytes{pod=~"cost-analysis-agent-.*"}

# Current container count
count(container_cpu_usage_seconds_total)

# Total container count over the last day
count(present_over_time(container_cpu_usage_seconds_total[1d]))

# Max concurrent container count over the last day
max_over_time(count(container_cpu_usage_seconds_total)[1d:10m])
# CPU Usage
rate(container_cpu_usage_seconds_total{pod=~"cost-analysis-agent-.*"}[5m])

Generate some load (note: OpenCost is queried indirectly):

set -e

# Current time in the RFC 3339 format the API expects
current_time=$(date -u +"%Y-%m-%dT%H:%M:%SZ")

# Get the time 24 hours ago in the same format (BSD/macOS date syntax; with GNU date use: date -u -d '24 hours ago' +"%Y-%m-%dT%H:%M:%SZ")
time_24_hours_ago=$(date -u -v-24H +"%Y-%m-%dT%H:%M:%SZ")

# Use these times in the curl command, then capture heap profiles and the TSDB status report
curl "http://localhost:9094/resources/v1?from=$time_24_hours_ago&to=$current_time" > /dev/null
curl "http://localhost:9003/debug/pprof/heap" > heap-opencost.pprof
curl "http://localhost:9092/debug/pprof/heap" > heap-prometheus.pprof
curl "http://localhost:9090/api/v1/status/tsdb?limit=100" > tsdb-status.json

@r2k1 (Contributor, Author) commented Mar 15, 2024

If you want to test a change and iterate on it, here is a hackish solution I used:

https://github.com/opencost/opencost/compare/develop..memtest

I usually just run the memory profiler for a single test in GoLand:
[screenshot of the GoLand memory profiler]
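
For anyone without GoLand, a rough equivalent is to write a heap profile from inside the test itself and open it with go tool pprof. This is only an illustrative sketch; the package name, test name, and the exercised code path are placeholders:

// heap_profile_test.go (placeholder package and test names)
package memtest

import (
    "os"
    "runtime"
    "runtime/pprof"
    "testing"
)

func TestHeapSnapshot(t *testing.T) {
    // ... exercise the code path under investigation here ...

    f, err := os.Create("heap-test.pprof")
    if err != nil {
        t.Fatal(err)
    }
    defer f.Close()

    runtime.GC() // collect garbage first so the in-use numbers are not inflated
    if err := pprof.WriteHeapProfile(f); err != nil {
        t.Fatal(err)
    }
}

Inspect the result with "go tool pprof heap-test.pprof"; running "go test -run TestHeapSnapshot -memprofile mem.out" gives a similar profile without any code changes.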

@AjayTripathy (Contributor)

Thank you for the detailed post!

cc @jessegoodier and @thomasvn -- perhaps this can form the basis for automated scale testing?

@thomasvn (Contributor)

Yep, agree. Thanks @r2k1 for the detailed writeup here! Very helpful.

You mention that resource usage grows linearly with the number of nodes/pods. When you ran your experiment, did you also notice whether resource usage grew linearly with an increased number of labels per pod, or did it grow faster than linearly?

@r2k1 (Contributor, Author) commented Mar 19, 2024

Sorry, I didn't properly measure how the label count affects it.

@thomasvn (Contributor)

No problem!
