Create whitelist for all labels not used for controller or service selection to improve query performance and memory usage. #2641

Open
AjayTripathy opened this issue Mar 15, 2024 · 5 comments
Labels
E3 Estimated level of Effort (1 is easiest, 4 is hardest) kubecost Relevant to Kubecost's downstream project needs-follow-up opencost OpenCost issues vs. external/downstream P2 Estimated Priority (P0 is highest, P4 is lowest)

Comments

@AjayTripathy
Contributor

AjayTripathy commented Mar 15, 2024

Is your feature request related to a problem? Please describe.
See #2637. Labels can account for a considerable portion of the memory footprint of OpenCost and of the backing Prometheus.

Describe the solution you'd like

  • We can know a priori, via the Kubernetes API, which labels matter for selection into a controller/service by looking up the selector labels, and automatically emit those.
  • The end user can add other meaningful labels to a whitelist.
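
A minimal sketch of the idea, not OpenCost's actual implementation: it assumes the selectors have already been fetched from the Kubernetes API and are available as plain dicts (for example a Deployment's `spec.selector.matchLabels` or a Service's `spec.selector`). The function names, the sample labels, and the `user_whitelist` variable are all hypothetical, and `matchExpressions` selectors are not handled here.

```python
from typing import Dict, Iterable, Set


def selector_label_keys(selectors: Iterable[Dict[str, str]]) -> Set[str]:
    """Collect every label key referenced by a controller or service selector."""
    keys: Set[str] = set()
    for selector in selectors:
        keys.update(selector.keys())
    return keys


def filter_labels(pod_labels: Dict[str, str], keep: Set[str]) -> Dict[str, str]:
    """Return only the labels whose key is in the keep set."""
    return {k: v for k, v in pod_labels.items() if k in keep}


# Hypothetical selectors, assumed to come from the Kubernetes API.
selectors = [
    {"app": "checkout"},                     # e.g. a Deployment selector
    {"app": "checkout", "tier": "backend"},  # e.g. a Service selector
]

# Hypothetical user-supplied whitelist of extra labels to keep.
user_whitelist = {"team"}

keep = selector_label_keys(selectors) | user_whitelist

pod_labels = {
    "app": "checkout",
    "tier": "backend",
    "team": "payments",
    "git-sha": "abc123",               # not used for selection, not whitelisted
    "helm.sh/chart": "checkout-1.2.3", # not used for selection, not whitelisted
}

emitted = filter_labels(pod_labels, keep)
print(emitted)
```

Only `app`, `tier`, and `team` would be emitted for this pod; the other two labels would never reach Prometheus.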

Estimated impact: I'm guessing this would reduce the size of the data stored in Prometheus by 30% or so, which would likely mean a similar improvement of about 30% in query time and query memory usage.

Describe alternatives you've considered
There may be other improvements to what we store in memory from the k8s API, but that work won't impact Prometheus memory.

@r2k1
Contributor

r2k1 commented Mar 15, 2024

Alternatively, I think a new metric (or multiple metrics) exposing the relationship between pod and controller, and between pod and service, could be added.
It would further reduce the memory footprint and may simplify the aggregation logic.
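
To illustrate what such a relationship metric might look like, here is a rough sketch that renders one sample line in the Prometheus text exposition format. The metric name `pod_controller_info` and its label names are hypothetical, not an existing OpenCost metric; the constant value `1` follows the common "info metric" pattern, where the labels carry the information and the sample can be joined against other series.

```python
from typing import Dict


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline for the text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def relationship_sample(metric: str, labels: Dict[str, str]) -> str:
    """Render one info-style sample line with a constant value of 1."""
    rendered = ",".join(
        f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
    )
    return f"{metric}{{{rendered}}} 1"


# Hypothetical pod/controller relationship.
line = relationship_sample(
    "pod_controller_info",
    {
        "namespace": "shop",
        "pod": "checkout-abc",
        "controller_kind": "deployment",
        "controller": "checkout",
    },
)
print(line)
```

With a series like this, the pod-to-controller mapping lives in one small metric, so the pod's own labels would not need to be exported just to reconstruct the grouping.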

@AjayTripathy
Contributor Author

It's been a while since I've been in that code, but keeping a subset of labels feels simpler than creating new metrics, and the savings should be about the same order of magnitude. Plus, a whitelist would allow other labels that are important to the user to still be kept.

@r2k1
Contributor

r2k1 commented Mar 19, 2024

Labels serve two main purposes:

  • They group pods under controllers.
  • They help allocate costs based on labels.

A solution for one may not be ideal for the other.

I can't dictate which labels users choose for their deployments or services, so there is a vast variety of labels. Each deployment might use a unique set of them.

Exporting varying labels for different pods could lead to inaccuracies when such labels are used for aggregation.
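
A small worked example of that risk, using made-up pods and costs: two pods both carry `team=payments`, but only one of them has `team` in its controller's selector, so only that pod would export the label. The `__unallocated__` bucket name and the helper function are placeholders for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pod = Tuple[Dict[str, str], float]  # (labels, cost)


def aggregate_by(label: str, pods: List[Pod]) -> Dict[str, float]:
    """Sum cost per value of the given label; pods without it go to a fallback bucket."""
    totals: Dict[str, float] = defaultdict(float)
    for labels, cost in pods:
        totals[labels.get(label, "__unallocated__")] += cost
    return dict(totals)


# Hypothetical pods with their full label sets and costs.
full_pods: List[Pod] = [
    ({"app": "checkout", "team": "payments"}, 10.0),
    ({"app": "search", "team": "payments"}, 6.0),
]

# Hypothetical selector keys per pod: only the first controller selects on "team".
selector_keys = [{"app", "team"}, {"app"}]

# Labels actually exported if each pod emits only its own selector labels.
emitted_pods: List[Pod] = [
    ({k: v for k, v in labels.items() if k in keys}, cost)
    for (labels, cost), keys in zip(full_pods, selector_keys)
]

true_totals = aggregate_by("team", full_pods)
emitted_totals = aggregate_by("team", emitted_pods)
print(true_totals)
print(emitted_totals)
```

Aggregating on the full labels attributes 16.0 to `payments`, while aggregating on the emitted labels attributes only 10.0 and leaves 6.0 unallocated, even though nothing about the workloads changed.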

@r2k1
Contributor

r2k1 commented Mar 19, 2024

Overall, this solution may fix my issue, but we need to be careful about allowing users to use all labels for aggregation purposes.

@mattray mattray added opencost OpenCost issues vs. external/downstream P2 Estimated Priority (P0 is highest, P4 is lowest) kubecost Relevant to Kubecost's downstream project E3 Estimated level of Effort (1 is easiest, 4 is hardest) and removed needs-triage needs-follow-up labels Apr 1, 2024
@AjayTripathy
Contributor Author

#2719 takes a quick pass at this.

3 participants