Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set GOMAXPROCS and GOMEMLIMIT environment variables #6977

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jnoordsij
Copy link

@jnoordsij jnoordsij commented Apr 30, 2024

Pull Request Motivation

Set GOMAXPROCS and GOMEMLIMIT environment variables based on container resources.

Inspired by traefik/traefik-helm-chart#1029, this should reduce potential CPU throttling and OOMKills on containers.

Kind

/kind feature

Release Note

Autoset `GOMAXPROCS` and `GOMEMLIMIT` environment variables for cert-manager pods based on requested CPU and memory values

…r resources

Signed-off-by: Jesper Noordsij <jesper@sslleiden.nl>
@cert-manager-prow cert-manager-prow bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. area/deploy Indicates a PR modifies deployment configuration labels Apr 30, 2024
@cert-manager-prow
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign joshvanl for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cert-manager-prow cert-manager-prow bot added needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 30, 2024
@cert-manager-prow
Copy link
Contributor

Hi @jnoordsij. Thanks for your PR.

I'm waiting for a cert-manager member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cert-manager-prow cert-manager-prow bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 30, 2024
@inteon
Copy link
Member

inteon commented May 1, 2024

@wallrj FYI

@wallrj wallrj self-requested a review May 9, 2024 05:55
Copy link
Member

@SgtCoDFish SgtCoDFish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, appreciate you getting involved with the project!

I don't think we can merge this as is, and I think to merge something like this we'd need to discuss further about how to do this safely and in a way that won't lead to confusing outcomes. I'd encourage you to attend one of our meetings if you want to discuss this!

Comment on lines +166 to +177
{{- if (.Values.resources.limits).cpu }}
- name: GOMAXPROCS
valueFrom:
resourceFieldRef:
resource: limits.cpu
{{- end }}
{{- if (.Values.resources.limits).memory }}
- name: GOMEMLIMIT
valueFrom:
resourceFieldRef:
resource: limits.memory
{{- end }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: A valid value for limits.cpu would be 100m. That value wouldn't be valid for GOMAXPROCS as far as I can tell, and I think it'd be ignored.

Similarly, a valid amount of memory in limits.memory would be 1e6 or 120M, and both of those would be invalid entries for GOMEMLIMIT (this says that supported suffixes are "B, KiB, MiB, GiB, and TiB").

I don't think we can generally apply the values of resource limits like this. This won't have the expected effect for a lot of totally valid resource limits.

It's maybe possible to construct some Helm function to convert resource limits to GOMAXPROCS / GOMEMLIMIT, but I think that'd be hard to do and a pain to debug. My instinct is that it's probably not worth the effort to add this - what do you think?

Copy link
Member

@inteon inteon May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might actually work: https://billglover.me/2022/09/14/use-the-kubernetes-downwards-api-to-set-gomemlimit/
That article seems to suggest that the downwards API actually returns the computed int value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then this PR needs confirmation, tests and documentation of that 😁 That article might be worth adding to the PR description but an article isn't really enough to get this change over the line.

Specifically, I'd like to see examples of what GOMAXPROCS is set to if my CPU limits are 0.01, 1m or 2.5 as initial test cases. I'd also like to see what GOMEMLIMIT is set to if I specify 1e6, or 1KB as memory limits.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The resourceFieldRef is a very specific Kubernetes directive that is created specifically for passing resource-related values, which rounds up the CPU value to the nearest whole number (e.g. 250m to 1) and passes the memory as a numeric value; so 64Mi would result in the environment variable being set to 67108864. This by design makes it completely compatible with Go's API.

An example is documented within Kubernetes documentation itself: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/#use-container-fields-as-values-for-environment-variables.

Would referencing to Kubernetes documentation suffice here, given that this just basically ensures the correct behavior by design? And if so, what would be a suitable please to add this reference?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the link!

Honestly, having the docs linked in this PR is probably enough. I don't think we need to complicate the helm chart with a link, and anyone that cares will be able to git blame and find this PR easily enough.

I'll enable testing for this PR 👌

@SgtCoDFish
Copy link
Member

/ok-to-test

@cert-manager-prow cert-manager-prow bot added ok-to-test and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 14, 2024
@SgtCoDFish SgtCoDFish added kind/feature Categorizes issue or PR as related to a new feature. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels May 14, 2024
@SgtCoDFish SgtCoDFish added this to the 1.15 milestone May 14, 2024
@inteon inteon modified the milestones: 1.15, 1.16 May 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/deploy Indicates a PR modifies deployment configuration dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. kind/feature Categorizes issue or PR as related to a new feature. ok-to-test release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants