ImageMaximumGCAge documentation recommendations #124441

Open
Joseph-Goergen opened this issue Apr 22, 2024 · 8 comments
Labels
priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
sig/node: Categorizes an issue or PR as relevant to SIG Node.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@Joseph-Goergen
Contributor

related: https://kubernetes.slack.com/archives/C0BP8PW9G/p1713553524168989

The beta feature ImageMaximumGCAge, which is enabled by default, should have recommended ranges for people to configure. Out of the box it is set to 0s (which means it's disabled). This is a documentation request to provide a general rule of thumb for the community to aim for when configuring garbage collection of images.

/sig-node

@k8s-ci-robot added the labels needs-sig (Indicates an issue or PR lacks a `sig/foo` label and requires one.) and needs-triage (Indicates an issue or PR lacks a `triage/foo` label and requires one.) Apr 22, 2024
@Joseph-Goergen
Contributor Author

/sig-node

@Joseph-Goergen
Contributor Author

/sig node

@k8s-ci-robot added the label sig/node (Categorizes an issue or PR as relevant to SIG Node.) and removed the label needs-sig (Indicates an issue or PR lacks a `sig/foo` label and requires one.) Apr 22, 2024
@kannon92
Contributor

/cc @haircommander @sohankunkerkar

/triage accepted
/priority important-longterm

@k8s-ci-robot added the labels triage/accepted (Indicates an issue or PR is ready to be actively worked on.) and priority/important-longterm (Important over the long term, but may not be staffed and/or may need multiple releases to complete.) and removed the label needs-triage (Indicates an issue or PR lacks a `triage/foo` label and requires one.) Apr 22, 2024
@kannon92
Contributor

Beta feedback for kubernetes/enhancements#4210

@haircommander
Contributor

When conceiving of the feature, I had the idea of something intermittent but not too infrequent: something like 2 weeks. We've yet to make official recommendations, as I think we need more testing time to find a frequency that works well.
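
For reference, a two-week threshold has to be written in hours, since the kubelet configuration takes Go-style duration strings and those have no day or week unit. Below is a minimal Python sketch that renders such a fragment. The helper names `go_duration` and `kubelet_gc_fragment` are hypothetical, and the two-week value is only the idea floated above, not an official recommendation.

```python
from datetime import timedelta


def go_duration(td: timedelta) -> str:
    """Render a timedelta as a Go-style duration string in whole hours."""
    hours, remainder = divmod(int(td.total_seconds()), 3600)
    if remainder:
        raise ValueError("this sketch only handles whole hours")
    return f"{hours}h"


def kubelet_gc_fragment(max_age: timedelta) -> str:
    """Build a KubeletConfiguration fragment (hypothetical helper)."""
    return "\n".join([
        "apiVersion: kubelet.config.k8s.io/v1beta1",
        "kind: KubeletConfiguration",
        # Assumption: the ImageMaximumGCAge feature gate is left at its beta default (on).
        f'imageMaximumGCAge: "{go_duration(max_age)}"',
    ])


# Two weeks, the value floated in this thread, becomes "336h".
print(kubelet_gc_fragment(timedelta(weeks=2)))
```

Leaving the field at `0s` keeps age-based cleanup disabled, which is the current default described in the issue.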

@SergeyKanzhelev
Member

@haircommander how big of a disk usage spike will happen on cleanup?

Is 2 weeks coming from the idea that the autoscaler may decide to use the node again? If there is no autoscaler, are there any issues with 1 hour?

@SergeyKanzhelev
Member

Btw, one idea for improving this logic is to add a grace period that waits if a download is happening at that moment. So if there is an image pull ongoing when GC wants to clean up those images, we can wait a bit to minimize disk churn.

@haircommander
Contributor

I think 1 hour is pretty low TBH

> autoscaler may decide to use the node again

which autoscaler?

> So if there is an image pull ongoing when GC wants to clean up those images, we can wait a bit to minimize disk churn

No matter the grace period we give, we'll theoretically hit this issue. Even if no container is using an image, a container that needs an image unused for 1 hour could be created immediately after the image is removed.

My thought with the relatively high value is to increase the likelihood that an image is not just unused, but has been replaced by a newer version of the same image. The former case (unused for a bit, but will be used later) is an unfortunate side effect, and we'd use the new metric to track when an image was GC'd for reason age.
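
To make the trade-off concrete, here is a small Python sketch of an age-based eligibility check. It is an illustration under stated assumptions, not the kubelet's implementation: `ImageRecord` and `images_to_gc_for_age` are hypothetical names, and the only rule modeled is "an unused image becomes eligible once it has been idle longer than the maximum age, and a zero age disables the check".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ImageRecord:
    name: str
    last_used: datetime   # last time any container used the image
    in_use: bool = False  # True while a container references it


def images_to_gc_for_age(images, max_age: timedelta, now: datetime):
    """Return names of unused images that have been idle longer than max_age.

    A max_age of zero disables age-based collection, mirroring the 0s default.
    """
    if max_age <= timedelta(0):
        return []
    return [
        img.name
        for img in images
        if not img.in_use and now - img.last_used > max_age
    ]


now = datetime(2024, 4, 22)
images = [
    ImageRecord("app:v1", last_used=now - timedelta(days=20)),    # superseded by app:v2
    ImageRecord("app:v2", last_used=now, in_use=True),
    ImageRecord("batch:v1", last_used=now - timedelta(hours=3)),  # idle, but may run again
]

print(images_to_gc_for_age(images, timedelta(weeks=2), now))  # ['app:v1']
print(images_to_gc_for_age(images, timedelta(hours=1), now))  # ['app:v1', 'batch:v1']
```

With the two-week value only the superseded image is removed, while the one-hour value also removes an image that is merely idle and would have to be pulled again.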

5 participants