Checkpoint/Forensic container checkpointing Feature Enhancement request #114591
Comments
/sig node
Since [1], CRIU would fail if it finds file locks being used by the application that is being checkpointed and the --file-locks option has not been specified. This pull request enables checkpointing of file locks by default.
Fixes: checkpoint-restore/criu#2018
Fixes: kubernetes/kubernetes#114591
[1] checkpoint-restore/criu#1357
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
@jguionnet thanks for opening this. I would also like to see this feature move forward. One of the main reasons for not yet having extended checkpointing to

We have been discussing whether encrypted images are a possible way to solve the problem of leaking secrets, but we are not sure at what level to encrypt. We think it could be implemented at the CRIU level, but also at higher levels. One idea is to take the steps from https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/ to convert the local checkpoint archive to an OCI image using

About passing checkpoint parameters from Kubernetes to CRIU: this probably means extending the CRI, either with a generic string array to pass whatever is necessary to CRIU, or with a CRI entry for each option. Not difficult. The last time we tried to add checkpoint support to the CRI, it took a really long time. I am not sure whether extending the existing functionality is easier.

Moving to beta: I cannot really say how easy that is. I think we should have some way of automatically removing checkpoint archives once there are more than a certain number, so that the local disk does not fill up with checkpoint archives.

Extending the existing test cases to actually create a checkpoint, now that CRI-O has the necessary support, would also be a good thing to do. Until now we have only been testing the checkpoint code with the expectation that CRI implementations do not implement it. Now that CRI-O implements it, the test cases could be extended.

I think most of the things you are looking for are not difficult to implement, but there are still a few conceptual points that need to be discussed.
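To make the archive-removal idea concrete, here is a minimal sketch of a "keep only the newest N archives" policy. It is not how the kubelet does or would do this. The `prune_checkpoints` helper and the `checkpoint-*.tar` file pattern are assumptions for illustration only, and the caller supplies the directory.

```python
from pathlib import Path

# Assumption: all checkpoint archives sit in one directory and match this pattern.
CHECKPOINT_GLOB = "checkpoint-*.tar"


def prune_checkpoints(directory, keep):
    """Delete the oldest archives so that at most `keep` remain.

    Returns the list of paths that were removed.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    # Sort newest first by modification time, so everything past index
    # `keep` is older than the archives we want to retain.
    archives = sorted(
        Path(directory).glob(CHECKPOINT_GLOB),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    stale = archives[keep:]
    for path in stale:
        path.unlink()
    return stale
```

A real implementation would probably want a per-container or per-pod limit rather than one global count, which is one of the conceptual points still open.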
Maybe triage the rootfs for the "passwd file" and clean from there?
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
What would you like to be added?
This feature is excellent. Could the following improvements be considered?
- kubectl support, so we can more easily automate a solution
- We needed to pass the file-locks option for the feature to work for our app. See more details: How to pass the --file-locks option when checkpointing with the kubelet checkpoint-restore/criu#2018
Why is this needed?
The use cases we are looking at are the following: we have monolithic applications deployed on K8s. They start too slowly for reactive scaling or for scale to zero (using KEDA, for example). If we could checkpoint them, we could offer these options. To support these use cases, we need the above enhancements.
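For context on what kubectl support would wrap: today a checkpoint is requested by sending a POST to the kubelet's `/checkpoint/{namespace}/{pod}/{container}` endpoint, as described in the forensic container checkpointing blog post. The sketch below only builds that request. The `checkpoint_request` helper name and its defaults (`localhost`, port 10250) are assumptions for illustration, and the client-certificate setup the kubelet requires is left out.

```python
import urllib.request
from urllib.parse import quote


def checkpoint_request(namespace, pod, container, host="localhost", port=10250):
    """Build (but do not send) a POST request to the kubelet checkpoint endpoint."""
    # Percent-encode each path segment so unusual names cannot alter the path.
    path = "/".join(quote(part, safe="") for part in (namespace, pod, container))
    url = f"https://{host}:{port}/checkpoint/{path}"
    return urllib.request.Request(url, method="POST")


# Example with hypothetical pod and container names:
req = checkpoint_request("default", "counters", "counter")
print(req.get_method(), req.full_url)
```

Sending the request would additionally need an SSL context carrying credentials the kubelet accepts. Note that this endpoint currently has no way to pass CRIU options such as --file-locks, which is the gap described above.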