Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT] Confidential Containers - Skip pullimage for runtimes that are handling it #8008

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

littlejawa
Copy link
Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it:

With #7471 we introduced a way to support Confidential Container's feature of pulling the image in the guest VM.
When this happens, cri-o is still pulling the image on the host, because kubernetes keeps sending the "PullImageRequest", and cri-o has no reason to not process it.

This has two drawbacks:

  1. it is a waste of time and resources, as the image that crio will pull and prepare will not be used anyway
  2. it can block the actual execution of the container in the Confidential Container use case: if the image is encrypted, and the key is not shared with crio, then crio will fail to prepare the image because it can't read it.
    This failure will block the container creation, as kubernetes will not proceed with CreateContainer if PullImage failed.

This PR is meant to make crio skip the pull image phase when the runtime is configured with the "runtime_pull_image" flag introduced in #7471

Which issue(s) this PR fixes:

None

Special notes for your reviewer:

I have modified the code in crio to skip the pull image processing when the runtime is configured with the "runtime_pull_image" flag.
Crio then just reports a success status to kubernetes, which is happy with that, even when subsequent ImageList or ImageStatus report the image is missing.
This is the first commit of this PR (5c8d55f).

Now this is not enough: further on, crio still need to access the metadata of the image, and this is failing because it was not pulled.
This is happening in various places of the createSandboxContainer() function.
I've tried to fix that in various ways, but each failure that I fix leads to another one down the road.

An idea that I had was to let crio use the pause image as a fake rootfs to work on, letting the runtime deal with the actual image. This is what I have in the second commit of this PR. (df446ee)
But this is still causing issues as then the container tries to run "/pause", as instructed by the pause image config that crio read and sends to the runtime.

At this point, I realize that I miss a higher level standpoint to understand how image pull/prepare is done, what crio actually need, and what we can skip.

My understanding of container encryption is that only the layers are encrypted, not the config or manifest.
Is there a way for crio to get the config/manifest it needs from the repository, without getting the layers?
Would that be enough to go through this function and proceed with the container creation?

Any help is welcome.

Does this PR introduce a user-facing change?

None

…d to handle it

The RuntimePullImage flag tells crio that the runtime will handle the image by
itself. crio should not try to pull the image in that case.

This change is not only an optimization (avoiding the pull for an image that
won't be used).
It also prevents a failure if the image is encrypted, like can be the case in
the confidential container use case, and crio doesn't have the key to read it.
In that situation, pullimage would fail, and the process would stop before
the runtime has a chance to do the job.

Signed-off-by: Julien Ropé <jrope@redhat.com>
…ing up the name/ID from the store

As the image will be pulled by the runtime, it may not be accessible in the store.
We must avoid failing in this case.

Signed-off-by: Julien Ropé <jrope@redhat.com>
@openshift-ci openshift-ci bot added release-note-none Denotes a PR that doesn't merit a release note. kind/feature Categorizes issue or PR as related to a new feature. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Apr 12, 2024
@openshift-ci openshift-ci bot requested review from klihub and wgahnagl April 12, 2024 09:02
Copy link
Contributor

openshift-ci bot commented Apr 12, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: littlejawa
Once this PR has been reviewed and has the lgtm label, please assign saschagrunert for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Contributor

openshift-ci bot commented Apr 12, 2024

@littlejawa: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/ci-fedora-critest df446ee link true /test ci-fedora-critest
ci/prow/ci-rhel-critest df446ee link true /test ci-rhel-critest
ci/prow/ci-e2e-evented-pleg df446ee link true /test ci-e2e-evented-pleg

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Copy link

A friendly reminder that this PR had no activity for 30 days.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. release-note-none Denotes a PR that doesn't merit a release note.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant