
Add two feature gates: Image based deployment and additional runtimeClasses #394

Closed
wants to merge 16 commits into from

Conversation

bpradipt
Contributor

@bpradipt bpradipt commented Apr 6, 2024

This is a feature enhancement adding two feature gates that typically go together.

  • Image based deployment and additional runtimeClasses
    An example ConfigMap defining the feature gates (both disabled here):
apiVersion: v1
kind: ConfigMap
metadata:
  name: osc-feature-gates
  namespace: openshift-sandboxed-containers-operator
data:
  LayeredImageDeployment: "false"
  AdditionalRuntimeClasses: "false"

Sample ConfigMaps with the configuration for the individual features:

apiVersion: v1
kind: ConfigMap
metadata:
  name: layeredimagedeployment-config
  namespace: openshift-sandboxed-containers-operator
data:
  osImageURL: "quay.io/bpradipt/coco:ocp415"
  kernelArguments: "kvm_intel.tdx=1"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: additionalruntimeclasses-config
  namespace: openshift-sandboxed-containers-operator
data:
  runtimeClassConfig: "kata-cc-tdx, kata-cc-sim"
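As a rough illustration of how the operator might consume the osc-feature-gates ConfigMap above, here is a minimal sketch (not the PR's actual code; the gate names are taken from the examples, everything else is assumed) that resolves gate state from the ConfigMap's data section, falling back to defaults for missing or unparsable values:

```go
package main

import (
	"fmt"
	"strconv"
)

// Default state for each gate, mirroring the example ConfigMap above.
var defaultFeatureGates = map[string]bool{
	"LayeredImageDeployment":   false,
	"AdditionalRuntimeClasses": false,
}

// resolveFeatureGates merges the ConfigMap data over the defaults.
// Unknown keys are ignored; unparsable values keep the default.
func resolveFeatureGates(cmData map[string]string) map[string]bool {
	status := map[string]bool{}
	for gate, def := range defaultFeatureGates {
		status[gate] = def
		if raw, ok := cmData[gate]; ok {
			if v, err := strconv.ParseBool(raw); err == nil {
				status[gate] = v
			}
		}
	}
	return status
}

func main() {
	// Data as it would appear in the osc-feature-gates ConfigMap.
	cmData := map[string]string{"LayeredImageDeployment": "true"}
	fmt.Println(resolveFeatureGates(cmData))
}
```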

This code can be tested on an OCP cluster by following the instructions from this repo:
https://github.com/bpradipt/coco-install/tree/main/osc-image-deploy
SNO should be fine as well for testing.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 6, 2024

openshift-ci bot commented Apr 6, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@bpradipt bpradipt force-pushed the image-deployment branch 6 times, most recently from 6469646 to b86118a Compare April 9, 2024 16:01
@bpradipt bpradipt removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 9, 2024
@bpradipt bpradipt marked this pull request as ready for review April 9, 2024 16:02
@openshift-ci openshift-ci bot requested review from cpmeadors and jensfr April 9, 2024 16:04
@bpradipt bpradipt requested review from snir911 and pmores April 9, 2024 16:08
@bpradipt
Contributor Author

@spotlesstofu ptal

@spotlesstofu
Contributor

What's the case for exposing the RuntimeClasses definition to the user? I understand that's the current custom in the CoCo operator, but I couldn't find out why.

@bpradipt
Contributor Author

What's the case for exposing the RuntimeClasses definition to the user? I understand that's the current custom in the CoCo operator, but I couldn't find out why.

Good point @spotlesstofu. When using image based deployment, additional runtime classes may need to be created depending on the configs included in the image.
Likewise, when enabling CoCo, additional runtime classes will be required. One idea was to include the runtimeClass as a feature-specific config param, e.g.:

imageBasedDeployment.params: |
    osImageURL=quay.io/....
    kernelArguments=a=b c=d ...
    runtimeClassConfig=name1:cpuOverHead1:memOverHead1, name2:cpuOverHead2:memOverHead2

Likewise, any feature that might need additional runtime classes could have its own feature.params.

However, I was unsure whether to use the above approach or to make the runtimeClass config a separate feature in itself.

What is your suggestion?
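For concreteness, a hypothetical sketch of parsing the proposed runtimeClassConfig value (the entry layout "name:cpuOverhead:memOverhead" comes from the comment above; the type and function names are assumptions, not the PR's final code):

```go
package main

import (
	"fmt"
	"strings"
)

// runtimeClassEntry holds one parsed entry. Overheads are optional:
// an entry is either "name" or "name:cpuOverhead:memOverhead".
type runtimeClassEntry struct {
	Name        string
	CPUOverhead string // e.g. "250m"; empty if not specified
	MemOverhead string // e.g. "160Mi"; empty if not specified
}

// parseRuntimeClassConfig splits a comma-separated config string into
// entries, tolerating surrounding whitespace and empty items.
func parseRuntimeClassConfig(cfg string) []runtimeClassEntry {
	var entries []runtimeClassEntry
	for _, item := range strings.Split(cfg, ",") {
		item = strings.TrimSpace(item)
		if item == "" {
			continue
		}
		parts := strings.Split(item, ":")
		e := runtimeClassEntry{Name: parts[0]}
		if len(parts) == 3 {
			e.CPUOverhead, e.MemOverhead = parts[1], parts[2]
		}
		entries = append(entries, e)
	}
	return entries
}

func main() {
	fmt.Println(parseRuntimeClassConfig("kata-cc-tdx:250m:160Mi, kata-cc-sim"))
}
```

The same parser would work whether the string lives in a feature-specific params block or in a standalone ConfigMap.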

@spotlesstofu
Contributor

So I understand that the node configuration (image, kernel, etc.) is tied to the RuntimeClass. Then, what we should expose to the user is a CustomResource (CR) to represent this. A KataRuntimeClass, to instantiate multiple times for each runtime that's needed.

This CR would include the node configuration that's desired for that runtime. The operator should take care of composing the multiple KataRuntimeClass and produce a working configuration. Also the default runtime configuration would be moved to this CR at some point.

Does this make sense, would it address our requirements?

@bpradipt
Contributor Author

So I understand that the node configuration (image, kernel, etc.) is tied to the RuntimeClass. Then, what we should expose to the user is a CustomResource (CR) to represent this. A KataRuntimeClass, to instantiate multiple times for each runtime that's needed.

Node configuration may or may not be tied to a runtime class. But if the image does provide configuration for additional runtime classes (CRI-O configs), then we need a way to expose these to the end user by creating RuntimeClass objects.
Using a CR seems too heavyweight for this, imho.

This CR would include the node configuration that's desired for that runtime. The operator should take care of composing the multiple KataRuntimeClass and produce a working configuration. Also the default runtime configuration would be moved to this CR at some point.

Does this make sense, would it address our requirements?

As with everything in K8s, using a CR is definitely one way to implement this. We might as well decide to include these in the KataConfig CR with additional attributes. The question is whether we want to use a CR or let it be part of the feature ConfigMap.

@bpradipt bpradipt force-pushed the image-deployment branch 3 times, most recently from 6e7a9e8 to 84d9f52 Compare April 17, 2024 10:43
internal/featuregates/featuregates.go
internal/featuregates/features.go
controllers/cm_event_handler.go
Additionally include the following changes:
1. Include a separate logger for the package
2. Switch to a status map to keep track of FeatureGate status.
This enables tracking multiple feature gates in the reconcile loop
and avoids calling the Kube API for individual feature gates

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
@bpradipt
Contributor Author

@beraldoleal @snir911 I have addressed all your comments. PTAL

OpenShift supports image layering, which allows deployment of packages
by providing a custom RHCOS image.
You need to create a MachineConfig with the custom RHCOS image URL and
that's about it.

This commit adds support for using a custom RHCOS image URL to deploy
Kata components as an alternative to using the extension mechanism.

This capability is behind a feature gate: you'll need to set
LayeredImageDeployment to true in the osc-feature-gates ConfigMap.

Additionally, you'll need to set config params in the
layeredimagedeployment-config ConfigMap as shown below:

apiVersion: v1
kind: ConfigMap
metadata:
  name: osc-feature-gates
  namespace: openshift-sandboxed-containers-operator
data:
  LayeredImageDeployment: "true"

---

apiVersion: v1
kind: ConfigMap
metadata:
  name: layeredimagedeployment-config
  namespace: openshift-sandboxed-containers-operator
data:
  osImageURL: "quay.io/...."
  kernelArguments: "a=b c=d ..."

Logs an error and reconciles if the LayeredImageDeployment
feature is enabled and the configmap is not provided

Fixes: #KATA-2902

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
With the previous predicate func code, the reconcile loop was
not receiving the KataConfig struct; instead it was called with
the OSC feature gate ConfigMap struct.
The new method ensures that for feature gate ConfigMap changes,
the reconcile loop is called with the KataConfig struct.

Fixes: #KATA-2902, #KATA-2903
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Rename the createRuntimeClass function to createRuntimeClasses.
This lays the foundation to support different runtime classes
for CoCo, like kata-cc-tdx, kata-cc-snp, kata-cc-remote, etc.

Fixes: #KATA-2903

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
This is enabled via AdditionalRuntimeClasses feature gate.
Example configuration is shown below:

apiVersion: v1
kind: ConfigMap
metadata:
  name: osc-feature-gates
  namespace: openshift-sandboxed-containers-operator
data:
  AdditionalRuntimeClasses: "true"

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: additionalruntimeclasses-config
  namespace: openshift-sandboxed-containers-operator
data:
  runtimeClassConfig: "name1:cpuOverHead1:memOverHead1, name2:cpuOverHead2:memOverHead2"
  #runtimeClassConfig: "name1, name2"

Also rearrange the code to have all the runtimeClass handling in a
single source file for easier readability.

This feature will typically be used together with other features,
for example the image based deployment feature.
The image based deployment might carry configuration for multiple
runtime classes, which is then exposed to users as RuntimeClass objects
created by the OSC operator when this feature gate is enabled.

Logs an error if the AdditionalRuntimeClasses feature is enabled
and the config is not provided

Fixes: #KATA-2903
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Add examples for LayeredImageDeployment and
AdditionalRuntimeClasses feature

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
The IsEnabled method is no longer needed, hence removing it

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
This aligns with its usage

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
The feature gate status ConfigMap is initialised by the operator
itself.

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Check for the feature gate status ConfigMap at the beginning of the
reconcile loop and don't proceed if it's missing.

This ensures we don't take any action if we cannot determine the
FeatureGateStatus, which could be due to an error fetching the
ConfigMap or the ConfigMap not being present.

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
We had different methods to read yaml files depending on the
kubernetes object. However, this is really not needed, as all the
methods were simply reading files, the only difference being the file
location.
This commit simplifies the code and uses a single readYamlFile method.
The caller should set the full path of the yaml file to be read.
Further, any additional requirement to read a different yaml file will
no longer need a new method.

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
This avoids confusion with the individual feature gate configmap
and also correctly reflects the intent of the variable

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Since the feature gate status configmap is mandatory and created during
operator initialisation, we need to ensure it is reverted to default
values on KataConfig deletion.

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>

openshift-ci bot commented Apr 29, 2024

@bpradipt: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/sandboxed-containers-operator-e2e 5f47dc5 link false /test sandboxed-containers-operator-e2e
ci/prow/check 5f47dc5 link false /test check

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@@ -66,14 +64,14 @@ type KataConfigOpenShiftReconciler struct {
kataConfig *kataconfigurationv1.KataConfig
FeatureGates *featuregates.FeatureGates
FeatureGatesStatus featuregates.FeatureGateStatus
deploymentState DeploymentState
Contributor

I'm quite afraid this breaks the controller's idempotency. The controller can be migrated to another worker at any time and, if I'm not mistaken, this will wipe the deployment state information. This is not a theoretical concern either: the controller normally runs on workers, so any worker reboot is an opportunity for the controller to be killed and restarted elsewhere. During kata deployment on all workers of a two-worker cluster I've seen, many times, the controller being migrated twice (i.e. the number of workers). This happens when the controller happens to be running on the first worker to get kata deployed: it's then migrated to the second worker, only to be migrated back to the first worker once the second worker has kata deployed on it.

Instead of storing the deployment state here, we should ideally figure out the deployment state by examining the cluster. In case that's impossible or impractical (I don't actually know if that's possible but I suspect it might be) the state should still be stored somewhere in the cluster, perhaps in KataConfig.status.
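The reviewer's suggestion could look roughly like the sketch below: derive the deployment method from what already exists in the cluster instead of caching it in the controller struct. Here the cluster is stood in for by a plain list of MachineConfig names, and both MC names are assumptions for illustration; the real code would list MachineConfigs via the API client.

```go
package main

import "fmt"

// Assumed MachineConfig names, for illustration only.
const (
	extensionMcName = "50-enable-sandboxed-containers-extension"
	imageMcName     = "50-kata-image-deployment"
)

// inferDeploymentMethod reconstructs the deployment state from the
// MachineConfigs present in the cluster, so a restarted controller
// needs no in-memory history.
func inferDeploymentMethod(mcNames []string) string {
	for _, name := range mcNames {
		switch name {
		case imageMcName:
			return "image"
		case extensionMcName:
			return "extension"
		}
	}
	return "none"
}

func main() {
	fmt.Println(inferDeploymentMethod([]string{"00-worker", imageMcName}))
}
```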

Contributor

the state should still be stored somewhere in the cluster, perhaps in KataConfig.status

Actually, no, status is most likely no good for this either since as per k8s principles the pertinent resource controller is expected to recover even if its resource's status subobject is lost.

@@ -19,7 +19,7 @@ type FeatureGates struct {
// FeatureGate Status Struct map of string and bool
type FeatureGateStatus map[string]bool

var DefaultFeatureGates = map[string]bool{
var DefaultFeatureGatesStatus = map[string]bool{
Contributor
@pmores pmores Apr 30, 2024

Cool, was just about to propose this change! :-)

Though this should really be squashed into a previous commit.

// Which could be due to an error in fetching the ConfigMap or the ConfigMap not being present
if r.FeatureGatesStatus == nil {
r.Log.Info("FeatureGateStatus is nil, cannot proceed. Check if the feature gate configmap is present")
return ctrl.Result{}, nil
Contributor

This was discussed elsewhere so here just briefly for public record: I believe that just continuing here as if a default-valued feature gate configmap was present would lead to both simpler implementation and better user experience.
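The behaviour suggested here could be sketched as follows (a hypothetical illustration, not the PR's code: when the feature gate status could not be fetched, proceed with default values instead of aborting the reconcile):

```go
package main

import "fmt"

// Default (disabled) state for each gate.
var defaultGates = map[string]bool{
	"LayeredImageDeployment":   false,
	"AdditionalRuntimeClasses": false,
}

// effectiveGates returns the fetched gate status if available, and
// otherwise behaves as if a default-valued ConfigMap were present,
// so the reconcile loop never has to bail out.
func effectiveGates(fetched map[string]bool) map[string]bool {
	if fetched == nil {
		out := map[string]bool{}
		for k, v := range defaultGates {
			out[k] = v
		}
		return out
	}
	return fetched
}

func main() {
	// ConfigMap missing or unreadable: all gates fall back to defaults.
	fmt.Println(effectiveGates(nil))
}
```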

yamlData, err := readJobYAML(jobFileName)
jobYamlFile := filepath.Join(peerpodsImageJobsPathLocation, jobFileName)

yamlData, err := readYamlFile(jobYamlFile)
Contributor

Perhaps I'm confused but I don't see how this is related to feature gating. (?)

LayeredImageDeploymentConfig = "layeredimagedeployment-config"
AdditionalRuntimeClasses = "AdditionalRuntimeClasses"
AdditionalRuntimeClassesConfig = "additionalruntimeclasses-config"
FeatureGatesStatusConfigMapName = "osc-feature-gates"
Contributor

Again, a good change but should be squashed with the commit where the original name was first introduced.

@@ -888,6 +888,13 @@ func (r *KataConfigOpenShiftReconciler) processKataConfigDeleteRequest() (ctrl.R
return ctrl.Result{Requeue: true}, nil
}

// Revert the feature gate status configmap to default
err = r.FeatureGates.RevertFeatureGateStatusConfigMapToDefault(context.TODO())
Contributor

I believe it would be cleaner, and simpler too, to just remove the CM here. As I said elsewhere, I'm far from convinced that we should insist on this configmap's existence in the first place, but it definitely serves no purpose when there's no KataConfig instance at all, right?

Member

Yes, I'm also not sure why we need to update the user's CM. This might look strange from the user's PoV, having some settings changed. We should not mix default values and user values. Shouldn't whoever creates it be the one responsible for deleting it?

Not a k8s expert, and I was not following all the discussions here closely, so I'm probably missing something.

Contributor

That is a very good point. We should avoid muddying the notion of resource ownership, to prevent problems like we had a while ago with KataConfig.spec.kataMonitorImage, which was basically an OSC-owned field within a user-owned KataConfig instance.

However, my takeaway from the latest round of discussions is that we're now leaning towards the controller not requiring the existence of any of the feature-gating configmaps. This solves the ownership issue quite neatly by putting the configmap resources wholly under user control.

if r.deploymentState.PreviousMethod != "image" {
// Rollback the previous deployment
if r.deploymentState.PreviousMcName != "" {
// Delete the previous MachineConfig
Member

It seems that we should just check for the existence of the extension_mc_name MC and delete it if it exists.

if r.deploymentState.PreviousMethod != "extension" {
// Rollback the previous deployment
if r.deploymentState.PreviousMcName != "" {
// Delete the previous MachineConfig
Member

Same with image_mc_name MC.

return ctrl.Result{Requeue: true}, err
}
}
wasMcJustCreated, err = r.createMc(machinePool)
Contributor

I'm not quite sure admittedly but is the case handled where the user stays with the image-based deployment mechanism but has changed the image URL? Because I suspect it's not.

if err != nil {
r.Log.Info("Error creating image MachineConfig for Image based deployment", "err", err)
return ctrl.Result{Requeue: true}, err
}
Contributor

Also, is it guaranteed that if we delete the previous MC and create the new one right away, the MCO will handle this correctly and sequence the operations properly? Is the result well-defined? If not, or if we can't tell, we'd better return and reschedule after the deletion, and only create the new MC once we can verify that the operation triggered by deleting the previous one has finished.

@bpradipt
Copy link
Contributor Author

Based on internal discussions I'll close this PR and split this into separate PRs.
