
KEP-3063: DRA: 1.30 update #4181

Merged
merged 13 commits into kubernetes:master Feb 9, 2024

Conversation

@pohly (Contributor) commented Sep 5, 2023

  • One-line PR description: This updates the README to reflect what has been done and fills in sections that were left out earlier. The next milestone is 1.30 where DRA will remain in alpha.

  • Issue link: dynamic resource allocation #3063

@k8s-ci-robot added the cncf-cla: yes, kind/kep, and sig/node labels Sep 5, 2023
@k8s-ci-robot added the size/L label Sep 5, 2023
@pohly (Contributor, Author) commented Sep 5, 2023

Some work remains for the 1.29 development cycle:

The KEP needs to be merged first. Then that work needs to be done or finished, and before the actual beta graduation we need another review round to determine whether the beta graduation criteria have been met.

@pohly mentioned this pull request Sep 5, 2023
@alculquicondor (Member) commented:

/sig scheduling
/sig autoscaling

@k8s-ci-robot added the sig/scheduling and sig/autoscaling labels Sep 5, 2023
@alculquicondor (Member) commented:

/cc @alculquicondor @mwielgus

@bart0sh (Contributor) commented Sep 6, 2023

/assign @bart0sh @klueska
/lgtm

@k8s-ci-robot added the lgtm label Sep 6, 2023
@bart0sh moved this from Triage to Needs Approver in SIG Node PR Triage Sep 6, 2023
@bart0sh (Contributor) commented Sep 6, 2023

@alculquicondor (Member) commented:

/hold
for all involved SIGs to review and approve.

@k8s-ci-robot added the do-not-merge/hold label Sep 6, 2023
@k8s-ci-robot removed the lgtm label Sep 7, 2023
As discussed on Slack, scale down must determine whether some currently running pods could get moved. This simulation depends on simulating deallocation; otherwise the allocated claim prevents moving pods.
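To make this concrete, here is a minimal sketch of the idea (hypothetical code, not the actual Cluster Autoscaler implementation; `ClusterSnapshot`, `UpdateClaim`, and `CanScheduleElsewhere` are assumed names):

```go
import (
	v1 "k8s.io/api/core/v1"
	resourcev1alpha2 "k8s.io/api/resource/v1alpha2"
)

// canMovePod simulates moving a running pod during scale down. The pod's
// claims are "deallocated" in the in-memory snapshot first; otherwise their
// existing allocation pins the pod to its current node and the simulation
// wrongly concludes that the pod cannot move.
func canMovePod(pod *v1.Pod, claims []*resourcev1alpha2.ResourceClaim, snapshot *ClusterSnapshot) bool {
	for _, claim := range claims {
		c := claim.DeepCopy()
		c.Status.Allocation = nil  // simulate deallocation
		c.Status.ReservedFor = nil // drop the pod's reservation
		snapshot.UpdateClaim(c)    // only the in-memory copy changes
	}
	// Re-run the scheduling simulation against the modified snapshot.
	return snapshot.CanScheduleElsewhere(pod)
}
```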
The discussion around autoscaling needs more time.

Technically the support for cluster autoscaling can be defined and implemented as an extension of the core DRA, without changing the core feature. By separating out the specification of "numeric parameters" into a separate KEP it might be easier to make progress on the different aspects because they are better separated.

kubernetes/kubernetes#121876 changed where the cluster gets updated with blocking API calls.
@alculquicondor (Member) left a comment:

I'll give another pass after @johnbelamaric gives LGTM

schedulable, like creating a claim that the pod needs or finishing the
allocation of a claim.

[Queuing hints](https://github.com/kubernetes/enhancements/issues/4247) are
Member:
FYI, queueing hints are disabled by default and are unlikely to reach a stable state in 1.30 that would allow re-enabling them. Mentioning this in case it is crucial for the usability of DRA.

Member:
+1. For this KEP update, we will have to state the status of DRA without the use of queueing hints.

@pohly (Contributor, Author):
It's not crucial. Performance is different, but DRA is also usable without that optimization. I know that you disagree, but all the use cases in https://docs.google.com/document/d/1XNkTobkyz-MyXhidhTp5RfbMsM-uRCWDoflUMqNcYTk/edit#heading=h.ljj9kaa144nr confirmed that scheduling performance is not a priority because of long-running pods.

I'll update the text to explain this.
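For background on what such a hint looks like: plugins return a hint function together with the cluster events they care about, so the scheduler only re-queues a pod when a relevant object actually changed. A minimal sketch, assuming the scheduler framework API of that era (exact signatures changed between releases; `podUsesClaim` is an assumed helper):

```go
import (
	v1 "k8s.io/api/core/v1"
	resourcev1alpha2 "k8s.io/api/resource/v1alpha2"
	"k8s.io/klog/v2"
	framework "k8s.io/kubernetes/pkg/scheduler/framework"
)

// EventsToRegister wires a queueing hint to ResourceClaim events.
func (pl *dynamicResources) EventsToRegister() []framework.ClusterEventWithHint {
	return []framework.ClusterEventWithHint{{
		Event:          framework.ClusterEvent{Resource: framework.ResourceClaim, ActionType: framework.Add | framework.Update},
		QueueingHintFn: pl.isSchedulableAfterClaimChange,
	}}
}

// isSchedulableAfterClaimChange re-queues a pod only when the changed claim
// is one that the pod actually references; unrelated claim events are skipped.
func (pl *dynamicResources) isSchedulableAfterClaimChange(logger klog.Logger, pod *v1.Pod, oldObj, newObj interface{}) (framework.QueueingHint, error) {
	claim, ok := newObj.(*resourcev1alpha2.ResourceClaim)
	if !ok || !podUsesClaim(pod, claim) { // podUsesClaim: assumed helper
		return framework.QueueSkip, nil
	}
	return framework.Queue, nil
}
```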

@pohly (Contributor, Author):
I added something.

```
This is not possible with opaque parameters as described in this KEP. If a DRA
driver developer wants to support Cluster Autoscaler, they have to use numeric
parameters. Numeric parameters are an extension of this KEP that is defined in
```
Member:
I don't think we should phrase this case as only being an enabler for cluster autoscaler. I see that KEP as an opportunity to simplify this KEP before it goes to beta.

@pohly (Contributor, Author):
I can add that.

@pohly (Contributor, Author):
It didn't fit into this "Autoscaler" section, so I added a new section for this thought under "Alternatives".

@Huang-Wei (Member) left a comment:

The scheduling part (except for CA support) looks good. A few nits.


go through the backoff queue and the usually 5 second long delay associated
with that.

#### PreEnqueue
Member:
FWIW, PreEnqueue is likely to reach GA in 1.30.
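For readers unfamiliar with the hook: PreEnqueue lets a plugin keep a pod out of the active queue until scheduling can plausibly succeed. A minimal sketch of what a DRA check could look like (hypothetical code, not the actual plugin; `claimName` is an assumed helper that resolves generated claim names):

```go
import (
	"context"

	v1 "k8s.io/api/core/v1"
	resourcelisters "k8s.io/client-go/listers/resource/v1alpha2"
	framework "k8s.io/kubernetes/pkg/scheduler/framework"
)

type dynamicResources struct {
	claimLister resourcelisters.ResourceClaimLister
}

// PreEnqueue keeps a pod out of the active scheduling queue until every
// ResourceClaim it references exists; a later claim event re-queues the pod.
func (pl *dynamicResources) PreEnqueue(ctx context.Context, pod *v1.Pod) *framework.Status {
	for _, podClaim := range pod.Spec.ResourceClaims {
		name := claimName(pod, podClaim) // assumed helper
		if _, err := pl.claimLister.ResourceClaims(pod.Namespace).Get(name); err != nil {
			return framework.NewStatus(framework.UnschedulableAndUnresolvable, err.Error())
		}
	}
	return nil
}
```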

@@ -1889,6 +1905,12 @@ At the moment, the claim plugin has no information that might enable it to
prioritize which resource to deallocate first. Future extensions of this KEP
might attempt to improve this.

This is currently using blocking API calls. They are unlikely because this
Member:

Suggested change:
- This is currently using blocking API calls. They are unlikely because this
+ This is currently using blocking API calls. It's quite rare because this

"Numeric parameters" are now called "semantic parameters" because they are not
just about numbers.
@alculquicondor (Member) left a comment:
/approve
from sig scheduling
/hold for @johnbelamaric's LGTM


When pods fail to get scheduled, kube-scheduler reports that through events
and pod status. For DRA, that includes "waiting for resource driver to
provide information" (node not selected yet) and "waiting for resource
Member:
Do these events/status updates include information about the specific claim/driver that is blocking progress?

@pohly (Contributor, Author):
When waiting for a claim, the claim is mentioned. This allows the user to drill down and check the claim.

When waiting for PodSchedulingContext information, that object is not mentioned explicitly. It doesn't need to be named because the name is the same as for the pod.

In both cases it is assumed that users understand the concepts enough to know what "claim" and "pod scheduling context" are.
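Because the PodSchedulingContext shares namespace and name with its pod, a user or tool can go straight from the pod to that object. A minimal sketch, assuming a typed clientset for the resource.k8s.io/v1alpha2 API:

```go
import (
	"context"
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// printSchedulingStatus looks up the PodSchedulingContext for a pod. No
// explicit reference is needed: it has the same namespace and name as the pod.
func printSchedulingStatus(ctx context.Context, clientset kubernetes.Interface, pod *v1.Pod) {
	schedCtx, err := clientset.ResourceV1alpha2().PodSchedulingContexts(pod.Namespace).Get(ctx, pod.Name, metav1.GetOptions{})
	if err != nil {
		fmt.Printf("no PodSchedulingContext yet: %v\n", err)
		return
	}
	fmt.Printf("selected node: %q, potential nodes: %v\n", schedCtx.Spec.SelectedNode, schedCtx.Spec.PotentialNodes)
}
```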

@@ -2769,6 +2828,18 @@ Why should this KEP _not_ be implemented?

## Alternatives

### Semantic Parameters instead of PodSchedulingContext

When a DRA driver uses semantic parameters, there is no DRA driver controller
Member:
I believe in the latest cut this has shifted a little? In the semantic parameter version, there is still a driver controller but its responsibility is to evaluate the claim parameters and produce the driver-neutral semantic request. Since we are skipping CEL for this (for now at least...). But the following text on "we might be able to remove PodSchedulingContext" looks right.

@pohly (Contributor, Author):
I changed the text to "there is no need for a DRA driver controller which allocates the claim and no need for communication between scheduler and such a controller".

That leaves it open whether a controller is still needed for other purposes (will be defined in the "semantic parameters KEP" and may depend on whether users are allowed to create in-tree parameter objects directly) and just focuses on the aspect that is relevant here.

@@ -2894,6 +2965,52 @@ type ResourceDriverFeature struct {
}

### Complex sharing of ResourceClaim

At the moment, the allocation result marks a claim as either "shareable" by
Member:
There's something not quite right in that model. I wonder if the underlying semantic models can express this?

Also, there are different kinds of "shareable" - for example, I know of use cases where you want to mount the GPU devices in a monitoring container - or perhaps to access it via some config tools.

No change needed, just raising some things to think about post 1.30.
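Purely to make that musing concrete, an illustrative sketch (not a proposed API) of how sharing could be expressed per kind of access instead of a single flag:

```go
// Illustrative only: a richer sharing model than a single "shareable" bool.
type AllocationSharing struct {
	Modes []SharingMode
}

type SharingMode struct {
	// Kind of access, e.g. "compute" (exclusive use) or "monitoring"
	// (a sidecar that only observes the device).
	Kind string
	// Maximum concurrent consumers for this kind; 0 means unlimited.
	MaxConsumers int
}
```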

@johnbelamaric (Member) commented:

A couple of minor comments, but overall LGTM. What we are saying here is:

  1. Main DRA stays mostly the same in 1.30 and stays in alpha.
  2. Semantic models are implemented via "DRA: structured parameters" (#4381) in 1.30 as alpha
  3. We learn from that whether we can simplify this KEP, removing PodSchedulingContext (perhaps with a new, different "escape hatch" that is part of semantic models).

@dchen1107 (Member) commented:

/lgtm from SIG Node perspective
/approve based on @johnbelamaric @alculquicondor @Huang-Wei's comments.

@k8s-ci-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, dchen1107, pohly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label Feb 8, 2024
@johnbelamaric (Member) commented:

/lgtm

I am root in this repo, so I can't approve without it going through (and I am not really a node approver).

@k8s-ci-robot added the lgtm label Feb 8, 2024
@mrunalp (Contributor) commented Feb 8, 2024

/hold cancel

@k8s-ci-robot removed the do-not-merge/hold label Feb 8, 2024
@k8s-ci-robot merged commit 646c6c8 into kubernetes:master Feb 9, 2024
4 checks passed
SIG Node PR Triage automation moved this from Needs Reviewer to Done Feb 9, 2024
@k8s-ci-robot added this to the v1.30 milestone Feb 9, 2024
Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
kind/kep: Categorizes KEP tracking issues and PRs modifying the KEP directory.
lgtm: "Looks good to me", indicates that a PR is ready to be merged.
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
sig/autoscaling: Categorizes an issue or PR as relevant to SIG Autoscaling.
sig/node: Categorizes an issue or PR as relevant to SIG Node.
sig/scheduling: Categorizes an issue or PR as relevant to SIG Scheduling.
size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.