RFC: KEP-4381: DRA: network-attached resources #4612
Adding support for network-attached resources by extending the ResourceSlice with a node selector is fairly easy. The (one!) scheduler in the cluster can use that field during Filter instead of the node name. Supporting multiple schedulers is harder and needs further work.
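To make the Filter idea concrete, here is a minimal sketch of the per-node check. It is an illustration only, not the scheduler plugin code: the `ResourceSlice` and `Node` structs and the `sliceUsableOnNode` helper are hypothetical stand-ins, and the node selector is simplified to plain label equality instead of the richer selector type used by the real API.

```go
package main

import "fmt"

// ResourceSlice is a simplified, hypothetical stand-in for the real API type.
// In this sketch, exactly one of NodeName or NodeSelector is expected to be set.
type ResourceSlice struct {
	Name         string
	DriverName   string
	NodeName     string            // set for node-local resources
	NodeSelector map[string]string // set for network-attached resources (label equality only, an assumption)
}

// Node carries only the fields this check needs.
type Node struct {
	Name   string
	Labels map[string]string
}

// sliceUsableOnNode reports whether the resources published in slice
// can be used by a pod scheduled onto node.
func sliceUsableOnNode(slice ResourceSlice, node Node) bool {
	// Node-local resources: match on the node name, as before.
	if slice.NodeName != "" {
		return slice.NodeName == node.Name
	}
	// Neither field set: treated as unusable in this sketch.
	if slice.NodeSelector == nil {
		return false
	}
	// Network-attached resources: every selector label must match the node.
	for key, want := range slice.NodeSelector {
		if got, ok := node.Labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	local := ResourceSlice{Name: "worker-1-gpu-abc", DriverName: "gpu.example.com", NodeName: "worker-1"}
	fabric := ResourceSlice{
		Name:         "gpu-xyz",
		DriverName:   "gpu.example.com",
		NodeSelector: map[string]string{"example.com/pci-switch": "rack-7"},
	}
	nodes := []Node{
		{Name: "worker-1", Labels: map[string]string{"example.com/pci-switch": "rack-7"}},
		{Name: "worker-2", Labels: map[string]string{"example.com/pci-switch": "rack-9"}},
	}
	for _, node := range nodes {
		fmt.Printf("%s: local=%v fabric=%v\n", node.Name,
			sliceUsableOnNode(local, node), sliceUsableOnNode(fabric, node))
	}
}
```

The check says nothing about coordination: two schedulers running it independently could both pick the same network-attached device, which is why the single-scheduler restriction applies.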
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: pohly.
@@ -546,6 +548,12 @@ the DRA drivers providing content for those objects. It might be possible to
support version skew (= keeping kubelet at an older version than the control
plane and the DRA drivers) in the future, but currently this is out of scope.

For network-attached resources, the DRA driver is responsible for discovering
Currently, a ResourceSlice is generated with the name <node_name>-<driver_name>-<random_string>. If the NodeSelector of the ResourceSlice is set instead, would the name be <driver_name>-<random_string>?
https://github.com/kubernetes/kubernetes/blob/v1.30.0/pkg/kubelet/cm/dra/plugin/noderesources.go#L470
How to name those ResourceSlices would be entirely up to the driver. What matters isn't the name, only the content.
I am moving the ResourceSlice controller out of kubelet into the k8s.io/dynamic-resource-allocation package as part of kubernetes/kubernetes#124274, so drivers could reuse that (eventually - right now in that PR it doesn't support network-attached resources yet).
When "network-attached resources" is mentioned, what exactly is the scope? Could it be a simple veth case, or is this leaning more towards virtual functions?
This PR is not about network hardware in a node. That kind of resource is local to a node and already covered. What this PR adds is support for things like an IP camera (accessible through the IP network) or special devices that can be accessed through some kind of fabric (GPU via PCI switch). Those resources are not local to a node and therefore need to be handled differently.
This corresponds to kubernetes/enhancements#4612. The ResourcePool change is small. The big caveat as mentioned in the KEP update is that multiple schedulers will not coordinate allocation of these shared devices, so a cluster with such devices will be limited to running a single scheduler.
One-line PR description: network-attached resources
Issue link: DRA: structured parameters #4381