Compose a Workload object using PodTemplate objects #1004

Open
3 tasks done
alculquicondor opened this issue Jul 20, 2023 · 6 comments · May be fixed by #1169
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.


alculquicondor commented Jul 20, 2023

What would you like to be added:

A field in the PodSet struct that references a Pod or PodTemplate object

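type PodSet struct {
  Name  string
  Count int32
  // Template embeds the pod template inline; kept for backwards compatibility.
  Template *corev1.PodTemplateSpec
  // PodRef references an existing Pod object.
  PodRef *TemplateReference
  // PodTemplateRef references an existing PodTemplate object.
  PodTemplateRef *TemplateReference
}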

Only one of (Template, PodRef or PodTemplateRef) can be set.

Caveats:

  • A Workload wouldn't be ready for admission until Kueue has seen all the referenced PodTemplate objects.
  • The PodTemplate objects should carry a label that makes it possible to:
    • reference the Job/MPIJob/etc. object
    • filter them in a webhook so that we can implement immutability checks.
  • The Job/MPIJob/etc. should be the owner of the PodTemplate object so that it is deleted in cascade when the job is deleted.
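
The first caveat, that a Workload is not ready for admission until every referenced PodTemplate has been seen, might look like the sketch below. It assumes templates are tracked by name in a simple set; `missingTemplates`, `readyForAdmission`, and the `Name` field on `TemplateReference` are hypothetical names, not Kueue APIs.

```go
package main

import (
	"fmt"
	"sort"
)

// TemplateReference is hypothetical: the issue does not define its fields.
type TemplateReference struct {
	Name string
}

// PodSet is trimmed to the fields this sketch needs.
type PodSet struct {
	Name           string
	PodTemplateRef *TemplateReference
}

// missingTemplates returns the sorted, de-duplicated names of referenced
// PodTemplates that are not yet in the seen set.
func missingTemplates(podSets []PodSet, seen map[string]bool) []string {
	pending := map[string]bool{}
	for _, ps := range podSets {
		if ps.PodTemplateRef == nil {
			continue // nothing to wait for on this pod set
		}
		if !seen[ps.PodTemplateRef.Name] {
			pending[ps.PodTemplateRef.Name] = true
		}
	}
	missing := make([]string, 0, len(pending))
	for name := range pending {
		missing = append(missing, name)
	}
	sort.Strings(missing)
	return missing
}

// readyForAdmission reports whether every referenced template has been seen.
func readyForAdmission(podSets []PodSet, seen map[string]bool) bool {
	return len(missingTemplates(podSets, seen)) == 0
}

func main() {
	podSets := []PodSet{
		{Name: "launcher", PodTemplateRef: &TemplateReference{Name: "launcher-tpl"}},
		{Name: "workers", PodTemplateRef: &TemplateReference{Name: "worker-tpl"}},
	}
	seen := map[string]bool{"launcher-tpl": true}
	fmt.Println(readyForAdmission(podSets, seen), missingTemplates(podSets, seen))

	seen["worker-tpl"] = true
	fmt.Println(readyForAdmission(podSets, seen), missingTemplates(podSets, seen))
}
```

A real controller would presumably populate the seen set from a watch on PodTemplate objects, filtered by the label mentioned in the caveats.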

Why is this needed:

  1. We can rely on the apiserver validation for the PodTemplate object.
  2. Avoid duplicating the template in a ProvisioningRequest (see "Introduce AEP with Provisioning Request CRD", kubernetes/autoscaler#5848).
  3. Allow the Workload object to host more specs without worrying about etcd size limits
  4. For "Managing raw pods" (#976), duplicating the entire pod spec just to represent one Pod feels like overkill.

If JobSet were to implement a similar approach, we would also reuse the same PodTemplate objects created by users.

Completion requirements:

This enhancement requires the following artifacts:

  • Design doc
  • API change
  • Docs update

The artifacts should be linked in subsequent comments.

alculquicondor added the kind/feature label Jul 20, 2023
@alculquicondor

cc @ahg-g


stuton commented Jul 21, 2023

/assign

@alculquicondor

I discussed this offline with @mwielgus, and he suggested dropping the PodRef field: it adds complexity around watchers while providing only marginal benefit.

stuton linked a pull request that will close this issue Sep 29, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label Jan 26, 2024
@tenzen-y

/remove-lifecycle stale

k8s-ci-robot removed the lifecycle/stale label Jan 26, 2024

k8s-ci-robot added the lifecycle/stale label Apr 25, 2024