
Add quota support for PVC with VolumeAttributesClass #121895

Draft: wants to merge 3 commits into base: master from kep-3751-quota
Conversation

@carlory (Member) commented Nov 15, 2023

What type of PR is this?

/kind feature

What this PR does / why we need it:

A cluster admin wants to control costs while giving application developers the freedom to optimize their applications. The admin sets a per-VolumeAttributesClass limit on the maximum number of PVCs in a ResourceQuota. When an application developer modifies a PVC to request a higher tier of VolumeAttributesClass but no quota remains for that class, the request is rejected. Because ResourceQuota objects are managed by the cluster admin, application developers cannot change the limits themselves.

An example of defining a ResourceQuota for a VolumeAttributesClass:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: vacquota
spec:
  hard:
    # Across all persistent volume claims associated with the
    # <volume-attributes-class-name>, the total number of persistent volume
    # claims that can exist in the namespace.
    <volume-attributes-class-name>.volumeattributesclass.storage.k8s.io/persistentvolumeclaims: "5"
  ...
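
For illustration, a minimal PVC that would count against such a quota might look like the sketch below, assuming a VolumeAttributesClass named gold exists (the gold class name and the claim name are placeholders, not part of this PR):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
  # Referencing the class makes this PVC count toward
  # gold.volumeattributesclass.storage.k8s.io/persistentvolumeclaims.
  volumeAttributesClassName: gold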

Which issue(s) this PR fixes:

Fixes #119299

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Add quota support for PVC with VolumeAttributesClass. For example, if you want to quota PVCs by volume attributes class, you would add a quota entry whose resource name follows <volume-attributes-class>.volumeattributesclass.storage.k8s.io/persistentvolumeclaims.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/3751-volume-attributes-class

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 15, 2023
@carlory (Member Author) commented Nov 15, 2023

/cc @sunnylovestiramisu

@carlory (Member Author) commented Nov 15, 2023

/sig storage

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 15, 2023
@carlory (Member Author) commented Nov 15, 2023

/test pull-kubernetes-e2e-gce

@jiahuif (Member) commented Nov 28, 2023

/remove-sig api-machinery

@k8s-ci-robot k8s-ci-robot removed the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Nov 28, 2023
@carlory (Member Author) commented Dec 14, 2023

/cc @sunnylovestiramisu @msau42

@k8s-ci-robot k8s-ci-robot added area/test sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Dec 21, 2023
@carlory (Member Author) commented Dec 21, 2023

/test pull-kubernetes-e2e-kind
/test pull-kubernetes-e2e-kind-ipv6

@carlory carlory force-pushed the kep-3751-quota branch 2 times, most recently from af8174e to 2253c37 on December 21, 2023 09:21
@carlory (Member Author) commented Dec 21, 2023

Here's an example of how to run the e2e tests on macOS.

Terminal 1: set up a local cluster with VolumeAttributesClass enabled.

➜  kubernetes git:(kep-3751-quota) sudo FEATURE_GATES=VolumeAttributesClass=true ./hack/local-up-cluster.sh
Password:
...
Local Kubernetes cluster is running. Press Ctrl-C to shut it down.

Configurations:
  /private/var/folders/zz/zyxvpxvq6csfxvn_n0000000000000/T/local-up-cluster.sh.XXXXXX.Emp1teaU/kube-audit-policy-file
  /private/var/folders/zz/zyxvpxvq6csfxvn_n0000000000000/T/local-up-cluster.sh.XXXXXX.Emp1teaU/kube-scheduler.yaml
  /private/var/folders/zz/zyxvpxvq6csfxvn_n0000000000000/T/local-up-cluster.sh.XXXXXX.Emp1teaU/kube-serviceaccount.key
  /private/var/folders/zz/zyxvpxvq6csfxvn_n0000000000000/T/local-up-cluster.sh.XXXXXX.Emp1teaU/kube_egress_selector_configuration.yaml

Logs:
  /tmp/etcd.log
  /tmp/kube-apiserver.log
  /tmp/kube-controller-manager.log


  /tmp/kube-scheduler.log


To start using your cluster, you can open up another terminal/tab and run:

  export KUBECONFIG=/var/run/kubernetes/admin.kubeconfig
  cluster/kubectl.sh
...

Terminal 2: create a fake node (required for the e2e test setup) and run kwok to maintain it.

➜  kubernetes git:(kep-3751-quota) kubectl apply --kubeconfig /var/run/kubernetes/admin.kubeconfig -f - <<EOF
apiVersion: v1
kind: Node
metadata:
  annotations:
    node.alpha.kubernetes.io/ttl: "0"
    kwok.x-k8s.io/node: fake
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: kwok-node-0
    kubernetes.io/os: linux
    kubernetes.io/role: agent
    node-role.kubernetes.io/agent: ""
    type: kwok
  name: kwok-node-0
spec:
status:
  allocatable:
    cpu: 32
    memory: 256Gi
    pods: 110
  capacity:
    cpu: 32
    memory: 256Gi
    pods: 110
  nodeInfo:
    architecture: amd64
    bootID: ""
    containerRuntimeVersion: ""
    kernelVersion: ""
    kubeProxyVersion: fake
    kubeletVersion: fake
    machineID: ""
    operatingSystem: linux
    osImage: ""
    systemUUID: ""
  phase: Running
EOF

➜  kubernetes git:(kep-3751-quota) kwok \
  --kubeconfig=/var/run/kubernetes/admin.kubeconfig \
  --manage-all-nodes=false \
  --manage-nodes-with-annotation-selector=kwok.x-k8s.io/node=fake \
  --manage-nodes-with-label-selector= \
  --manage-single-node= \
  --disregard-status-with-annotation-selector=kwok.x-k8s.io/status=custom \
  --disregard-status-with-label-selector= \
  --cidr=10.0.0.1/24 \
  --node-ip=10.0.0.1 \
  --node-lease-duration-seconds=40

Terminal 3: Run e2e tests.

➜  ginkgo --focus "should create a ResourceQuota and capture the life of a persistent volume claim with a volume attributes class" -v test/e2e -- --kubeconfig=/var/run/kubernetes/admin.kubeconfig
  Dec 21 17:40:30.327: INFO: The --provider flag is not set. Continuing as if --provider=skeleton had been used.
  I1221 17:40:30.327668   58333 e2e.go:117] Starting e2e run "ffaf0a46-ce6a-4bf0-b2b7-e8a9c0962114" on Ginkgo node 1
Running Suite: Kubernetes e2e suite - /Users/kiki/workspace/golang/src/k8s.io/kubernetes/test/e2e
=================================================================================================
Random Seed: 1703151598 - will randomize all specs

Will run 1 of 7408 specs
------------------------------
[ReportBeforeSuite]
/Users/kiki/workspace/golang/src/k8s.io/kubernetes/test/e2e/e2e_test.go:157
[ReportBeforeSuite] PASSED [0.000 seconds]
------------------------------
[SynchronizedBeforeSuite]
/Users/kiki/workspace/golang/src/k8s.io/kubernetes/test/e2e/e2e.go:77
  Dec 21 17:40:30.417: INFO: >>> kubeConfig: /var/run/kubernetes/admin.kubeconfig
  Dec 21 17:40:30.419: INFO: Waiting up to 30m0s for all (but 0) nodes to be schedulable
  Dec 21 17:40:30.436: INFO: Waiting up to 5m0s for all daemonsets in namespace 'kube-system' to start
  Dec 21 17:40:30.436: INFO: e2e test version: v0.0.0-master+$Format:%H$
  Dec 21 17:40:30.437: INFO: kube-apiserver version: v1.29.0-alpha.3.117+af8174e3caabf3
  Dec 21 17:40:30.437: INFO: >>> kubeConfig: /var/run/kubernetes/admin.kubeconfig
  Dec 21 17:40:30.438: INFO: Cluster IP family: ipv4
[SynchronizedBeforeSuite] PASSED [0.021 seconds]
...
[sig-api-machinery] ResourceQuota [Feature:VolumeAttributesClass] should create a ResourceQuota and capture the life of a persistent volume claim with a volume attributes class [sig-api-machinery, Feature:VolumeAttributesClass]
/Users/kiki/workspace/golang/src/k8s.io/kubernetes/test/e2e/apimachinery/resource_quota.go:1220
  STEP: Creating a kubernetes client @ 12/21/23 17:40:30.532
  Dec 21 17:40:30.532: INFO: >>> kubeConfig: /var/run/kubernetes/admin.kubeconfig
  STEP: Building a namespace api object, basename volume-attributes-class @ 12/21/23 17:40:30.532
  STEP: Waiting for a default service account to be provisioned in namespace @ 12/21/23 17:40:30.56
  STEP: Waiting for kube-root-ca.crt to be provisioned in namespace @ 12/21/23 17:40:30.563
  STEP: Counting existing ResourceQuota @ 12/21/23 17:40:30.565
  STEP: Creating a ResourceQuota @ 12/21/23 17:40:35.573
  STEP: Ensuring resource quota status is calculated @ 12/21/23 17:40:35.584
  STEP: Creating a PersistentVolumeClaim with volume attributes class @ 12/21/23 17:40:37.591
  STEP: Ensuring resource quota status captures persistent volume claim creation @ 12/21/23 17:40:37.619
  STEP: Not allowing a PersistentVolumeClaim to be created that exceeds remaining quota @ 12/21/23 17:40:39.625
  STEP: Deleting a PersistentVolumeClaim @ 12/21/23 17:40:39.63
  STEP: Ensuring resource quota status released usage @ 12/21/23 17:40:39.638
  Dec 21 17:40:41.644: INFO: Waiting up to 7m0s for all (but 0) nodes to be ready
  STEP: Destroying namespace "volume-attributes-class-8151" for this suite. @ 12/21/23 17:40:41.648
• [11.125 seconds]
...
[SynchronizedAfterSuite]
/Users/kiki/workspace/golang/src/k8s.io/kubernetes/test/e2e/e2e.go:88
  Dec 21 17:40:41.673: INFO: Running AfterSuite actions on node 1
[SynchronizedAfterSuite] PASSED [0.000 seconds]
------------------------------
[ReportAfterSuite] Kubernetes e2e suite report
/Users/kiki/workspace/golang/src/k8s.io/kubernetes/test/e2e/e2e_test.go:161
[ReportAfterSuite] PASSED [0.000 seconds]
------------------------------

Ran 1 of 7408 Specs in 11.256 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 7407 Skipped
You're using deprecated Ginkgo functionality:
=============================================
  --ginkgo.slow-spec-threshold is deprecated --slow-spec-threshold has been deprecated and will be removed in a future version of Ginkgo.  This feature has proved to be more noisy than useful.  You can use --poll-progress-after, instead, to get more actionable feedback about potentially slow specs and understand where they might be getting stuck.

To silence deprecations that can be silenced set the following environment variable:
  ACK_GINKGO_DEPRECATIONS=2.13.0

PASS

Ginkgo ran 1 suite in 43.437464292s
Test Suite Passed

@carlory (Member Author) commented Dec 25, 2023

/test pull-kubernetes-e2e-kind-alpha-features
/test pull-kubernetes-e2e-kind-alpha-beta-features

@k8s-ci-robot (Contributor) commented:

LGTM label has been added.

Git tree hash: c57da35c693a4bbe117ddad321e1bf874939c4a4

@carlory (Member Author) commented Feb 28, 2024

/assign @deads2k @msau42

@carlory (Member Author) commented Mar 4, 2024

/cc @pohly

for e2e test review.

@pohly (Contributor) left a comment:

Looks good to me from a SIG Testing perspective.

@carlory (Member Author) commented Mar 5, 2024

/cc @xing-yang

@@ -195,6 +213,7 @@ func TestPersistentVolumeClaimEvaluatorMatchingResources(t *testing.T) {
"persistentvolumeclaims",
"gold.storageclass.storage.k8s.io/requests.storage",
"gold.storageclass.storage.k8s.io/persistentvolumeclaims",
"gold.volumeattributesclass.storage.k8s.io/persistentvolumeclaims",
A reviewer (Contributor) commented:

Can we also add "gold.volumeattributesclass.storage.k8s.io/requests.storage"?

carlory (Member Author) replied:

@xing-yang It is not supported; see the discussion in #118863 (comment).

@xing-yang (Contributor) commented:

Looks good to me.

We need someone to review the quota changes: https://github.com/kubernetes/kubernetes/blob/master/pkg/quota/v1/OWNERS

@carlory (Member Author) commented Mar 6, 2024

/cc @deads2k @derekwaynecarr

@derekwaynecarr (Member) commented:

The PR looks fine to me. The use case for quota on this concept is similar to quota on storage class. It appears @deads2k wants to discuss it more generally in the api-machinery meeting. I will defer to him for final approval based on the outcome of that discussion.

@deads2k (Contributor) commented Mar 22, 2024

Adding my comment from Slack here.

I'd like to know things like

  1. If we go generic, what would a resource name look like?
  2. Why choose a resource name over a scope + scope selector?
  3. If we created a generic mechanism, would it use resource-name parsing or a different mechanism?

At first blush, a scoped count sounds pretty reasonable. But doing so would leverage a different name pattern. The agenda for apimachinery is pretty flexible if you'd like to come and discuss: https://docs.google.com/document/d/1x9RNaaysyO0gXHIr1y50QFbiL1x8OWnk2v3XnrdkT5Y/edit

@carlory (Member Author) commented Mar 25, 2024

  1. KEP: https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/3751-volume-attributes-class#administrator-quota-restrictions

    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: gold-pvcs
    spec:
      hard:
        gold.volumeattributesclass.storage.k8s.io/persistentvolumeclaims: "10"

    why choose a resource name over a scope + scope selector

    The current implementation follows the existing style of the storage class quota, i.e. a resource name

    // that follows <storage-class>.storageclass.storage.k8s.io/<resource>.

    I don't know whether the scope + scope selector mechanism is generic for all resources. When I read the documentation, the scope is only used for pods: https://kubernetes.io/docs/concepts/policy/resource-quotas/#quota-scopes. I cannot find other resources that use the scope mechanism.

  2. Suggestion from SIG API Machinery (@deads2k), according to the comments (if I understand correctly):

    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: gold-pvcs
    spec:
      hard:
        count/persistentvolumeclaims: "10"
      scopeSelector:
        matchExpressions:
        - operator: In
          scopeName: VolumeAttributesClass # Match persistentvolumeclaims that reference the specified volume attributes class.
          values: ["gold"]

    if we created a generic mechanism, would it use resource-name parsing or a different mechanism

    I didn't know about the generic mechanism when I implemented this feature. If we change the implementation, does gold.storageclass.storage.k8s.io/persistentvolumeclaims also need to be changed?

    If so, we need to define some new ResourceQuotaScope types to match persistentvolumeclaims.

    IMO, if scope + scope selector is a generic mechanism for all resources, we can use it. Otherwise, we can keep the current implementation.

    cc @sunnylovestiramisu @msau42 @xing-yang What do you think?

@carlory (Member Author) commented Mar 25, 2024

/hold

  • waiting for feedback
  • Is it a bug? Users could create unlimited volumes with special attributes by changing a PVC's VAC to a non-existent VAC; those modified PVCs keep the same attributes as the original PVCs. So we should consider the status field of the PVC when computing the quota. cc @sunnylovestiramisu

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 25, 2024
@sunnylovestiramisu (Contributor) commented:

Discussed offline: the quota system should calculate both the current and the target resources, which is consistent with other existing quota settings.

For VAC, we have 3 fields: spec.vacName, status.currentVACName, and status.modifyVolumeStatus.targetVACName. So if any one of them is set to a VAC, the PVC needs to be counted against the quota.
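
As an illustrative sketch (placeholder class names; field names as in the alpha VolumeAttributesClass API), a PVC in the middle of a modification can carry all three fields, and it would be counted against a class's quota if any of them names that class:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  # The class the user is requesting now.
  volumeAttributesClassName: gold
  ...
status:
  # The class currently applied to the bound volume.
  currentVolumeAttributesClassName: silver
  modifyVolumeStatus:
    # The class the CSI driver is still working toward.
    targetVolumeAttributesClassName: gold
  ...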

@sunnylovestiramisu (Contributor) commented:

@carlory Discussed in the API Machinery meeting: for the VAC use case there is not much difference, from the user's perspective, between using a scope or not. However, the scope implementation is better for the API machinery system itself because it has a lot of resources to track quotas for. For new resources they recommend implementing it with a scope:

resource: PVC
scope: field-selected
scopeselectorthing: .spec.volumeattributesclassname=gold

The extra work here is to add a new scope for volumeattributesclass.
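
Concretely, the recommended scope-based quota would look roughly like the sketch below; it mirrors the earlier sketch, with the hypothetical VolumeAttributesClass scope standing in for the new scope that still needs to be added:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: gold-pvcs
spec:
  hard:
    count/persistentvolumeclaims: "10"
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: VolumeAttributesClass # hypothetical new scope
      values: ["gold"]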

@carlory (Member Author) commented Apr 18, 2024

@sunnylovestiramisu thanks!

I re-implemented it in #124360.

/close

@carlory carlory closed this Apr 18, 2024
@carlory (Member Author) commented Apr 22, 2024

The quota scopes mechanism has a bug that affects the #124360 PR.

The new e2e tests added in #124360 are failing.

(base) (⎈|kind-kind:test)➜  kubernetes git:(kep-3751-quota-2) ✗ sudo FEATURE_GATES=VolumeAttributesClass=true ./hack/local-up-cluster.sh
(base) (⎈|kind-kind:test)➜  kubernetes git:(kep-3751-quota-2) ✗ ginkgo --focus "against a pvc with different volume attributes class." -v test/e2e -- --kubeconfig=/var/run/kubernetes/admin.kubeconfig --num-nodes=0 --allowed-not-ready-nodes=-1 --e2e-verify-service-account=false

Summarizing 1 Failure:
  [FAIL] [sig-api-machinery] ResourceQuota [Feature:VolumeAttributesClass] [FeatureGate:VolumeAttributesClass] [Alpha] [It] should verify ResourceQuota's volume attributes class scope (quota set to pvc count: 1) against a pvc with different volume attributes class. [sig-api-machinery, Feature:VolumeAttributesClass, FeatureGate:VolumeAttributesClass, Alpha]
  /Users/kiki/workspace/golang/src/k8s.io/kubernetes/test/e2e/apimachinery/resource_quota.go:1344

Ran 1 of 7200 Specs in 96.243 seconds
FAIL! -- 0 Passed | 1 Failed | 0 Pending | 7199 Skipped

We may have to use this PR to add quota support for PVCs with VolumeAttributesClass, so I am reopening it.

/cc @sunnylovestiramisu @msau42 @deads2k @derekwaynecarr

/reopen

@k8s-ci-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: carlory
Once this PR has been reviewed and has the lgtm label, please ask for approval from deads2k. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@carlory carlory marked this pull request as draft April 22, 2024 09:28
@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Apr 22, 2024
@k8s-ci-robot (Contributor) commented:

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1.30 - [kep-3751] Quota Change