
[Flaking Test][sig-scheduling] SchedulerPredicates [Serial] validates resource limits of pods that are allowed to run #122283

Closed
pacoxu opened this issue Dec 13, 2023 · 7 comments
Labels: kind/flake, needs-priority, needs-triage, sig/scheduling

Comments

pacoxu commented Dec 13, 2023

Failure cluster ea66cea699bee5bc4084

https://storage.googleapis.com/k8s-triage/index.html?test=validates%20resource%20limits%20of%20pods%20that%20are%20allowed%20to%20run

Error text:
[FAILED] context deadline exceeded
In [BeforeEach] at: test/e2e/framework/framework.go:263 @ 12/11/23 16:53:47.996

STEP: Starting Pods to consume most of the cluster CPU. - test/e2e/scheduling/predicates.go:379 @ 12/12/23 04:33:47.465
Dec 12 04:33:47.465: INFO: Creating a pod which consumes cpu=5530m on Node kind-worker
Dec 12 04:33:47.475: INFO: Creating a pod which consumes cpu=5530m on Node kind-worker2
E1212 04:33:48.368351   69964 retrywatcher.go:129] "Watch failed" err="context canceled"
E1212 04:33:49.368742   69964 retrywatcher.go:129] "Watch failed" err="context canceled"
Dec 12 04:33:49.513: INFO: Failed inside E2E framework:
    k8s.io/kubernetes/test/e2e/framework/pod.WaitTimeoutForPodRunningInNamespace({0x7fd3ec76a0f0, 0xc004185200}, {0x78e4b70?, 0xc003da7380?}, {0xc0045b2480, 0x2f}, {0xc0039b0de0, 0xf}, 0x0?)
    	test/e2e/framework/pod/wait.go:459 +0x2ed
    k8s.io/kubernetes/test/e2e/framework/pod.WaitForPodRunningInNamespace(...)
    	test/e2e/framework/pod/wait.go:468
    k8s.io/kubernetes/test/e2e/scheduling.glob..func4.5({0x7fd3ec76a0f0, 0xc004185200})
    	test/e2e/scheduling/predicates.go:416 +0xe6c
STEP: removing the label node off the node kind-worker - test/e2e/framework/node/helper.go:73 @ 12/12/23 04:33:49.514
STEP: verifying the node doesn't have the label node - test/e2e/framework/node/helper.go:76 @ 12/12/23 04:33:49.544
STEP: removing the label node off the node kind-worker2 - test/e2e/framework/node/helper.go:73 @ 12/12/23 04:33:49.547
STEP: verifying the node doesn't have the label node - test/e2e/framework/node/helper.go:76 @ 12/12/23 04:33:49.565
[FAILED] Told to stop trying after 2.009s.
Expected pod to reach phase "Running", got final phase "Failed" instead.
In [It] at: test/e2e/scheduling/predicates.go:416 @ 12/12/23 04:33:49.568
< Exit [It] validates resource limits of pods that are allowed to run [Conformance] - test/e2e/scheduling/predicates.go:334 @ 12/12/23 04:33:49.568 (2.204s)
> Enter [AfterEach] [sig-scheduling] SchedulerPredicates [Serial] - test/e2e/scheduling/predicates.go:91 @ 12/12/23 04:33:49.568
< Exit [AfterEach] [sig-scheduling] SchedulerPredicates [Serial] - test/e2e/scheduling/predicates.go:91 @ 12/12/23 04:33:49.568 (0s)
> Enter [DeferCleanup (Each)] [sig-scheduling] SchedulerPredicates [Serial] - test/e2e/framework/node/init/init.go:34 @ 12/12/23 04:33:49.568
Dec 12 04:33:49.568: INFO: Waiting up to 7m0s for all (but 0) nodes to be ready
< Exit [DeferCleanup (Each)] [sig-scheduling] SchedulerPredicates [Serial] - test/e2e/framework/node/init/init.go:34 @ 12/12/23 04:33:49.573 (5ms)
> Enter [DeferCleanup (Each)] [sig-scheduling] SchedulerPredicates [Serial] - test/e2e/framework/metrics/init/init.go:35 @ 12/12/23 04:33:49.573
< Exit [DeferCleanup (Each)] [sig-scheduling] SchedulerPredicates [Serial] - test/e2e/framework/metrics/init/init.go:35 @ 12/12/23 04:33:49.573 (0s)
> Enter [DeferCleanup (Each)] [sig-scheduling] SchedulerPredicates [Serial] - dump namespaces | framework.go:218 @ 12/12/23 04:33:49.573
STEP: dump namespace information after failure - test/e2e/framework/framework.go:297 @ 12/12/23 04:33:49.573
STEP: Collecting events from namespace "sched-pred-8301". - test/e2e/framework/debug/dump.go:42 @ 12/12/23 04:33:49.573
STEP: Found 3 events. - test/e2e/framework/debug/dump.go:46 @ 12/12/23 04:33:49.577
Dec 12 04:33:49.577: INFO: At 2023-12-12 04:33:47 +0000 UTC - event for filler-pod-36f468c7-08c4-4dd6-b52f-a41567d1b7f3: {default-scheduler } Scheduled: Successfully assigned sched-pred-8301/filler-pod-36f468c7-08c4-4dd6-b52f-a41567d1b7f3 to kind-worker2
Dec 12 04:33:49.577: INFO: At 2023-12-12 04:33:47 +0000 UTC - event for filler-pod-c2ed7ea0-735c-44d4-884b-a37f7bf19cf0: {default-scheduler } Scheduled: Successfully assigned sched-pred-8301/filler-pod-c2ed7ea0-735c-44d4-884b-a37f7bf19cf0 to kind-worker
Dec 12 04:33:49.577: INFO: At 2023-12-12 04:33:47 +0000 UTC - event for filler-pod-c2ed7ea0-735c-44d4-884b-a37f7bf19cf0: {kubelet kind-worker} NodeAffinity: Predicate NodeAffinity failed
Dec 12 04:33:49.581: INFO: POD                                              NODE          PHASE    GRACE  CONDITIONS
Dec 12 04:33:49.581: INFO: filler-pod-36f468c7-08c4-4dd6-b52f-a41567d1b7f3  kind-worker2  Pending         [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2023-12-12 04:33:47 +0000 UTC  } {Ready False 0001-01-01 00:00:00 +0000 UTC 2023-12-12 04:33:47 +0000 UTC ContainersNotReady containers with unready status: [filler-pod-36f468c7-08c4-4dd6-b52f-a41567d1b7f3]} {ContainersReady False 0001-01-01 00:00:00 +0000 UTC 2023-12-12 04:33:47 +0000 UTC ContainersNotReady containers with unready status: [filler-pod-36f468c7-08c4-4dd6-b52f-a41567d1b7f3]} {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2023-12-12 04:33:47 +0000 UTC  }]
Dec 12 04:33:49.581: INFO: filler-pod-c2ed7ea0-735c-44d4-884b-a37f7bf19cf0  kind-worker   Failed          []
Dec 12 04:33:49.581: INFO: 
Dec 12 04:33:49.610: INFO: Unable to fetch sched-pred-8301/filler-pod-36f468c7-08c4-4dd6-b52f-a41567d1b7f3/filler-pod-36f468c7-08c4-4dd6-b52f-a41567d1b7f3 logs: the server rejected our request for an unknown reason (get pods filler-pod-36f468c7-08c4-4dd6-b52f-a41567d1b7f3)
Dec 12 04:33:49.661: INFO: Unable to fetch sched-pred-8301/filler-pod-c2ed7ea0-735c-44d4-884b-a37f7bf19cf0/filler-pod-c2ed7ea0-735c-44d4-884b-a37f7bf19cf0 logs: the server rejected our request for an unknown reason (get pods filler-pod-c2ed7ea0-735c-44d4-884b-a37f7bf19cf0)
Dec 12 04:33:49.665: INFO: 
Logging node info for node kind-control-plane

NodeAffinity: Predicate NodeAffinity failed
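That event is the proximate failure: the scheduler had already assigned filler-pod-c2ed7ea0-735c-44d4-884b-a37f7bf19cf0 to kind-worker, but the kubelet re-checks the pod's required node affinity at admission, and a rejection there sends the pod straight to phase "Failed" instead of leaving it Pending. A minimal sketch of that admission-side check, using the public component-helpers package (the pod and node values below are made up for illustration; as I read it, the e2e filler pods pin themselves to a node by name in roughly this way):

```go
// Sketch of the NodeAffinity check the kubelet applies at admission.
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/component-helpers/scheduling/corev1/nodeaffinity"
)

func main() {
	// A filler-style pod that pins itself to one node by name.
	pod := &v1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "filler-pod"},
		Spec: v1.PodSpec{
			Affinity: &v1.Affinity{
				NodeAffinity: &v1.NodeAffinity{
					RequiredDuringSchedulingIgnoredDuringExecution: &v1.NodeSelector{
						NodeSelectorTerms: []v1.NodeSelectorTerm{{
							MatchFields: []v1.NodeSelectorRequirement{{
								Key:      "metadata.name",
								Operator: v1.NodeSelectorOpIn,
								Values:   []string{"kind-worker"},
							}},
						}},
					},
				},
			},
		},
	}
	node := &v1.Node{ObjectMeta: metav1.ObjectMeta{Name: "kind-worker"}}

	// GetRequiredNodeAffinity + Match is (as I understand it) the same
	// helper pair the kubelet uses; Match == false surfaces as the
	// "Predicate NodeAffinity failed" event and the pod ends up Failed.
	match, err := nodeaffinity.GetRequiredNodeAffinity(pod).Match(node)
	fmt.Println(match, err) // true <nil> here; a mismatched node prints false
}
```

So the triage question is why, at admission time, the node the kubelet evaluated did not satisfy the filler pod's required affinity.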

Recent failures:

2023/12/9 21:48:25 e2e-ci-kubernetes-e2e-al2023-aws-conformance-canary
2023/12/2 17:48:19 e2e-ci-kubernetes-e2e-al2023-aws-conformance-canary

/kind flake
/sig scheduling

See it on https://testgrid.k8s.io/sig-release-master-blocking#conformance-ga-only.

k8s-ci-robot added the kind/flake and sig/scheduling labels on Dec 13, 2023
k8s-ci-robot commented:

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot added the needs-triage label on Dec 13, 2023
sanposhiho commented:

I’ve only had a little time to investigate deeply, but I found one critical bug in the NodeAffinity QueueingHint, which was implemented in this release. The failing test may be caused by it:
#122284
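
For context, a QueueingHint is a per-plugin callback, new in this release behind the SchedulerQueueingHints feature gate, that inspects each incoming cluster event and decides whether a currently-unschedulable pod is worth retrying (Queue) or can keep waiting (QueueSkip). A hint that wrongly answers QueueSkip leaves a pod stuck Pending even after the cluster changes in its favor. A toy model of the idea, with invented names and types rather than the upstream scheduler API:

```go
// Toy model of a NodeAffinity-style queueing hint; everything here is
// invented for illustration and is not the upstream scheduler framework.
package main

import "fmt"

type QueueingHint int

const (
	QueueSkip QueueingHint = iota // event cannot help this pod; keep waiting
	Queue                         // event may help; retry scheduling the pod
)

// nodeUpdateHint requeues a pod only when a node-label update could newly
// satisfy the pod's required labels.
func nodeUpdateHint(required, oldLabels, newLabels map[string]string) QueueingHint {
	matches := func(labels map[string]string) bool {
		for k, v := range required {
			if labels[k] != v {
				return false
			}
		}
		return true
	}
	if !matches(oldLabels) && matches(newLabels) {
		return Queue // the update made the node acceptable; retry now
	}
	return QueueSkip // nothing changed for this pod; skipping is an optimization
}

func main() {
	required := map[string]string{"zone": "a"}
	fmt.Println(nodeUpdateHint(required, map[string]string{}, map[string]string{"zone": "a"})) // 1 (Queue)
	fmt.Println(nodeUpdateHint(required, map[string]string{}, map[string]string{"foo": "x"}))  // 0 (QueueSkip)
}
```

The danger, and what makes bugs here surface as flakes, is that QueueSkip is purely an optimization: returning it in a case that actually needed Queue silently delays the pod until some other event (or a timeout) rescues it.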

pacoxu commented Dec 13, 2023

/cc @kubernetes/ci-signal

pacoxu commented Dec 13, 2023

/priority critical-urgent

k8s-ci-robot added the priority/critical-urgent label on Dec 13, 2023
pacoxu commented Dec 13, 2023

/remove-priority critical-urgent

as we disabled QueueingHint by default.
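
(For reference, the gate in question is SchedulerQueueingHints on kube-scheduler; with it off by default in this release, stock clusters don't exercise the buggy hint path, and anyone who wants to reproduce can re-enable it with --feature-gates=SchedulerQueueingHints=true.)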

k8s-ci-robot added the needs-priority label and removed the priority/critical-urgent label on Dec 13, 2023
pacoxu commented Feb 19, 2024

/close

as no flaking can be found with https://storage.googleapis.com/k8s-triage/index.html?test=validates%20resource%20limits%20of%20pods%20that%20are%20allowed%20to%20run

k8s-ci-robot commented:

@pacoxu: Closing this issue.

In response to this:

/close
as no flaking can be found with https://storage.googleapis.com/k8s-triage/index.html?test=validates%20resource%20limits%20of%20pods%20that%20are%20allowed%20to%20run

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
