Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LinuxOnly] is egregiously misused at least in sig-network tests #124426

Open
danwinship opened this issue Apr 21, 2024 · 4 comments
Open

[LinuxOnly] is egregiously misused at least in sig-network tests #124426

danwinship opened this issue Apr 21, 2024 · 4 comments
Assignees
Labels
sig/network Categorizes an issue or PR as relevant to SIG Network. sig/windows Categorizes an issue or PR as relevant to SIG Windows. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@danwinship
Copy link
Contributor

In test/e2e/network, there are currently 45 tests marked [LinuxOnly], of which it seems that 37 are incorrect or at least dubious:

The seemingly-correct ones:

  • 5 use SCTP, which AIUI the Windows kernel does not implement and does not plan to implement.
  • 3 test partial DNS names (kubernetes.default.svc) which apparently are not supported by the Windows resolver

cc @claudiubelu @knabben @MikeZappa87
/sig windows
/sig network

@k8s-ci-robot k8s-ci-robot added sig/windows Categorizes an issue or PR as relevant to SIG Windows. sig/network Categorizes an issue or PR as relevant to SIG Network. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 21, 2024
@knabben
Copy link
Member

knabben commented Apr 23, 2024

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 23, 2024
@claudiubelu
Copy link
Contributor

Hello,

Indeed, there are quite a few tests which are marked as [LinuxOnly], and since then there were some tests which were fixed and / or had this label removed over time, not necessarely all networking-related [1][2][3][4][5][6] (may be others as well). But out of those 45 tests you mentioned, not all of them are marked as [Conformance], or at least as far as I can tell, I see only 8:

ubuntu@ubuntu:~/workdir/kubernetes$ yq '.[] | .codename | select(contains("LinuxOnly")) | select(contains("network"))' test/conformance/testdata/conformance.yaml
[sig-network] DNS should resolve DNS of partial qualified names for services [LinuxOnly] [Conformance]
[sig-network] HostPort validates that there is no conflict between pods with same hostPort but different hostIP and protocol [LinuxOnly] [Conformance]
[sig-network] Networking Granular Checks: Pods should function for node-pod communication: http [LinuxOnly] [NodeConformance] [Conformance]
[sig-network] Networking Granular Checks: Pods should function for node-pod communication: udp [LinuxOnly] [NodeConformance] [Conformance]
[sig-network] Services should be able to switch session affinity for NodePort service [LinuxOnly] [Conformance]
[sig-network] Services should be able to switch session affinity for service with type clusterIP [LinuxOnly] [Conformance]
[sig-network] Services should have session affinity work for NodePort service [LinuxOnly] [Conformance]
[sig-network] Services should have session affinity work for service with type clusterIP [LinuxOnly] [Conformance]

Also worth mentioning that there may be more tests which are being skipped on Windows through something like SkipIfNodeOSDistroIs("windows"), I'm not sure if they're included in your count or not. I remember that we were not allowed to add skips to [Conformance] tests, which is why that label was necessary in the first place. This label was not typically added to non-conformance tests, but IMO, it's more useful to have it as a label; it makes it easier to grep which tests are supposed to be Linux-only or not. As for the number of Linux-only networking tests, a more complete list of tests could be useful, so we can check whether or not they're mislabeled or not, or which tests exactly we're talking about.

Also note that non-Conformance tests are not typically tested as the Conformance ones, so I can't say much about them passing or not. There were some efforts in the past for promoting more networking tests to Conformance [7], but that effort seems discontinued at this point.

Now, regarding the tests you've mentioned:

  • SessionAffinity: Probably requires more in-depth investigation. But from a quick look, it doesn't work [8] (marked all networking tests as [Conformance] so they'd be run by the Windows CI):
Kubernetes e2e suite: [It] [sig-network] Services should have session affinity work for NodePort service [Conformance] 2m47s
{ failed [FAILED] Affinity should hold but didn't.
In [It] at: k8s.io/kubernetes/test/e2e/network/service.go:266 @ 04/22/24 19:31:21.999
}

Kubernetes e2e suite: [It] [sig-network] Networking Granular Checks: Services should function for client IP based session affinity: udp [Conformance]   1m6s
{ failed [FAILED] Unexpected endpoints return: map[netserver-0:{} netserver-1:{}], expect 1 endpoints
In [It] at: k8s.io/kubernetes/test/e2e/network/networking.go:447 @ 04/22/24 19:34:02.49
}
  • dual-stack: Indeed, according to James [9], these tests could be enabled for WS 2022 jobs but not for WS 2019 jobs. I assume there should be a dedicated dual-stack WS 2022 job for this, which would also require e2e to receive a dual-stack-related config flag. From what I can see, they're labeled as [Feature:IPv6DualStack], so we can use that label to exclude it from other jobs.

  • UDP support in agnhost: not sure which tests are those. should function for node-pod communication: udp?

  • hostNetwork tests: agreed. Though, I thought there were more than 2 tests.

  • hostPorts test seems to be failing on Windows [7], may require further investigation:

Kubernetes e2e suite: [It] [sig-network] HostPort validates that there is no conflict between pods with same hostPort but different hostIP and protocol [Conformance]   45s
{ failed [FAILED] Failed to connect to exposed host ports
In [It] at: k8s.io/kubernetes/test/e2e/network/hostport.go:161 @ 04/22/24 19:29:22.484
}
  • IPv6 test: I assume you refer to should provide Internet connection for containers? It might require a bit more investigation. It seems that there's a failure [10]: forward host lookup failed: h_errno 11001: HOST_NOT_FOUND

  • SCTP: indeed, it seems that Windows doesn't have it [11]. But there seem to be some 3rd party implementations. WDYT, should we consider one of those and say it's officially supported on Kubernetes for Windows?

  • partial DNS names: yes, that is correct. You may either use only the FQDNs, or only the hostname part.

[1] #101063
[2] #72729
[3] #97045
[4] #85453
[5] #78731
[6] #75591

[7] #73425
[8] https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/124447/pull-kubernetes-e2e-capz-windows-master/1782468233807269888
[9] #100870 (comment)
[10] https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/124447/pull-kubernetes-e2e-capz-windows-master/1782725964745150464
[11] https://learn.microsoft.com/en-us/answers/questions/778329/sctp-driver

@danwinship
Copy link
Contributor Author

But out of those 45 tests you mentioned, not all of them are marked as [Conformance]

Sure, but they all involve documented features that users might want to use. And as you said, we do eventually want to move more tests toward Conformance, so Windows really ought to be aiming to pass everything that isn't truly linux-specific.

Also worth mentioning that there may be more tests which are being skipped on Windows through something like SkipIfNodeOSDistroIs("windows"), I'm not sure if they're included in your count or not.

No... I was only looking at [LinuxOnly].

  • UDP support in agnhost: not sure which tests are those. should function for node-pod communication: udp?

No, sorry, I should have listed these out. The NetworkPolicy between server and client using UDP tests in test/e2e/network/netpol/network_policy.go. (In fact, they are both [LinuxOnly] and SkipIfNodeOSDistroIs("windows")!)

  • hostNetwork tests: agreed. Though, I thought there were more than 2 tests.

Perhaps the others are skipped rather than labeled.

  • IPv6 test: I assume you refer to should provide Internet connection for containers?

Yes

  • SCTP: indeed, it seems that Windows doesn't have it [11]. But there seem to be some 3rd party implementations. WDYT, should we consider one of those and say it's officially supported on Kubernetes for Windows?

SCTP isn't even implemented by most Linux network plugins and it seems like sig-windows already has too much to do anyway, so it seems reasonable to just say that it's not supported on Windows. If someone wants to get it to be officially-supported then they can do the work...

@sebsoto
Copy link

sebsoto commented Apr 30, 2024

/assign @sebsoto

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/network Categorizes an issue or PR as relevant to SIG Network. sig/windows Categorizes an issue or PR as relevant to SIG Windows. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: No status
Development

No branches or pull requests

5 participants