Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.15 Backports 2024-05-06 #32384

Merged
merged 20 commits into from
May 8, 2024
Merged

v1.15 Backports 2024-05-06 #32384

merged 20 commits into from
May 8, 2024

Conversation

pippolo84
Copy link
Member

@pippolo84 pippolo84 commented May 6, 2024

julianwiedmann and others added 20 commits May 6, 2024 16:43
[ upstream commit feaf6c7 ]

inherit_identity_from_host() already knows how to handle
MARK_MAGIC_PROXY_EGRESS_EPID and extract the endpoint ID from the mark. Use
it to condense the code path, similar to how to-container looks like.

Also fix up the drop notification, which currently uses the endpoint ID in
place of the source security identity.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 446cf56 ]

Use a single send_trace_notify() statement, with parameters that can be
trivially optimized out in the from-netdev path.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 58b74f5 ]

The blamed commit anticipated the execution of the watchdog in charge of
restarting the etcd connection to a remote cluster in case of errors.
However, this can lead to a panic if the etcd client cannot be created
(e.g., due to an invalid config file), as in that case the returned
backend is nil, and the errors channel cannot be accessed.

Let's push again this logic below the error check, to make sure that
the backend is always valid at that point. Yet, let's still watch
for possible reconnections during the initial connection establishment
phase, so that we immediately restart it in case of issues. Otherwise,
this phase may hang due to the interceptor preventing the establishment
to succeed, given that it would continue returning an error.

Fixes: 174e721 ("ClusterMesh: validate etcd cluster ID")
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 2136418 ]

[ backporter's notes: values.schema.json has been introduced in #30631
and thus is not present in v1.15. Changes to that file have been
ignored. ]

Starting from k8s 1.30 together with Ubuntu 24.04, Cilium fails to
initialize with the error:

```
Error: applying apparmor profile to container 43ed6b4ba299559e8eac46a32f3246d9c54aca71a9b460576828b662147558fa: empty localhost AppArmor profile is forbidden
```

This commit adds the "Unconfined" as default, where users can overwrite
it with any of the AppArmor profiles available on their environments, to
all the pods that have the "container.apparmor.security.beta.kubernetes.io"
annotations.

Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 2949b16 ]

Relates: #19764
Signed-off-by: Tam Mach <tam.mach@cilium.io>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 8397e45 ]

Unlike every other identity, the set of labels for the reserved:host
identity is mutable. That means that rules should not cache matches for
this identity.

So, clean up the code around determining matches.

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit e7db879 ]

Context: During IPsec upgrades, we may have to temporarily remove some
XFRM states due to conflicts with the new states and because the Linux
API doesn't enable us to perform this atomically as we do for XFRM
policies.

This commit moves this removal logic to its own function. That logic
will grow in subsequent commits as I'll add debugging information to the
log message.

This commit doesn't make any functional changes.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit bba016e ]

Context: During IPsec upgrades, we may have to temporarily remove some
XFRM states due to conflicts with the new states and because the Linux
API doesn't enable us to perform this atomically as we do for XFRM
policies.

This temporary removal should be very short but can still cause drops
under heavy throughput. This commit logs the duration of the removal so
we can validate that it's actually always short and estimate the impact
on packet drops.

Note the log message will now be displayed only once the XFRM state is
re-added, instead of when it's removed like before.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 76d6670 ]

Context: During IPsec upgrades, we may have to temporarily remove some
XFRM states due to conflicts with the new states and because the Linux
API doesn't enable us to perform this atomically as we do for XFRM
policies.

This temporary removal should be very short but can still cause drops
under heavy throughput. This commit logs how many such drops happened.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit dbcdd7d ]

Whenever AKS stopped supporting a particular version of AKS, we had to
manually remove it from all stable branches. Now instead of that, we
will dynamically check if it's supported and only then run the test.

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 29a340e ]

This commit corrects the MTU that is used by the cilium-cni plugin when
creating routes for CIDRs received from ENI, Azure or Alibaba Cloud.

The cilium-agent daemon returns two MTUs to the cilium-cni plugin: a
"device" MTU, which is used to set the MTU on a Pod's interface in
its network namespace, and a "route" MTU, which is used to set the MTU
on the routes created inside the Pod's network namespace that handle
traffic leaving the Pod. The "route" MTU is adjusted based on the Cilium
configuration to account for any configured encapsulation protocols,
such as VXLAN or WireGuard. Before this commit, when ENI, Azure or Alibaba
Cloud IPAM was enabled, the routes created in a Pod's network namespace
were using the "device" MTU, rather than the "route" MTU, leading to
fragmentation issues.

Signed-off-by: Ryan Drew <ryan.drew@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit f1925b5 ]

There is no reason why the log level of "Timed out waiting for datapath
updates of FQDN IP information" log message should be an error. Change it
to a warning instead.

Add a reference to --tofqdns-proxy-response-max-delay parameter to make
this warning actionable.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
…ages

[ upstream commit b1fae23 ]

When tail_nodeport_rev_dnat_ingress_ipv*() is called by from-container to
apply RevDNAT for a local backend's reply traffic, it drops all unhandled
packets. This would mostly be traffic where the pod-level CT entry was
flagged as .node_port = 1, but there is no corresponding nodeport-level
CT entry to provide a rev_nat_index for RevDNAT. Essentially an unexpected
error situation.

But we didn't consider that from-container might also see ICMP error
messages by a service backend, and the CT_RELATED entry for such traffic
*also* has the .node_port flag set. As we don't have RevDNAT support for
such traffic, there's no good reason to send it down the RevDNAT tailcall.
Let it continue in the normal from-container flow instead. This avoids
subsequent drops with DROP_NAT_NO_MAPPING.

Alternative solutions would be to tolerate ICMP traffic in
tail_nodeport_rev_dnat_ingress_ipv*(), or not propagate the .node_port flag
into CT_RELATED entries.

Fixes: 6936db5 ("bpf: nodeport: drop reply by local backend if revDNAT is skipped")
Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit d74bc1c ]

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 01c3b83 ]

[ backporter's notes: The workflow looks different in v1.15, the option
to bump the timeout has been applied to the wait for cluster mesh to be
ready after rolling out Cilium in cluster2. ]

The KPR matrix entry recently started failing quite frequently due to
cilium status --wait failing to complete within the timeout after the
upgrading Cilium to the tip of main. Part of the cause seems to be
cilium/test-connection-disruption@619be5ab79a6, which made the connection
disruption test more aggressive. In turn, causing more CPU contention
during endpoint regeneration, which now requires longer. Hence, let's
bump the timeout, to avoid failing too early due to the endpoint
regeneration not having terminated yet.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
[ upstream commit a682a62 ]

Users of this library need Cilium to both check
restore and updated DNS rules for the new PortProto
version. Otherwise upgrade incompatibilities exist
between Cilium and programs that utilize this library.

Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 0e07aa5 ]

The documentation for the Host Firewall contains a number of invocations
of the cilium-dbg command-line tool, to run on a Cilium pod. For
convenience, the full, standalone command starting with "kubectl -n
<namespace> exec <podname> -- cilium-dbg ..." is provided everytime. But
this makes for long commands that are hard to read, and that sometimes
require scrolling on the rendered HTML version of the guide.

Let's shorten them by adding an alias for the recurrent "kubectl exec"
portion of the commands.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 21d5f01 ]

This commit is a mix of various minor improvements for the host firewall
guide and the host policies documentation:

- Fix some commands in troubleshooting steps:
    - "cilium-dbg policy get" does not show host policies
    - "kubectl get nodes -o wide" does not show labels
    - Add the policy audit mode check

- (Attempt to) improve style

- Mark references with the ":ref:" role in a consistent way

- Improve cross-referencing (in particular, the text displayed for the
  K8s label selector references was erroneous)

- Strip trailing whitespaces

Signed-off-by: Quentin Monnet <qmo@qmon.net>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 642921d ]

Make sure that users of the Host Firewall are aware of the current
limitations, or rather the known issues, for the feature.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 5dd2f14 ]

Help users reading the Host Firewall tutorial to find the sections
related to troubleshooting host policies or listing known issues for the
Host Firewall.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
@pippolo84 pippolo84 added kind/backports This PR provides functionality previously merged into master. backport/1.15 This PR represents a backport for Cilium 1.15.x of a PR that was merged to main. labels May 6, 2024
@pippolo84
Copy link
Member Author

/test-backport-1.15

@pippolo84 pippolo84 marked this pull request as ready for review May 6, 2024 16:06
@pippolo84 pippolo84 requested review from a team as code owners May 6, 2024 16:06
Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My PR looks good. Thanks!

Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Good, looks my PR.

Copy link
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My commits look good, thanks!

Copy link
Member

@sayboras sayboras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for my commit and thanks a lot ✔️

Copy link
Contributor

@learnitall learnitall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Copy link
Member

@nebril nebril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed #32199 backport, LGTM

@julianwiedmann julianwiedmann added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 8, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 8, 2024
@ldelossa ldelossa merged commit 8e96a3d into v1.15 May 8, 2024
231 checks passed
@ldelossa ldelossa deleted the pr/v1.15-backport-2024-05-06-04-43 branch May 8, 2024 14:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.15 This PR represents a backport for Cilium 1.15.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet