IPv6_rpfilter=yes breaks IPv6 connectivity over bridges when br_netfilter is in use #1235
Hrm. I don't think this is unexpected. I'm guessing the difference is the rp_filter sysctl. I'm not sure we can do anything about this on the firewalld side.
There is no IPv6 equivalent of the rp_filter sysctl. However, while IPv4 rp_filter=1 does not implicitly affect bridge traffic, the firewalld version of the rpfilter/fib rule does, and bridge forwarding is exactly where routing-table-based reverse path filtering does not make sense.
Right.
It affects IPv6, yes.
Agree. You're also explicitly asking for something that "does not make sense", i.e. layer-3 filtering of bridged packets. As a consequence, you also get hit with the IPv6 reverse path filtering checks. There is no way to opt into one but not the other; you get both. Your options are:
As you said, there is no solution to this with the nftables backend. That's largely because nftables has true
Can you clarify? I think you're suggesting the
Upper layer firewalling on bridge traffic is not something that "does not make sense". This is how "transparent firewalls", such as libvirt nwfilter, are implemented. It is specifically the reverse path filtering that does not make sense to apply to bridge traffic.
Please note that the direct rules can work with
I didn't say that there was "no solution" to this problem, nor do I believe that there is actually "no solution". For example, would you be open to a patch implementing the same workaround as a built-in option, but with user-configurable marks/masks? However, if firewalld "won't" rely on iptables-to-nftables wrappers or marks, then I suppose this issue "should" be brought to the attention of upstream nftables/kernel developers in pursuit of a 100% nftables-native and mark-free solution.
I am not suggesting that. I'm saying yes to "firewalld version of rpfilter affects bridged traffic if using br_netfilter". But I'm also saying that I think that is expected and that you'll have to set
Okay. Then you can disable it.
Right, it would work with the direct interface. That works regardless of which backend is in use.
No to using marks. Firewalld deliberately avoids using them to not conflict with users and other entities.
nftables already has a solution. They'll say to use the `bridge` family.
Alright, I thought that might be the case. It's too bad that
Good point. That is a nice improvement over the original iptables+br_netfilter way of doing it.
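For the curious, nftables' native `bridge` family hooks bridged frames directly, without `br_netfilter` in the picture. A minimal sketch; the table and chain names here are arbitrary examples of mine, not anything firewalld creates:

```
# Hypothetical example: observe bridged IPv6 frames natively, no br_netfilter.
nft add table bridge mybrfilter
nft add chain bridge mybrfilter forward '{ type filter hook forward priority 0; policy accept; }'
# Count IPv6 frames crossing any bridge (replace "counter" with a verdict to filter):
nft add rule bridge mybrfilter forward ether type ip6 counter
```

Because these chains hook the bridge data path itself, routed traffic and bridged traffic stay cleanly separated, which is exactly what the iptables+br_netfilter approach could not offer.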
This exact same issue bit me, too, and it cost me an hour to find this thread and the solution. It would have cost me even more time had this issue not been opened. I also think that this is a departure from the IPv4 behavior; GigabyteProductions describes the issue exactly (IPv4 packets with rp_filter=1 pass the bridge without issues, but the same packets with IPv6 do not pass firewalld's RP filter).
There's also no way to change this from the CLI?
It's a departure because the IPv4 rp_filter sysctl is implemented in the IP stack, not netfilter. firewalld's IPv6 rpfilter is a netfilter rule. So when something loads br_netfilter, bridged traffic hits that rule as well.
No. It's not exposed in the CLI; only in firewalld.conf.
IMO, the only "user friendly" thing we can do here is have firewalld detect if br_netfilter is loaded.
I faced critical issues with docker + firewalld: every packet was dropped during a docker pull, with a kernel message
You can check by seeing if the br_netfilter module is loaded.
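For example, a minimal shell check (`/proc/modules` is the same data `lsmod` formats):

```shell
# Print whether the br_netfilter module is currently loaded.
brnf_status() {
  if grep -qw '^br_netfilter' /proc/modules 2>/dev/null; then
    echo 'loaded'
  else
    echo 'not loaded'
  fi
}
brnf_status
```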
I'm using the latest Docker on Debian 12. Yes, the module is loaded.
If it is loaded, and you haven't taken any special steps to set /proc/sys/net/bridge/bridge-nf-call-*tables to 0, then your bridge traffic is going through "ip6tables"/nftables as if it is being routed, which is why the IPv6 rpfilter rule is affecting your bridge traffic. Disabling the rpfilter rule is one approach, but it can affect network security. If that is unacceptable, you may want to consider the mark-oriented direct rules shown earlier in this thread. If you're not concerned about the Docker containers communicating with the Docker host, another approach may be to ensure that the bridge in question itself has an address in each IPv6 subnet that you're using, so the frames pass the reverse path checks imposed by the rpfilter rule.
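To see whether that diversion is active on a given host, one might check the bridge-nf sysctls directly. A sketch; if the files are absent, br_netfilter isn't loaded and bridged frames bypass ip6tables entirely:

```shell
# Report the br_netfilter call-hooks state for IPv4 and IPv6.
check_bridge_nf() {
  for f in /proc/sys/net/bridge/bridge-nf-call-iptables \
           /proc/sys/net/bridge/bridge-nf-call-ip6tables; do
    if [ -r "$f" ]; then
      printf '%s = %s\n' "$f" "$(cat "$f")"
    else
      printf '%s: not present (br_netfilter not loaded)\n' "$f"
    fi
  done
}
check_bridge_nf
```

A value of `1` means bridged traffic of that protocol is being handed to ip(6)tables/nftables as if routed.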
I forgot to mention that I only assumed the module was loaded because its functionality is required for your configuration. If that is not the case, preventing the module from loading, or disabling its functionality through those sysctls, will also avoid the issue.
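If the module really isn't needed, either approach can be made persistent. The file names below are arbitrary examples, and note that the `net.bridge.*` sysctl keys only exist while the module is loaded:

```
# /etc/modprobe.d/disable-br-netfilter.conf -- prevent loading entirely:
blacklist br_netfilter
install br_netfilter /bin/false

# ...or /etc/sysctl.d/99-bridge-nf.conf -- keep the module, disable its hooks:
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-arptables = 0
```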
I haven't changed any config regarding bridge traffic, nor anything to activate the netfilter module. IPv6 is enabled on the host, but my docker app (Nextcloud) isn't actually using it. I'm unsure if I should completely disable IPv6 on the host instead of disabling rpfilter. I wanted to try, but since IPv6 support is still unstable in Nextcloud (and experimental in Docker), maybe that's the safest option.
I searched in the /etc/modules* and /lib/modules* load directories, but could not find any setting related to the netfilter module. I don't understand which config file Docker uses (if it's indeed due to Docker) to load this kernel module. I struggle to understand all the security implications of disabling IPv6_rpfilter in /etc/firewalld/firewalld.conf; could someone clarify this, please?
Here is an article demonstrating the security issue of disabling reverse path filtering, from an IPv4 point of view: https://www.theurbanpenguin.com/rp_filter-and-lpic-3-linux-security/

Without reverse path filtering, a host can receive a packet with a spoofed source address on an untrusted network and respond to it on the trusted network. This isn't to say that it is easy to establish a bidirectional TCP connection from outside a trusted network, but it may expose things like internal DNS servers to attacks from the outside.

The module will be loaded automatically when certain ip/ip6tables rules are added. I'm not sure if that's what's happening with Docker; I don't have a lot of Docker experience myself. Are you using a bridge network to connect containers directly to the host network? Last time I touched it, the out-of-the-box behavior was for the Docker host to set up a bridge, assign its own internal IP addresses, and attach containers to it with veth pairs; the purpose of the bridge was actually for routing between the internal Docker network and the host network. The only reason I can think of for Docker to load br_netfilter in this configuration is to firewall the containers from each other, because br_netfilter is unnecessary in the context of routing alone. Can you clarify whether your problem is that containers can't talk to each other, or that they can't talk to the IPv6 Internet?

Personally, I think disabling IPv6 on the host machine is also overkill, but I can see how it may be a simpler solution for your use case than learning how IPv6 works. Also note that, according to this documentation, you probably have to restart docker to reload your network configuration.
I'm guessing that if containers aren't restarted, they'll retain the IPv6 configuration they already had: https://github.com/nextcloud/all-in-one/blob/main/docker-ipv6-support.md

Ultimately, if turning off IPv6_rpfilter in firewalld restores IPv6 connectivity, then you must have IPv6 frames being switched from one bridge slave to another, on a bridge that doesn't have an IP address in the same subnet.
If you'd like some help identifying the details of the Docker networking and exactly how firewalld is involved in the issue, and you aren't afraid to show internal IP/MAC addresses for your system, please paste the output of
Thanks for the explanation. Regarding the app, it's strange because, while I saw IPv6 DNS query errors in the log (though I guess it falls back to IPv4), everything worked. I must say I found this docker + firewall stuff to be a nightmare: first I discovered docker + ufw was simply doing nothing, then I switched to firewalld, trying to follow all the docs and tutorials, using a default/recommended config and everything, but obviously something is wrong, and I can't tell where. The Docker docs don't really help.
I don't really feel comfortable posting all those addresses here, but thanks for the help.
Well, after re-installing firewalld on my server, I found out some interesting things. I still believe this is also a Docker issue, but it may be linked to this one. I'll try to keep it short:
So it's strange that the kernel is dropping hundreds of packets when I use docker pull, while everything apparently works as expected (if you don't log denied packets).
Those are IPv6 addresses, but I also have other rejected packets on some IPv4 ones, port 5355. After disabling the firewalld log again, I can see more clearly the kernel messages that appear while using docker pull:
I searched for this warning and found out that the default route cache size (4096) isn't enough for the modern web. But it doesn't look like a big problem, except for performance. FYI, this bare-metal server is running Debian 12, with firewalld 1.3.3 and Docker 25. Is it possible that something is wrong in my IPv6 config, or is it due to this issue and a change since firewalld 1.3? Edit: br_netfilter is also loaded on the Ubuntu server.
Here is one of the stack traces of the system failure that appears in the logs after the eth interface has been reset.
This line stands out to me:
Flooding the kernel log buffer is a great way to prevent the kernel from doing other things, and you're doing exactly that with the denied-packet logging. That being said, I didn't look into your "Route cache is full" error until now, because I assumed it was unrelated to hanging IPv6 connections. It looks like you should only have IPv6 route cache entries due to PMTU exceptions [1]. This issue is unrelated to the rpfilter behavior discussed here.
Background:

IPv4 rpfilter happens in the kernel IP stack. The Linux kernel doesn't do reverse path filtering in the IPv6 stack at all, so IPv6 rpfilter has to be implemented as a firewall rule that drops the problematic traffic before it reaches the host IPv6 stack. firewalld implements this as the ip6tables rule

    -A PREROUTING -m rpfilter --invert -j DROP

or the nft rule

    meta nfproto ipv6 fib saddr . mark . iif oif missing drop

Ethernet bridges under Linux do not normally (or at least by default) operate as transparent firewalls. In order for bridged IPv6 traffic to be filtered by the kernel, the `br_netfilter` module must be loaded, and either the system-wide `/proc/sys/net/bridge/bridge-nf-call-ip6tables` tunable must be set to `1`, OR the interface-specific `/sys/devices/virtual/net/${interface_name}/bridge/nf_call_ip6tables` setting must be `1`. Right now, the kernel sets the system-wide tunable to `1` upon loading `br_netfilter`.

The problem:
The above background creates a difference in functionality between IPv4 and IPv6 crossing a bridge.
For IPv4, reverse path filtering is only relevant for traffic directed at the host kernel (frames destined for the bridge interface's MAC address). IPv4 hosts can communicate with each other over the bridge as long as their packets aren't dropped in the filter/FORWARD chain. Multiple subnets may be used without requiring the bridge host to have prior knowledge of each subnet.
For IPv6, the firewall rule `-m rpfilter --invert -j DROP` (or its nft equivalent) matches all IPv6 traffic not matched by a route on the bridge host, which happens to be ALL traffic when the bridge interface is intentionally not assigned an IP address. For example, this completely breaks IPv6 between two VMs communicating over a libvirt "isolated" network, despite libvirt adding a `-A LIBVIRT_FWX -i virbr0 -o virbr0 -j ACCEPT` rule.

Note that `br_netfilter` is loaded by libvirt if any libvirt nwfilter is in use. Also note that a completely working IPv6 network will immediately come to a halt on a host if nwfilter is added for the first time to an existing libvirt host and the bridge interface on the host doesn't happen to have an IPv6 address in the subnets in use (this is how we discovered this bug...).

Discussion:
Of course, we could disable the rpfilter rule altogether by setting `IPv6_rpfilter=no`, but this is not an acceptable answer, as it has negative security implications in other contexts.

I was hoping to modify the rule to add `-m physdev ! --physdev-is-bridged` to distinguish between frames that would cross the bridge and packets that would be forwarded between subnets, but `--physdev-is-bridged` doesn't know the difference in PREROUTING, and `-m rpfilter` only works in PREROUTING. Furthermore, while I don't know if this same limitation applies to nft's equivalent `fib` rule, nft has no native equivalent for physdev checks, requiring the iptables `rpfilter` version anyway...

I'm also guessing firewalld wants to stay away from utilizing marks, since there's no telling how else they may already be used or matched by the system. That being said, I am able to work around the issue and get behavior similar to IPv4's rp_filter=1 using `IPv6_rpfilter=no` and the following firewalld direct rules:

Here's the permanent `/etc/firewalld/direct.xml` representation:

Here's the relevant part of `nft list ruleset` for the above:

Even though the above workaround is written in iptables format, it works with both `FirewallBackend=iptables` and `FirewallBackend=nftables` (at least on Rocky Linux 8).