Galaxy102/podman-el-network-issue-showcase

Podman Enterprise Linux Network Error Showcase

I ran into an intermittent connection issue in Podman containers that affected only Enterprise Linux installations. Sometimes a container was unable to connect to another, totally unrelated host (even google.com). On closer inspection, the TCP handshake turned out to be incomplete: SYN, SYN-ACK, but no ACK:
[Screenshot: the incomplete TCP handshake]

This issue only affects communication under the following circumstances:

  • You use Enterprise Linux (Alma, Rocky, RHEL)
  • You have two bridge networks, net0 and net1 (name arbitrary)
  • You have one container on net0 which publishes ports
  • You have a second container on net0 and net1
  • The issue occurs for any outbound communication that leaves on the network interface corresponding to net0. This part is random: sometimes the packets leave on one interface, sometimes on the other.
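
The conditions above can be approximated by hand with a few Podman commands. The following is only a sketch, not the repository's actual setup script: the image, the container names `publisher` and `client`, and the published port are placeholders chosen for illustration. It only builds and prints the command lines, so nothing is executed.

```python
from __future__ import annotations

import shlex

# Placeholder image, not taken from the repository.
IMAGE = "docker.io/library/alpine:latest"


def scenario_commands(image: str = IMAGE) -> list[list[str]]:
    """Return podman invocations that approximate the scenario described above."""
    return [
        # Two bridge networks; the names are arbitrary.
        ["podman", "network", "create", "net0"],
        ["podman", "network", "create", "net1"],
        # First container: attached to net0 only, publishes a port.
        ["podman", "run", "-d", "--name", "publisher",
         "--network", "net0", "-p", "8080:80", image, "top"],
        # Second container: attached to both net0 and net1.
        ["podman", "run", "-d", "--name", "client",
         "--network", "net0,net1", image, "top"],
    ]


# Dry run: print the commands instead of executing them.
for argv in scenario_commands():
    print(shlex.join(argv))
```

To actually run them, each list could be passed to `subprocess.run(argv, check=True)` on a host with Podman installed.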

This scenario can be set up with vagrant up rocky93. The other VMs are included for reference and testing purposes; I can confirm that openSUSE, Fedora and Ubuntu work as-is.

After some time you should see the result of the test stage:
[Screenshot: result of the test stage]

The output shows that communication works on one interface but not on the other. Because of containers/podman#12850, the affected interface name may differ.

After comparing the sysctls of Rocky 9.3 and Ubuntu 23.10, I was able to isolate a few interesting differences. With a bit of trial and error I narrowed it down to net.ipv4.conf.default.rp_filter, which is set to 1 in Enterprise Linux and 2 in Ubuntu. This sysctl controls reverse path filtering (as far as I understood RFC3704 from a few minutes of reading). With 1 (strict mode), the kernel drops an incoming packet unless the interface it arrived on is also the best route back to the packet's source address. With 2 (loose mode), the packet is accepted as long as its source address is reachable through any interface. The value 0 disables the check entirely.
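
To make the difference between the three values more concrete, here is a toy model of the check in Python. It is a simplification for illustration, not the kernel's implementation, and the routing table (`ROUTES`), interface names and addresses are made up rather than taken from the showcase VMs.

```python
from ipaddress import ip_address, ip_network

# Hypothetical routing table: (destination network, outgoing interface).
ROUTES = [
    ("10.89.0.0/24", "br-net0"),
    ("10.89.1.0/24", "br-net1"),
    ("0.0.0.0/0", "eth0"),
]


def best_route(routes, addr):
    """Return the interface of the longest-prefix route matching addr, or None."""
    target = ip_address(addr)
    matches = [(ip_network(net), dev) for net, dev in routes
               if target in ip_network(net)]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


def rp_filter_accepts(mode, routes, src, in_dev):
    """Decide whether a packet from src arriving on in_dev passes the check."""
    if mode == 0:
        return True                  # no source validation
    back_dev = best_route(routes, src)
    if mode == 1:
        return back_dev == in_dev    # strict: must arrive on the best reverse path
    if mode == 2:
        return back_dev is not None  # loose: source reachable via any interface
    raise ValueError("rp_filter must be 0, 1 or 2")


# A packet whose source belongs to net1 but which arrives on the net0 bridge.
for mode in (0, 1, 2):
    print(mode, rp_filter_accepts(mode, ROUTES, "10.89.1.5", "br-net0"))
```

In this model only the strict mode rejects the packet, because the best route back to 10.89.1.5 is br-net1 rather than the interface it arrived on.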

Now if you uncomment line 4 of the VM setup script and run vagrant provision rocky93, you will see everything works as expected:
[Screenshot: result of the test stage after the change]
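
Outside of the Vagrant setup, the setting can be inspected and changed through /proc/sys. The sketch below is not the content of the VM setup script (which is not reproduced here). It assumes the documented kernel behaviour that the value used for an interface is the maximum of conf/all and conf/<interface>, and that conf/default is applied to interfaces created afterwards. The interface name `podman1` in the usage note is hypothetical, and the `root` parameter exists only so the functions can be pointed at a test directory.

```python
from pathlib import Path


def _rp_filter_path(name, root="/proc/sys"):
    """Path of the rp_filter file for 'all', 'default' or an interface name."""
    return Path(root) / "net" / "ipv4" / "conf" / name / "rp_filter"


def effective_rp_filter(iface, root="/proc/sys"):
    """Return the value the kernel applies on iface: max(conf/all, conf/iface)."""
    all_value = int(_rp_filter_path("all", root).read_text().strip())
    iface_value = int(_rp_filter_path(iface, root).read_text().strip())
    return max(all_value, iface_value)


def set_rp_filter(name, value, root="/proc/sys"):
    """Write a new rp_filter value; needs root privileges on a real system."""
    if value not in (0, 1, 2):
        raise ValueError("rp_filter must be 0, 1 or 2")
    _rp_filter_path(name, root).write_text(f"{value}\n")
```

As root, `set_rp_filter("default", 2)` has the same effect as `sysctl -w net.ipv4.conf.default.rp_filter=2`, and `effective_rp_filter("podman1")` then shows what applies to a given bridge interface. Already existing interfaces keep their own value until it is changed explicitly.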
