
bpf_redirect_map/bpf_redirect performance using generic xdp #60

Open
JustusvonderBeek opened this issue Mar 1, 2021 · 7 comments

@JustusvonderBeek

I am currently implementing a program that modifies packets in the egress direction using XDP (in generic mode, because the interfaces do not support driver mode). To do this, I send packets into a virtual interface and redirect them (using eBPF TC) towards the ingress direction of the interface on which I want to modify the packets (see image below). To transmit the packets, the XDP program then redirects the modified packets back out of the same interface in the egress direction. I tested both bpf_redirect_map and bpf_redirect for this second redirect. I know that in my case it would probably be easier to use eBPF TC for the modification, but I found an issue with this setup. The setup looks like the following:

(Diagram: PerformanceEgress)

The first redirect (step 3) works fine and achieves the expected performance numbers. But the second redirect at step 4 (that is, from the interface we modified the packets on back out of the same interface in the egress direction, using XDP and bpf_redirect_map / bpf_redirect) always drops around 70% of the incoming packets. That is, for 1 Gbit/s of traffic (1500 B packets), only around 300 Mbit/s is achieved. The interesting part is that the 70% appear to be consistent: when I send 4 Gbit/s of traffic (1500 B packets) into the virtual interface, I achieve 1 Gbit/s on the physical interface in the egress direction (step 5). Therefore I know that the machine is, in principle, capable of redirecting this amount of traffic.
I could reproduce the issue using only the eBPF TC redirect towards the modifying interface (step 3) and a minimal XDP program that redirects the packets directly, with both bpf_redirect_map and bpf_redirect.

The eBPF TC program (step 3):

SEC("tc_redirect")
int cb_split(struct __sk_buff *sk_buf) {
  int iface = 5;
  return bpf_redirect(iface, BPF_F_INGRESS);
}

and the XDP program (step 4):

SEC("xdp_redirect")
int xdp_redirect_packet(struct xdp_md *ctx) {
    // or in case of the bpf_redirect
    // return bpf_redirect(5, 0);
    return bpf_redirect_map(&redirect_table, 0, XDP_PASS);
}
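
The definition of redirect_table is not shown in the issue; a minimal sketch of what it could look like, assuming a single-entry DEVMAP whose slot 0 holds the ifindex of the physical interface (5), using the pre-1.0 libbpf map definition style:

/* Sketch only -- not taken from the issue; exact form depends on the loader used. */
struct bpf_map_def SEC("maps") redirect_table = {
    .type        = BPF_MAP_TYPE_DEVMAP,   /* map of target ifindexes */
    .key_size    = sizeof(int),
    .value_size  = sizeof(int),
    .max_entries = 1,
};

Once the object is loaded, the entry can be filled from userspace, e.g. with bpftool (key 0 -> ifindex 5, both given as little-endian byte strings):

sudo bpftool map update name redirect_table key 0 0 0 0 value 5 0 0 0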

Distro: Ubuntu 20.04 LTS
Kernel: 5.4.0-45-generic
The drivers used for the physical interface:

driver: igb
version: 5.6.0-k
firmware-version: 1.63, 0x800009fb
expansion-rom-version: 
bus-info: 0000:0b:00.2
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

I already tried multiple things:

  • Sending from another machine to test the redirect behaviour of XDP and TC
    -> No drops, expected performance. So the issue seems to be related to traffic generated on the sending machine itself; maybe just a configuration error?
  • Using another interface to modify and redirect the packets (from veth0 -> veth1 (modify, redirect) -> physical)
    -> Again around 70% drops
  • Redirecting with eBPF TC instead of XDP
    -> The same ~70% drops

Tracing the packet drops with dropwatch showed the following (example) result:
"47040 drops at kfree_skb_list+1d (0xffffffffabf1e06d) [software]"

I'm running out of ideas about what to try next, and about whether this is my fault or some weird behaviour in XDP. I know that my use of XDP here is a little unusual, but I would still like to know why this behaviour appears.

@tohojo
Member

tohojo commented Mar 1, 2021

Sorry, that diagram is not enough to explain what you're doing. Could you please list the traffic flow including all interfaces involved, how you generate the traffic, and which hooks are running which BPF programs?

@JustusvonderBeek
Author

Sure, thanks for the fast response.

I'm using two interfaces, veth0 and eno5.
eno5 is a physical interface on a 1 Gigabit Intel I350 network card, and veth0 is a virtual interface.

For the flow:

(Diagram: Interfaces)

Step 1: Generating traffic with trafgen using the following command:

trafgen -i ./trafgen_1500 -o veth0 -b 1Gbit -P 1

where "trafgen_1500" contains the following:

{
    eth(da="destination MAC address of the machine I'm sending to", sa="source MAC address of eno5",type=0x8100)
    vlan(tci=2048,1q)
    ipv6(da="destination IP of the machine I'm sending to", sa="source IP of eno5")
    rnd(1442)
}

Because the traffic contains a VLAN tag, I disabled VLAN offloading with ethtool on the interface eno5.
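
A sketch of how that can be done (assuming the rx/tx VLAN offload features are the ones that were turned off):

sudo ethtool -K eno5 rxvlan off txvlan off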

Step 2: Listening for egress packets on the interface veth0 and redirecting the traffic towards the ingress direction of eno5, using eBPF TC on the virtual interface "veth0":

SEC("tc_redirect")
int redirect(struct __sk_buff *sk_buf) {
  int iface_eno5= 5;
  return bpf_redirect(iface_eno5, BPF_F_INGRESS);
}

attached by:

sudo tc qdisc add dev veth0 clsact
sudo tc filter add dev veth0 egress prio 1 handle 1 bpf da obj ./tc_redirect.o sec tc_redirect
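
The attachment can be checked with:

tc filter show dev veth0 egress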

Step 3: Modifying (left out here because the issue also appears without the modification) and redirecting the traffic on the physical interface "eno5" using generic XDP mode. The redirect stays on the same interface "eno5":

SEC("xdp_redirect")
int xdp_redirect_packet(struct xdp_md *ctx) {
    // or, using the non-map variant:
    // return bpf_redirect(5, 0);
    return bpf_redirect_map(&redirect_table, 0, XDP_PASS);
}
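
A sketch of how the program can be attached in generic (skb) mode with iproute2 (the object file name is an assumption):

sudo ip link set dev eno5 xdpgeneric obj xdp_redirect.o sec xdp_redirect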

I hope that clears up some of the questions.

@tohojo
Member

tohojo commented Mar 1, 2021 via email

@JustusvonderBeek
Author

> Yeah, it helps with understanding what you're doing. What's left is why would you do something like this? :)

I thought I could already write XDP code for when the XDP egress hook point becomes available. :)

I tested the redirects with counters and found that I receive all packets up to the interface eno5. After the second redirect from eno5 towards egress they get dropped.

> Something about CPU affinity, perhaps,

Is there a way to pin the execution to one specific CPU?

> or maybe the packet generator is not generating complete packets (checksum error?).

Regarding the packet generator part, this would mean the packets would be dropped by the kernel on the receiving machine, right? Because this is not the case.

@tohojo
Member

tohojo commented Mar 2, 2021 via email

@JustusvonderBeek
Author

JustusvonderBeek commented Mar 3, 2021

> It's possible to write BPF code that you can use on both the TC and XDP hooks; see this example: https://github.com/xdp-project/bpf-examples/tree/master/encap-forward

I'm not sure if I understand the example correctly, but you probably mean the "encap.h" file used in both the TC and XDP implementations, right? I guess I will give it a try then.
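
For reference, a minimal sketch of that pattern: a shared, inlined helper operating on raw data pointers, called from both a TC and an XDP program (all names here are illustrative and not taken from the linked example):

#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <linux/if_ether.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

/* Shared logic works on raw packet pointers, so both hooks can reuse it. */
static __always_inline int is_ipv6(void *data, void *data_end)
{
    struct ethhdr *eth = data;

    if ((void *)(eth + 1) > data_end)
        return -1;
    return eth->h_proto == bpf_htons(ETH_P_IPV6);
}

SEC("tc_shared")
int tc_prog(struct __sk_buff *skb)
{
    if (is_ipv6((void *)(long)skb->data, (void *)(long)skb->data_end) < 0)
        return TC_ACT_SHOT;
    return TC_ACT_OK;
}

SEC("xdp_shared")
int xdp_prog(struct xdp_md *ctx)
{
    if (is_ipv6((void *)(long)ctx->data, (void *)(long)ctx->data_end) < 0)
        return XDP_DROP;
    return XDP_PASS;
}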

> What's your application? If you're only targeting forwarded traffic (i.e., that goes through XDP_REDIRECT), there's already a hook in the devmap that is per map entry (which for redirected traffic semantically corresponds to a TX hook, just slightly earlier in the call chain).

Yes, the XDP program should handle forwarded traffic. But I don't understand how the hook in the devmap is supposed to work?
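
For reference, the per-entry devmap hook refers to XDP programs attached to individual DEVMAP entries, available in kernels newer than the 5.4 used here (roughly 5.8+); a rough sketch with illustrative names:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* DEVMAP whose value carries a target ifindex plus an optional program that
   runs on each packet redirected through that entry, just before transmission
   (BTF-style map definition, kernel 5.8+). */
struct {
    __uint(type, BPF_MAP_TYPE_DEVMAP);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, struct bpf_devmap_val);
} tx_port SEC(".maps");

/* The section name depends on the libbpf version ("xdp_devmap/" in older
   releases, "xdp/devmap" in libbpf 1.0+). */
SEC("xdp/devmap")
int xdp_devmap_prog(struct xdp_md *ctx)
{
    /* ctx->egress_ifindex identifies the device the packet will leave on;
       XDP_PASS lets it be transmitted, XDP_DROP discards it. */
    return XDP_PASS;
}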

> > I tested the redirects with counters and found that I receive all packets up to the interface eno5. After the second redirect from eno5 towards egress they get dropped.

> Right, figured that would be the most likely place. So apart from the CPU or checksum issues I already mentioned, another possible reason is simply that the hardware is overwhelmed. XDP_REDIRECT bypasses the qdisc layer, so there's no buffering if the hardware can't keep up. So if the traffic generator is bursty I wouldn't be surprised if it could overwhelm the hardware...

I also considered that the network device might not keep up with the speed, or that the copying takes too long. But I tested the same setup without limiting the traffic generator's throughput: this generates around 4 Gbit/s of 1500 B packets and results in around 1 Gbit/s of packets on eno5. So the speed can be achieved, but somewhere in my admittedly confusing setup I lose / drop around 70% of the packets.

> > or maybe the packet generator is not generating complete packets (checksum error?).

> > Regarding the packet generator part, this would mean the packets would be dropped by the kernel on the receiving machine, right? Because this is not the case.

> Not necessarily. There could be a check in the driver or hardware. Have you looked at the ethtool counters (ethtool -S)?

So on the receiving machine I do see all packets that are seen by the interface eno5. That includes the counters from ethtool -S.
On the sending machine I only count the correctly redirected packets when using ethtool -S.
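
For reference, a sketch of filtering the sending-side NIC counters for drop/error statistics:

ethtool -S eno5 | grep -iE 'drop|err|fifo'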

I also checked dropwatch again and now it is spitting out:

<num> drops at generic_xdp_tx+f1

So that seems to make sense, but the question now is: why? :)
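
A sketch of capturing call stacks for those drops with perf, via the skb:kfree_skb tracepoint (run while the traffic is flowing):

sudo perf record -e skb:kfree_skb -a -g -- sleep 10
sudo perf script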

@tohojo
Copy link
Member

tohojo commented Mar 3, 2021 via email
