Dropped packets when capturing from multiple interfaces #1220

solemnwarning · 2023-09-07T16:58:57Z

I'm having issues with capturing from multiple devices in the same process using libpcap on Linux.

Unfortunately I haven't been able to reduce this to a simple test case - I can only reproduce it as part of a big regression test suite for some other software, but as far as I've been able to figure out, when pcap (via the Net::Pcap Perl module) is opened on multiple devices concurrently, not all of the captures actually receive packets.

This was introduced in libpcap 1.5.0, specifically by the commit below, and is still present on master:

commit 8ada1d5b98ac62c4ae9acbecb0639beeebd8a359 (refs/bisect/bad)
Author: Gabor Tatarka <gabor.tatarka@ericsson.com>
Date:   Thu Oct 17 15:12:09 2013 +0200

    Added TPACKET_V3 support.

So I'm not really sure where the problem is - whether it's the TPACKET_V3 support in the kernel, the TPACKET_V3 support in libpcap or some other behaviour in the library that was changed by the same commit.

I'm hoping someone more familiar with pcap might know what's happening here. Happy to test any patches/theories.

infrastation · 2023-09-08T06:48:06Z

Since you mention TPACKET_V3, this must be Linux. Does the setup use the `any` pseudo-interface, or parallel independent captures, each on a separate interface? If it is the latter, does the software drain the buffers in a multi-threaded or a single-threaded fashion?

I also wonder if the immediate delivery mode and the buffer size are factors here.

solemnwarning · 2023-09-08T08:49:55Z

It's all single-threaded. It uses a separate capture for each interface, kicks off the processes that will generate the traffic, sleeps to let things settle, and then reads each capture's buffer with pcap_dispatch().

> On Linux, with previous releases of libpcap, capture devices are always in immediate mode; however, in 1.5.0 and later, they are, by default, not in immediate mode, so if pcap_set_immediate_mode() is available, it should be used.

That sounds like it could be relevant. It's weird that it doesn't seem to affect all the capture interfaces at once, but I'll do some testing with it tonight.

guyharris · 2023-09-08T09:07:39Z

> It's all single-threaded. It uses a separate capture for each interface,

Presumably that means it opens a separate pcap_t (or whatever Net::Pcap object wraps a pcap_t) for each interface.

> and then reads in each capture's buffer with pcap_dispatch().

Does this mean it does something such as

for (each capture device handle)
    pcap_dispatch(that handle);

i.e., that it proceeds sequentially through all the interfaces, processing them one at a time?

solemnwarning · 2023-09-08T09:43:07Z

@guyharris yes to both

guyharris · 2023-09-08T10:15:12Z

> yes to both

Libpcap doesn't guarantee that will work.

In particular, the pcap_dispatch() call could block for a long period of time if no packets arrive on that interface for a long period of time.

What you should do is, first, put all of the pcap_t handles into non-blocking mode.

Then, do something such as (C-style pseudo-code):

create an empty set of file descriptors.
for (each capture device handle) {
    get the result of `pcap_get_selectable_fd()` on that handle;
    if (that result is not -1)
        add it to the set of file descriptors;
}
for (;;) {
    set a `struct timeval` to a huge timeout;
    for (each capture device handle) {
        get the result of `pcap_get_required_select_timeout()` on that handle;
        if (it's not NULL) {
            if (it's less than the value in the aforementioned `struct timeval`)
                set that `struct timeval` to this value;
        }
    }

    do a `select()`/`poll()`/`epoll()`/etc. on the specified set of file descriptors, checking for readability, and using the `struct timeval`'s value as the timeout if it's not the very large amount of time you set it to;
    if (a timeout occurred)
        call `pcap_dispatch()` on all the capture device handles that returned a non-null required select timeout;
    else {
        for (all file descriptors that are readable)
            call `pcap_dispatch()` on the capture device handle with that descriptor as its selectable file descriptor;
    }
}

solemnwarning · 2023-09-09T18:59:50Z

@infrastation thanks for the pointer, it was the immediate delivery mode. I patched a call to pcap_set_immediate_mode() into Net::Pcap and all is well. Net::Pcap only exposes the pcap_open_live() API atm, so I'll take this up over there.

@guyharris what doesn't libpcap guarantee here? The capture is already in non-blocking mode so that isn't a problem, and the buffer is more than large enough to accommodate the whole capture.

guyharris · 2023-09-09T19:37:50Z

[UPDATED: fixed the last sentence to say pcap_setnonblock() rather than pcap_set_immediate_mode().]

> I patched a call to pcap_set_immediate_mode() into Net::Pcap and all is well.

There's a tradeoff between immediate and non-immediate mode. Immediate mode delivers packets immediately, so you get one wakeup per packet, so that's one system call per packet and, in systems where packet data is copied from the kernel to userland (Linux isn't such a system unless you have a really really old kernel and an old version of libpcap), that's one kernel-to-user copy per packet. Without immediate mode, packets can be delivered in batches, with one wakeup and one system call (and, without memory-mapped capture, one copy) per batch, which is more efficient.

This means that running in immediate mode could increase the chances of packet drops if you're getting high traffic.

> The capture is already in non-blocking mode

In other words, you've called, for each of the capture handles, whatever Net::Pcap call results in calling pcap_setnonblock() on that handle?

solemnwarning mentioned this issue Sep 7, 2023

Test suite doesn't work with recent libpcap versions solemnwarning/ipxwrapper#2

Open

infrastation added the generic support label Sep 8, 2023

solemnwarning mentioned this issue Sep 9, 2023

Support for pcap_set_buffer_size ? maddingue/Net-Pcap#1

Open
