Some TCP keepalives corrupt the extracted data streams #253

kjgrahn · 2023-06-18T06:42:18Z

This bug report is unfortunately vague, but it might interest you anyway since it's a pure TCP segment assembly problem, having nothing to do with HTTP et cetera.

At work we asked for and got pcap files from a customer. My plan was to use tcpflow to extract the TCP streams for further processing, but when I did that, I discovered corruption of single octets here and there.

The TCP connection used SO_KEEPALIVE heavily, with maybe 5 seconds between probes. I also have reason to believe one peer was running some BSD derivative (because the customer calls these machines "SOMETHING-BSD").

What I think happened was:

1. The peer used the old-fashioned TCP keepalive mechanism mentioned in Stevens' books[1], where the last acked octet is retransmitted as a 1-octet segment.
2. The peer chooses to send a random octet, since it's already acked and forgotten.
3. tcpflow chooses this latest (random) octet instead of the first (acked) one, and thus doesn't record the same stream as an application would see. For every segment followed by a keepalive probe, the last octet is mangled.
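
To make the suspected failure mode concrete, here is a minimal sketch of the two ways a reassembler can resolve overlapping octets. It is an illustration only, not tcpflow's code: `reassemble`, the `policy` argument, and the sequence numbers are all made up for this example, and sequence-number wraparound is ignored.

```python
def reassemble(segments, isn, policy="first"):
    """Rebuild a byte stream from (seq, payload) pairs (hypothetical helper).

    policy="first": an octet that is already stored is never overwritten,
                    which matches what the receiving application sees.
    policy="last":  a later segment overwrites earlier data at the same
                    sequence number.
    """
    stream = {}
    for seq, payload in segments:
        for i, octet in enumerate(payload):
            offset = seq - isn + i  # wraparound ignored in this sketch
            if policy == "first" and offset in stream:
                continue
            stream[offset] = octet
    return bytes(stream[k] for k in sorted(stream))


# Two data segments, each followed by a keepalive probe that resends
# the last acked sequence number with a garbage octet.
segments = [
    (1000, b"hello"),  # data, seq 1000..1004
    (1004, b"\x00"),   # probe: seq 1004 again, garbage payload
    (1005, b"world"),  # data, seq 1005..1009
    (1009, b"\xff"),   # probe: seq 1009 again, garbage payload
]

first = reassemble(segments, 1000, "first")
last = reassemble(segments, 1000, "last")
print(first)  # b'helloworld'
print(last)   # b'hell\x00worl\xff'
```

With the "last" policy, the final octet of every segment that is followed by a probe is replaced by the garbage octet, which is the pattern of corruption described above.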

I understand you'd like an example pcap file, but I cannot distribute the data. I spent some time at home trying to reproduce this with OpenBSD, but it seems not to have this variant of keepalives. Linux of course doesn't. I suppose the customer used either NetBSD or FreeBSD, possibly an ancient release.

Another sad fact is that I used an ancient tcpflow: the one in RHEL7, so I guess it would have been 1.4.5. I see keepalive support was added before that, in tcpflow-1.4.0beta1-129-g9915ef4. I could have used tcpflow from Ubuntu 22 and maybe I did, but I cannot easily find out now (this all happened in April).

[1] Quoting Stevens (TCP/IP Illustrated vol 1, p 335):

Some older implementations based on 4.2BSD do not respond to
these keepalive probes unless the segment contains data. Some
systems can be configured to send one garbage byte of data in the
probe to elicit a response. The garbage byte causes no harm,
because it's not the expected byte (it's a byte that the receiver
has previously received and acknowledged) so it's thrown away by
the receiver. Other systems [...]
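
The probes described in the quote can be recognised from the capture alone. Below is a rough heuristic sketch, assuming the reassembler tracks the next expected sequence number per direction; `is_keepalive_probe` and `rcv_nxt` are hypothetical names for this example, not part of tcpflow, and TCP flags (SYN/FIN/RST) are not examined.

```python
SEQ_MASK = 0xFFFFFFFF  # TCP sequence numbers are 32 bits wide


def is_keepalive_probe(seq, payload_len, rcv_nxt):
    """Heuristic: a keepalive probe starts one octet below the next
    expected sequence number and carries zero or one octet of payload."""
    one_below_expected = ((rcv_nxt - seq) & SEQ_MASK) == 1
    return one_below_expected and payload_len in (0, 1)


print(is_keepalive_probe(1004, 1, 1005))    # garbage-byte probe
print(is_keepalive_probe(1004, 0, 1005))    # empty probe
print(is_keepalive_probe(1005, 5, 1005))    # ordinary in-order data
print(is_keepalive_probe(0xFFFFFFFF, 1, 0)) # probe across wraparound
```

A segment that matches could then be skipped when writing the stream, so its payload never replaces the octet that was originally delivered.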

simsong · 2023-07-02T01:14:47Z

Hi. I understand that you cannot post the pcap from your confidential data. However, perhaps you can use the system in question to create a PCAP file that exhibits the problem? If you cannot use the system in question, perhaps you could spin up a RHEL7 system somewhere? I simply cannot debug this without a pcap file that demonstrates the problem.

Thanks.

kjgrahn · 2023-07-03T07:45:56Z

I'll see what I can do, but it won't be easy. It's not RHEL7 that I need mainly, but a system with that 4.2BSD quirk in its TCP implementation and I don't know which ones have it ...
I was hoping you'd immediately see the bug (remember how you reasoned about overlapping segments) but I wouldn't start changing that without test data, either ...

kjgrahn · 2023-07-03T07:47:29Z

Sorry, I'm not familiar with this issue tracker, and didn't intend to close the issue. Reopen.

simsong · 2023-07-03T10:57:20Z

> I'll see what I can do, but it won't be easy. It's not RHEL7 that I need mainly, but a system with that 4.2BSD quirk in its TCP implementation and I don't know which ones have it ... I was hoping you'd immediately see the bug (remember how you reasoned about overlapping segments) but I wouldn't start changing that without test data, either ...

By policy, I won't make changes without having test data so that the bug and then the fix can both be validated.

simsong added the Needs Test Data label Jul 2, 2023

simsong assigned kjgrahn Jul 2, 2023

kjgrahn closed this as completed Jul 3, 2023

simsong reopened this Jul 3, 2023
