[BUG] packets are not being delivered to the application using default SRT_LIVE with a broadcast bonded connection in localhost network #2871
Comments
A pcap would be helpful. Debug logs as well, but I'd limit them to selected FAs only to avoid overloading the application, so I'll try to determine which ones later. Collecting statistics could also be helpful to see what the packet drops look like (some packets might already be dropped by the sender). With TSBPD turned off, TLPKTDROP is ignored anyway, as is e.g. LATENCY. The log with "BEGIN ASYNC MODE" is reported by the connecting function to declare that it is using the non-blocking connection mode, that is, it returns immediately.
I have attached the pcap files: one where the test passes and one where the test fails (hangs).
Note that pcap files should have the "pcap" extension; this one was misinterpreted as a text file. Whatever, I got them. I can't tell anything from this. The "working" one contains 10 sent packets and is recorded from the handshake up to the shutdown. The transmission was so slow that an ACK was received after every packet. Not sure why. The "hanging" version looks exactly the same, except that there aren't any shutdown packets and it looks cut off after the 10th packet. If your test contains such a slow transmission, I think you can just as well turn on debug logs without filtering. I should be able to determine something from them for the "hanging" case.
The rate at which I am generating packets is quite low in this specific case, i.e. one every 10 ms. However, the hanging issue remains even if I change it to e.g. one every 1 ms.
That is because the test itself times out. If needed, I can provide a longer trace.
I have attached log files for both scenarios. These are wrapped in our own logging format but should contain the full log string from SRT as well.
From the description, the issue looks like the one fixed in #2766. But you state that you are testing SRT v1.5.3, which already contains the fix.
I would suggest logging the epoll events set by SRT. You can find them in the code as
I will see if I can collect the logs this week.
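For application-side logging of the events an epoll wait reports, a small decoder can make the log lines easier to read. The sketch below is only an illustration in Python (the SRT API itself is C); the flag values are the ones I recall from `srt.h` in the 1.5.x line and should be checked against your header, and the names `SRT_EPOLL_FLAGS` and `describe_events` are hypothetical helpers, not part of SRT.

```python
# Assumed flag values, as recalled from srt.h (verify against your SRT version).
SRT_EPOLL_FLAGS = {
    0x1: "IN",       # ready for receiving (also used for accept)
    0x4: "OUT",      # ready for sending (also used for connect)
    0x8: "ERR",      # the socket reported an error
    0x10: "UPDATE",  # group membership update
    1 << 31: "ET",   # edge-triggered subscription flag
}


def describe_events(mask: int) -> str:
    """Turn an epoll event bitmask into a readable string such as 'IN|OUT'."""
    names = []
    remaining = mask
    for bit, name in SRT_EPOLL_FLAGS.items():
        if mask & bit:
            names.append(name)
            remaining &= ~bit
    if remaining:
        # Keep unknown bits visible instead of silently dropping them.
        names.append(hex(remaining))
    return "|".join(names) if names else "NONE"


print(describe_events(0x1 | 0x4))
```

Logging this string together with the socket ID for every event returned by the wait call should show whether the receiver side ever becomes readable in the hanging runs.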
Describe the bug
I have a set of tests that covers integration of the SRT protocol, using the non-blocking API, in various scenarios with a larger program.
The test basically produces a fixed number of packets, sends them through the SRT integration, and checks whether the expected number is received on the other side over a localhost connection, i.e. internally on the same machine.
Here, I have observed that sometimes, when using a broadcast bonding connection with the default SRT_LIVE preset, not all packets get through to the application on the other side. This is not a deterministic bug but rather a sporadic one, which means that most of the time the test passes without issue. Furthermore, if I disable TLPKTDROP and TSBPD, the issue goes away.
In a similar test with only a single connection using the same scaffolding around the SRT API, this issue has not been observed even once.
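To illustrate why turning those two options off can change how many packets reach the application, here is a toy model in Python. It is not SRT's implementation: it assumes a shared clock between sender and receiver, uses the 120 ms default live latency, and reduces the too-late drop rule to "a packet that becomes available after its play time (send time plus latency) is discarded". The names `Packet` and `delivered_seqs` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    seq: int
    send_ms: float     # sender timestamp
    arrival_ms: float  # when the receiver finally has it (after any retransmission)


def delivered_seqs(packets: List[Packet], latency_ms: float = 120.0,
                   tsbpd: bool = True, tlpktdrop: bool = True) -> List[int]:
    """Return the sequence numbers handed to the application in this toy model."""
    # The too-late drop only has an effect when timestamp-based delivery is on.
    drop_late = tsbpd and tlpktdrop
    kept = []
    for p in packets:
        play_ms = p.send_ms + latency_ms
        if drop_late and p.arrival_ms > play_ms:
            continue  # arrived after its play time: dropped as too late
        kept.append(p.seq)
    return kept


packets = [Packet(0, 0.0, 1.0), Packet(1, 10.0, 200.0), Packet(2, 20.0, 21.0)]
print(delivered_seqs(packets))               # [0, 2]
print(delivered_seqs(packets, tsbpd=False))  # [0, 1, 2]
```

On localhost a packet should not normally arrive that late, so the model shows the mechanism that hides the problem when the options are disabled rather than the root cause.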
To Reproduce
The larger program is not publicly available, but all I have to do is rerun the same test until it fails, so if further logs are needed, please let me know.
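A rough sketch of that rerun-until-failure loop, in Python for brevity. The actual program is not shown here, so `recv_one` and `run_once` are hypothetical callbacks standing in for the real SRT receive path and for one full test run; the per-run deadline makes a hanging run count as a failure instead of blocking forever.

```python
import time
from typing import Callable, Optional


def count_received(recv_one: Callable[[], bool], expected: int,
                   timeout_s: float) -> int:
    """Poll recv_one() until `expected` packets arrived or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    received = 0
    while received < expected and time.monotonic() < deadline:
        if recv_one():
            received += 1
        else:
            time.sleep(0.001)  # nothing ready yet; avoid a busy loop
    return received


def first_failing_run(run_once: Callable[[], int], expected: int,
                      max_runs: int) -> Optional[int]:
    """Repeat the test; return the 1-based index of the first failing run, if any."""
    for attempt in range(1, max_runs + 1):
        if run_once() != expected:
            return attempt
    return None
```

Recording the attempt number and the received count of the failing run makes it easier to match the run against the pcap and the SRT logs.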
Expected behavior
I would expect this kind of test to always pass, as this is over the localhost.
Desktop (please provide the following information):
Additional context
From what I can see in the SRT-related logging, it seems that when the test fails, I do not observe as many "BEGIN ASYNC MODE" messages (e.g. 1) as in the cases where it passes (e.g. 10). So maybe I am using the API incorrectly somehow?
Thanks in advance.