Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing marker bit when switching from generated ringback to actual media #2361

Open
olofssonanton opened this issue Jan 17, 2024 · 9 comments · May be fixed by #2370
Open

Missing marker bit when switching from generated ringback to actual media #2361

olofssonanton opened this issue Jan 17, 2024 · 9 comments · May be fixed by #2370
Labels
bug Something isn't working

Comments

@olofssonanton
Copy link
Contributor

olofssonanton commented Jan 17, 2024

Describe the bug
When using ringback and ignore_early_media, there is a brief gap in packets sent to the A-part after receiving 200 OK from B. In order to prevent this gap from causing an unnecessary slow-down in playback for A (due to jitter buffer), there should be a marker bit inserted when the media sent to A switches from generated ringback to actual media from B.

See below image for a screenshot with comments from that exact moment in a pcap.

image

To Reproduce
Steps to reproduce the behavior:

  1. Using ringback and ignore_early_media, like ringback=%(1000,4000,425.0) and ignore_early_media=true
  2. Make a call using Verto
  3. Using Sofia, bridge it to an endpoint which will send early media
  4. A brief gap in packets received can be observed for A. This can be detected by tracing, or looking at packetsReceived/s in chrome://webrtc-internals under "Stats graphs for inbound-rtp".

Expected behavior
Either no gap in packets being sent, or a market bit in the first packet which is sent after the gap.

Package version or git hash

  • Version 1.10.10 (Edit: Also on 1.10.11)

Trace logs
pcap file of the related media streams has been sent to Brian West.

@olofssonanton olofssonanton added the bug Something isn't working label Jan 17, 2024
@olofssonanton
Copy link
Contributor Author

All of this is 100% just a matter of a missing marker bit when A's ringback stream is switched for the actual media stream from B - it's super apparent when adding instant_ringback into the mix.

The call starts bridging, and generated ringback immediately starts being sent to A with a timestamp starting from 0.
After td amount of time, the B side then responds with early media, also with timestamps starting from 0.
These two streams then have a more or less constant timestamp delta of td between them.
When B answers, FreeSWITCH discards the ringback stream, which was currently at timestamp tA, and A starts receiving media from B, currently at timestamp tB.
These new packets, n (starting from 0), suddenly have a timestamp of tA - td + (n * samples_per_interval), i.e. they are td amount of time old from A's perspective.
Since A has no idea that this is a new stream (as no SSRC has changed either), all packets from time tA until tA + td, when tB + td ≥ tA, will just look like duplicates.

In summary, a marker bit must be inserted when switching streams. Otherwise the receiver has no idea what just happened.

@olofssonanton
Copy link
Contributor Author

Further investigations indicate that Chrome (version 120) does not flush the jitter buffer and reset the stream upon receiving a marker bit, and thus a change of SSRC seems to be necessary.

The associated comment to CHANGE_SSRC_ON_MARKER suggests that FreeSWITCH will change the SSRC by default in this scenario, but we can see here that CHANGE_SSRC_ON_MARKER needs to be enabled in order to produce this behaviour.

In the same condition as mentioned above, there's also a check for the flag SWITCH_RTP_FLAG_SECURE_SEND, which is what ultimately prevents the change in SSRC in this case when using Verto (WebRTC/SRTP). This prevention was added way way back, but I find it peculiar that the condition was inverted between these two commits:
f4715781dc60293f89027ddf5ae4cc41badb5979
2fac3a8e83be837ff26d1010feb2be50119a10e8
The latter of the two mentions a refactor of video passthrough, so I'm wondering if this could be an oversight.

Anyhow, I made the necessary changes to run some tests and it all worked exactly as I had hoped it would - changing the SSRC fixes the issue caused by the apparent jump backwards in timestamps. I will get a PR up for this shortly.

For posterity; the gap in packets mentioned at the start of this issue becomes a non-issue by changing the SSRC.

olofssonanton added a commit to briteback/freeswitch that referenced this issue Jan 23, 2024
…allow SSRC change when using SRTP

Fixes signalwire#2361, where a jump backwards in RTP timestamps caused silence at the start of calls when switching from generated ringback to remote media.
@jakubkarolczyk
Copy link
Collaborator

Can you apply the below patch to the latest master and show me JDK... log lines?

diff --git a/src/switch_rtp.c b/src/switch_rtp.c
index 1125e2f59b..6ded1276ba 100644
--- a/src/switch_rtp.c
+++ b/src/switch_rtp.c
@@ -1561,6 +1561,14 @@ static uint8_t get_next_write_ts(switch_rtp_t *rtp_session, uint32_t timestamp)
                /* Send marker bit if timestamp is lower/same as before (resetted/new timer) */
                if (abs((int32_t)(rtp_session->ts - rtp_session->last_write_ts)) > rtp_session->samples_per_interval
                        && !(rtp_session->rtp_bugs & RTP_BUG_NEVER_SEND_MARKER)) {
+
+                       switch_log_printf(SWITCH_CHANNEL_SESSION_LOG(rtp_session->session), SWITCH_LOG_INFO,
+                             "JDK Increment m=[%d] rtp_session:[%p] rtp_session->ssrc:[%d] rtp_session->ts:[%d] "
+                             "rtp_session->last_write_ts:[%d] rtp_session->samples_per_interval:[%d] "
+                             "abs((int32_t)(rtp_session->ts - rtp_session->last_write_ts)) = [%d]\n",
+                             m, (void *)rtp_session, rtp_session->ssrc, rtp_session->ts, rtp_session->last_write_ts,
+                             rtp_session->samples_per_interval,
+                             abs((int32_t)(rtp_session->ts - rtp_session->last_write_ts)));
                        m++;
                }
        }

@olofssonanton
Copy link
Contributor Author

@jakubkarolczyk Sorry for the delay, but here we go!

I got one such log, at the moment when ringback starts being generated for the caller:
2024-02-27 14:02:20.610384 91.43% [INFO] switch_rtp.c:1565 JDK Increment m=[0] rtp_session:[0x7fec74184918] rtp_session->ssrc:[-837666077] rtp_session->ts:[13440] rtp_session->last_write_ts:[0] rtp_session->samples_per_interval:[960] abs((int32_t)(rtp_session->ts - rtp_session->last_write_ts)) = [13440]

@jakubkarolczyk
Copy link
Collaborator

Thanks @olofssonanton . Interesting, in my tests I always hit it second time when the actual audio starts after the ringback (so first packet after the gap you described). So I don't see the marker bit problem, what you reported.
Can you show me your dialplan, if possible?

@olofssonanton
Copy link
Contributor Author

olofssonanton commented Feb 27, 2024

Interesting indeed! Can you share some lines of debug log context around your second hit @jakubkarolczyk?

@jakubkarolczyk
Copy link
Collaborator

Do you have rtp_rewrite_timestamps chan var set by the chance?

@olofssonanton
Copy link
Contributor Author

@jakubkarolczyk rtp_rewrite_timestamps is not used

@LegendaryTJC
Copy link

LegendaryTJC commented May 22, 2024

The biggest issue I see with this is that the timestamp is going to a lower value (inherited from the other leg) but the SSRC doesn't change. RFC 3550 says that the timestamp MUST be monotonically increasing. Genesys PBXs drop those because they think the packets are too long ago to be relevant, although most applications don't seem so strict.

We can't rely on the marker bit here because RTP is sent over UDP and we can't guarantee that a single packet will reach the end. Change the SSRC, which even feels correct because the source of the media has indeed changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
3 participants