Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logic refax: new rules for the receiver buffer and TSBPD #2527

Draft
wants to merge 35 commits into
base: master
Choose a base branch
from

Conversation

ethouris
Copy link
Collaborator

@ethouris ethouris commented Nov 10, 2022

DRAFT BECAUSE: the concept is going to be changed to use Offset instead of Position for End and Drop, but the quick change resulted in heavy bugs. The change was necessary due to reported performance problems.

This changes largely the logics of the buffer and TSBPD. This change was necessary for supporting the group common buffer reception, but it should as well improve the TSBPD performance and efficiency.

The previously working receiver buffer and cooperating with it the TSBPD thread were following the previous design of the per-ACK "eclipsing" packets and until then those packets were not available, even if their retrieval would be theoretically possible. This design changes the TSBPD to work directly on the buffer, therefore there have been consolidated less events to trigger the TSBPD wakeup and recheck. There are actually the following situations:

  1. TSBPD sleeps forever because the buffer is empty: wake it up when a new packet has been inserted into the buffer.
  2. TSBPD sleeps forever because it has declared packet at [0] cell as ready to play and expects that the API call retrieve it: the API call should wake it up after the packet has been retrieved (this is the old way and remains the same here).
  3. TSBPD sleeps up to the time of delivery of a packet at [0] cell. There should be no reason to wake it up whatsoever.
  4. TSBPD sleeps up to the time of delivery of a jump-over packet. In this situation, only an incoming packet that has an earlier delivery time than the one currently earliest should make TSBPD woken up.

Note that in the case of forever sleeping, there's no way to distinguish the reason of "still ready to read" and "nothing to wait up to" situations - might be that this can be improved, so - for example - if there is at least one packet "in the past" (not necessarily at cell [0]) then TSBPD should NOT be woken up in the event of incoming packet, regardless of its position.

Anyway, the change for triggering TSBPD is that:

  • it has been removed from ACK handler. In the file mode ACK is still in use for signing off packets as read-ready in the currently available number in the stream mode and when a full retrievable message can be identified in the message mode, and in TSBPD nothing happens at all
  • it has been removed from the case of loss detection and dropping on demand from the peer. Packets that have been received will be delivered anyway, so dropping doesn't change anything in the packet readiness. Note that this does change things in the message mode (a scrapped message that becomes in the past towards the drop request sequence does change the earliest ready-to-deliver message), but the message mode also doesn't use TSBPD
  • there has been added triggering of TSBPD after a packet has been added to the buffer, and under specific conditions: if there has been an "earlier time" delivered, or when the "forever sleeping" flag is set.

Beside the taking out of the TSBPD triggering in the ACK handler, there's also one more change added: the ACK number to be sent back to the sender is extracted exclusively from the receiver buffer, and no longer from the loss list and internal fields. This is because the receiver buffer was added a functionality to be able to check this thing quickly. This is also a more reliable source of information and not prone to any race conditions.

Changes in the receiver buffer:

Introduced are terms of "initial contiguous region" and "first gap". The first term existed in the old receiver buffer and it's the series of valid packets starting with the first cell up to the "first gap", that is, the series starting from at least one empty cell followed by a valid packet. None of them exist in an empty buffer. The "first gap" also may not exist if all packets from the first cell up to the one pointed by m_iMaxPosOff are valid. If the first gap starts with the first cell, this is also an "initial gap".

Two new flags were added: m_iEndPos and m_iDropPos:

  • m_iEndPos: marks the past-the-end of the initial contiguous region. If this region is empty, m_iEndPos is equal to m_iStartPos. This field always points to an empty cell.
  • m_iDropPos: marks the position of the first valid packet following the first gap. This position is always earlier than the one pointed by m_iMaxPosOff, but later than m_iEndPos. If no such packet exists, this position is equal to m_iEndPos.

These positions are being updated in case of various events that may influence their positions, so for example:

  1. When you want to extract a packet from the buffer, you can only retrieve one packet at position 0 at a time. This shifts the initial position by 1, and may potentially reach up to m_iEndPos, but this may only happen if the buffer becomes empty. Otherwise m_iEndPos and m_iDropPos are far in forward, so nothing to change here.
  2. When a new packet comes in from the network, and it was identified to be inserted, possibilities are:
    • position equal to m_iEndPos with m_iDropPos == m_iEndPos. Both are shifted by 1.
    • position equal to m_iEndPos with existing "first gap". If the next position towards m_iEndPos is busy, ride with m_iEndPos to find a new end of initial contiguous region, and then ride with m_iDropPos, from this position on, to find the after-gap position, both up to m_iMaxPosOff.
    • anywhere after m_iEndPos and after m_iDropPos: nothing to be updated
    • anywhere after m_iEndPos, but before m_iDropPos: set m_iDropPos back to m_iEndPos and ride up to m_iMaxPosOff to find the new after-gap position. NOTIFY TIME of the delivery for the new drop position.
  3. When a dropping is requested to remove the initial gap, then the buffer gets shifted to the position of m_iDropPos, then both m_iStartPos and m_iEndPos are set to this value, and then the procedure of finding the end of the contiguous region and the prospective end of first gap starts anew (updateGapInfo).
  4. The search for a new gap is not done when the insertion is done on m_iEndPos, and the end of buffer (from m_iMaxPosOff) is equal to it. In this case only the m_iEndPos and m_iDropPos are shifted by 1.

Note that there are many mutually-exclusive conditions here, for example:

  1. Duplicated packets are discarded in advance, and therefore no flags get modified. This means, for example, that there will never be an event of inserting a packet into the initial contiguous region or at the drop position. The m_iEndPos is the earliest position where insertion may happen.
  2. The m_iDropPos never precedes m_iEndPos, and it's enough to check if they are not equal to be sure that m_iDropPos points to a valid packet.
  3. There is no situation of retrieving multiple packets at once, so after retrieval of a single packet, these flags may at worst get equal to m_iStartPos, which means that the buffer is empty.
  4. Dropping means to discard all cells from the first one up to m_iDropPos, and if this happens, it was pointing to a valid packet. After that m_iStartPos is set to that position and searching for candidates to set to m_iEndPos begins with the next packet.

There are therefore changed the procedures of:

  1. Insertion. This delivers more information so that the conditions for triggering TSBPD can be properly determined.
  2. Dropping. There are many dropping procedures, but only one - this for TSBPD thread - simply shifts to the m_iDropPos. Dropping the whole message doesn't rely on it as in the message mode dropping is only for the whole message and results either in an empty buffer or resets the initial contiguous region and hence the other flags.

@codecov-commenter
Copy link

codecov-commenter commented Nov 10, 2022

Codecov Report

Attention: Patch coverage is 76.25571% with 52 lines in your changes are missing coverage. Please review.

Project coverage is 64.89%. Comparing base (12aad6e) to head (df237d6).

Files Patch % Lines
srtcore/buffer_rcv.cpp 76.53% 42 Missing ⚠️
srtcore/core.cpp 81.25% 6 Missing ⚠️
srtcore/buffer_rcv.h 33.33% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2527      +/-   ##
==========================================
+ Coverage   64.72%   64.89%   +0.16%     
==========================================
  Files         101      101              
  Lines       17543    17673     +130     
==========================================
+ Hits        11355    11469     +114     
- Misses       6188     6204      +16     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ethouris ethouris added Type: Maintenance Work required to maintain or clean up the code [core] Area: Changes in SRT library core labels Nov 14, 2022
srtcore/buffer_rcv.cpp Fixed Show fixed Hide fixed
srtcore/buffer_rcv.cpp Fixed Show resolved Hide resolved
srtcore/core.cpp Fixed Show fixed Hide fixed
srtcore/core.cpp Fixed Show resolved Hide resolved
srtcore/buffer_rcv.cpp Fixed Show fixed Hide fixed
srtcore/buffer_rcv.cpp Fixed Show fixed Hide fixed
srtcore/buffer_rcv.cpp Outdated Show resolved Hide resolved
@ethouris ethouris marked this pull request as ready for review February 15, 2024 13:10
@maxsharabayko maxsharabayko added this to the v1.5.4 milestone Feb 15, 2024
srtcore/buffer_rcv.h Show resolved Hide resolved
srtcore/buffer_rcv.h Outdated Show resolved Hide resolved
srtcore/core.h Outdated Show resolved Hide resolved
srtcore/buffer_rcv.cpp Outdated Show resolved Hide resolved
srtcore/buffer_rcv.cpp Outdated Show resolved Hide resolved
srtcore/buffer_rcv.cpp Outdated Show resolved Hide resolved
srtcore/buffer_rcv.h Outdated Show resolved Hide resolved
srtcore/buffer_rcv.h Outdated Show resolved Hide resolved
srtcore/buffer_rcv.h Outdated Show resolved Hide resolved
srtcore/buffer_rcv.h Outdated Show resolved Hide resolved
srtcore/buffer_rcv.h Show resolved Hide resolved
srtcore/core.cpp Outdated Show resolved Hide resolved
srtcore/core.cpp Outdated Show resolved Hide resolved
srtcore/core.cpp Outdated Show resolved Hide resolved
srtcore/core.cpp Outdated Show resolved Hide resolved
srtcore/buffer_rcv.cpp Fixed Show resolved Hide resolved
ethouris and others added 3 commits February 28, 2024 11:03
(further impossible fixes pending)

Co-authored-by: Maxim Sharabayko <maxlovic@gmail.com>
Comment on lines +10686 to +10687
// XXX unsure as to whether this should change anything in the TSBPD conditions.
// Might be that this trigger is not necessary.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This signal is indeed no longer necessary because the TSBPD thread now gets notified when an earlier packet has been received.

@@ -5529,6 +5544,8 @@ void * srt::CUDT::tsbpd(void* param)
tsNextDelivery = steady_clock::time_point(); // Ready to read, nothing to wait for.
}

SRT_ATR_UNUSED bool bWakeupOnSignal = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to bWokeUpOnSignal to describe an event, not the intent.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was intended to be a substantive describing the performed activity: "a wakeup". But I agree this sounds ambiguous.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SRT_ATR_UNUSED bool bWakeupOnSignal = true;
SRT_ATR_UNUSED bool bWokenupOnSignal = true;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use either bWokeUpOnSignal or bWokenUpOnSignal.

// API reader thread sleeping until there is a "bigger portion"
// of data to read. In TSBPD mode this isn't done because every
// packet has its individual delivery time and its readiness is signed
// off by the TSBPD thread.
HLOGC(xtlog.Debug,
log << CONID() << "ACK: clip %" << m_iRcvLastAck << "-%" << ack << ", REVOKED "
<< CSeqNo::seqoff(ack, m_iRcvLastAck) << " from RCV buffer");

if (m_bTsbPd)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just remove this if case. Keep the else case.

Comment on lines +10651 to +10658
if (m_bTsbPd && ((m_bWakeOnRecv && new_inserted) || next_tsbpd_avail != time_point()))
{
HLOGC(qrlog.Debug, log << "processData: will SIGNAL TSBPD for socket. WakeOnRecv=" << m_bWakeOnRecv
<< " new_inserted=" << new_inserted << " next_tsbpd_avail=" << FormatTime(next_tsbpd_avail));
CUniqueSync tsbpd_cc(m_RecvLock, m_RcvTsbPdCond);
tsbpd_cc.notify_all();
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be that m_bWakeOnRecv still has to be protected by m_RecvLock to ensure the TSBPD thread has set it appropriately before it is checked here. Making it atomic does not resolve this transition state.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

Suggested change
if (m_bTsbPd && ((m_bWakeOnRecv && new_inserted) || next_tsbpd_avail != time_point()))
{
HLOGC(qrlog.Debug, log << "processData: will SIGNAL TSBPD for socket. WakeOnRecv=" << m_bWakeOnRecv
<< " new_inserted=" << new_inserted << " next_tsbpd_avail=" << FormatTime(next_tsbpd_avail));
CUniqueSync tsbpd_cc(m_RecvLock, m_RcvTsbPdCond);
tsbpd_cc.notify_all();
}
if (m_bTsbPd)
{
CUniqueSync tsbpd_cc(m_RecvLock, m_RcvTsbPdCond);
if ((m_bWakeOnRecv && new_inserted) || next_tsbpd_avail != time_point())
{
HLOGC(qrlog.Debug, log << "processData: will SIGNAL TSBPD for socket. WakeOnRecv=" << m_bWakeOnRecv
<< " new_inserted=" << new_inserted << " next_tsbpd_avail=" << FormatTime(next_tsbpd_avail));
tsbpd_cc.notify_all();
}
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Comment on lines +1140 to +1143
// This function is to return the packet's play time (time when
// it is submitted to the reading application) of the given packet.
// This grp passed here by void* because in the current imp it's
// unused and shall not be used in case when ENABLE_BONDING=0.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// This function is to return the packet's play time (time when
// it is submitted to the reading application) of the given packet.
// This grp passed here by void* because in the current imp it's
// unused and shall not be used in case when ENABLE_BONDING=0.
/// This function is to return the packet's play time (time when
/// it is submitted to the reading application) of the given packet.
/// This grp passed here by void* because in the current imp it's
/// unused and shall not be used in case when ENABLE_BONDING=0.

THREAD_PAUSED();
tsbpd_cc.wait_until(tsNextDelivery);
bWakeupOnSignal = tsbpd_cc.wait_until(tsNextDelivery);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bWakeupOnSignal = tsbpd_cc.wait_until(tsNextDelivery);
bWokenupOnSignal = tsbpd_cc.wait_until(tsNextDelivery);

THREAD_PAUSED();
tsbpd_cc.wait();
THREAD_RESUMED();
}

HLOGC(tslog.Debug, log << self->CONID() << "tsbpd: WAKE UP!!!");
HLOGC(tslog.Debug, log << self->CONID() << "tsbpd: WAKE UP [" << (bWakeupOnSignal? "signal" : "timeout") << "]!!! - "
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
HLOGC(tslog.Debug, log << self->CONID() << "tsbpd: WAKE UP [" << (bWakeupOnSignal? "signal" : "timeout") << "]!!! - "
HLOGC(tslog.Debug, log << self->CONID() << "tsbpd: WAKE UP [" << (bWokenupOnSignal? "signal" : "timeout") << "]!!! - "

@ethouris ethouris marked this pull request as draft April 24, 2024 13:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[core] Area: Changes in SRT library core Type: Maintenance Work required to maintain or clean up the code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants