Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BWE's frozen on the same value periodically #115

Open
skmax opened this issue May 15, 2022 · 7 comments
Open

BWE's frozen on the same value periodically #115

skmax opened this issue May 15, 2022 · 7 comments

Comments

@skmax
Copy link

skmax commented May 15, 2022

Hi @murillo128,
During the testing of the BWE mechanism, I found two issues that lead to the case when the BWE state is always increasing and BWE target's kept on the same value. It leads to the situation when probing is not sent even though there's more available bitrate.

1. Probing is never sent if RTX history doesn't have appropriate packets.

On low bitrates the DTLSICETransport doesn't send any probing because each packet from RTX history for probing is much bigger than probing size that needs to be sent in each iteration. See https://github.com/medooze/media-server/blob/master/src/DTLSICETransport.cpp#L2613

Suggested solution: if any suitable packets can't be found in the history, generated probing should be sent

2. The calculation of probing which needs to be send looks like inconsistent

In DTLSICETransport we calculate the probing in the following way:

//Get bitrates
DWORD bitrate		= static_cast<DWORD>(outgoingBitrate.GetInstantAvg()*8);
DWORD probing		= static_cast<DWORD>(probingBitrate.GetInstantAvg()*8);
DWORD target		= senderSideBandwidthEstimator.GetAvailableBitrate();
//...
//Calculate how much bitrate should we sent
probing = target - (bitrate - probing);

Calculation of target and bitrate is based on the same concept of outgoing bitrate, but the outgoing bitrate value is getting from different accumulators. E.g. target bitrate calculation is based on outgoingBitrateAccumulator in senderSideBandwidthEstimator, but probing calculation is based on outgoingBitrateAccumulator from DTLSICETransport.
It looks like it doesn't make a lot of sense to have two accumulators here. Moreover, sometimes values from those both accumulator getting completly out of sync which leads to wrong probing calculation.

In the following logs, you can see that outgoingBitrateAccumulator from senderSideBandwidthEstimator has instant value 877728bps, but outgoingBitrateAccumulator from DTLSICETransport has instant value 294880bps. The value in DTLSICETransport is being always significantly higher during the connection. Due to this fact the estimator is stuck on increasing state, but no probing value is being sent.

[0x70000c206000][1652558433.498][LOG]-SendSideBandwidthEstimation::EstimateBandwidthRate() [this:0x10a2316f0,estimate:1774492bps,target:1774492bps,available:1774492bps,sent:877728bps,recv:875488bps,rtx=0bps,state:1,delta=0,media:1977670,rtx:0,overhead:0.00,rttEstimated:1,rttMin:1
[0x70000c206000][1652558433.498][LOG]>DTLSICETransport::Probe() | [target:282906,bitrate:294880,probing:18240,probingResult:6266,probingSize:22,history:10,probingBitrateLimit=999999999,maxProbingBitrate=9999999,probingBitrateInstant=570,now=-1003975462]
[0x70000c206000][1652558433.501][LOG]-SendSideBandwidthEstimation::EstimateBandwidthRate() [this:0x10a3416f0,estimate:282906bps,target:282906bps,available:282906bps,sent:173824bps,recv:173824bps,rtx=0bps,state:1,delta=0,media:328339,rtx:0,overhead:0.00,rttEstimated:1,rttMin:1
[0x70000c206000][1652558433.508][LOG]>DTLSICETransport::Probe() | [target:282906,bitrate:294880,probing:18240,probingResult:6266,probingSize:22,history:10,probingBitrateLimit=999999999,maxProbingBitrate=9999999,probingBitrateInstant=570,now=-1003975452]
[0x70000c206000][1652558433.518][LOG]>DTLSICETransport::Probe() | [target:282906,bitrate:296416,probing:18240,probingResult:4730,probingSize:22,history:10,probingBitrateLimit=999999999,maxProbingBitrate=9999999,probingBitrateInstant=570,now=-1003975442]
[0x70000c206000][1652558433.528][LOG]>DTLSICETransport::Probe() | [target:282906,bitrate:296448,probing:18240,probingResult:4698,probingSize:22,history:10,probingBitrateLimit=999999999,maxProbingBitrate=9999999,probingBitrateInstant=570,now=-1003975432]
[0x70000c206000][1652558433.538][LOG]>DTLSICETransport::Probe() | [target:282906,bitrate:271552,probing:9120,probingResult:20474,probingSize:11,history:10,probingBitrateLimit=999999999,maxProbingBitrate=9999999,probingBitrateInstant=285,now=-1003975422]
[0x70000c206000][1652558433.538][LOG]-DTLSICETransport::Probe() Inner | Sending probing packets [target:282906,bitrate:271552,probing:20474,max:9999999,probingSize:25,sleep:10]
[0x70000c206000][1652558433.548][LOG]-SendSideBandwidthEstimation::EstimateBandwidthRate() [this:0x10a2316f0,estimate:1774492bps,target:1774492bps,available:1774492bps,sent:835040bps,recv:832800bps,rtx=0bps,state:1,delta=0,media:1982793,rtx:0,overhead:0.00,rttEstimated:1,rttMin:1
[0x70000c206000][1652558433.548][LOG]>DTLSICETransport::Probe() | [target:282906,bitrate:280672,probing:18240,probingResult:20474,probingSize:22,history:10,probingBitrateLimit=999999999,maxProbingBitrate=9999999,probingBitrateInstant=570,now=-1003975412]
[0x70000c206000][1652558433.548][LOG]-DTLSICETransport::Probe() Inner | Sending probing packets [target:282906,bitrate:280672,probing:20474,max:9999999,probingSize:25,sleep:10]

Suggested solution: use outgoingBitrateAccumulator from senderSideBandwidthEstimator in DTLSICETransport::Probe in order to be consistent in calculations

Please, check the PR with suggested solutions and let me know what you think.
Thanks

@murillo128
Copy link
Member

Hi @skmax, thank you for the issue report and the PR!

Probing is never sent if RTX history doesn't have appropriate packets.

If you check the code, the size for the probe is calculated based on the period of the last probe was sent. So if the packets on the queue are too big, the size will be accumulated for the next time the probe timer is fired and eventually they will be sent.

I would prefer to avoid sending padding only rtx packets for probing if there are any on the history.

The calculation of probing which needs to be send looks like inconsistent

The outgoing bitrate on the DTLSICETransport is updated when the Probe method is called, while the one on the SendSideBandwidthEstimation is updated when the packet is sent. So the DTLSICETransport one should be a bit lower than the one in the SendSideBandwidthEstimation, but there should not be such a huge difference.

I will double check and see if there is an issue when printing the value.

@skmax
Copy link
Author

skmax commented May 28, 2022

If you check the code, the size for the probe is calculated based on the period of the last probe was sent. So if the packets on the queue are too big, the size will be accumulated for the next time the probe timer is fired and eventually they will be sent.

Yes, it's been accumulating, but sometimes on small bitrates I can see that it never gets to the point where the instant probeSize is bigger than at least one packet in the history if we're just increasing the target bitrate for 5% on each estimation tick.
Also, I think that it might be not a good idea to send really huge packet right away. The router might drop it because the real bandwidth limitation thus we won't be able to estimate the bandwidth correctly. I suppose it's better to send the probing uniformly during the second instead of occasuanly send a relativly huge amount of data.

The outgoing bitrate on the DTLSICETransport is updated when the Probe method is called, while the one on the SendSideBandwidthEstimation is updated when the packet is sent. So the DTLSICETransport one should be a bit lower than the one in the SendSideBandwidthEstimation, but there should not be such a huge difference.

I will double check and see if there is an issue when printing the value.

Ok, please, let me know what you found regarding the issue. But I doubt it's a pringting problem. The probing started to work more accuratly after using the same accumulator in DTLSICETransport (see PR, please)
Are there any particular reasons to use both accumulalators instead of a single one? As far as I can see it's used only for bandwidth estimation. For me it looks like it makes two points of truth instead of one and the code should take care that both of them are in sync.

@skmax
Copy link
Author

skmax commented Jul 6, 2022

@murillo128 What are your thoughts on this?

Please, let me know if you need more information or logs

@murillo128
Copy link
Member

I am bit hesitant to commit a change in the BWE code without a reliable way of reproducing the issue. Would it be possible to check if the issue is still reproducible with the latest media server?

@skmax
Copy link
Author

skmax commented Aug 10, 2022

Hi @murillo128
BWE doesn't work for me at all in the latest version of media server. Here is the setup: https://github.com/skmax/media-server-demo-node/tree/bwe-debug-demo-latest

Also, here is a demo page with the issue that I described above (it uses v0.113.2): https://github.com/skmax/media-server-demo-node/tree/bwe-debug-demo
BWE freeze from both cases are relatively quickly reproducible for me there, especially in the static or low-quality video when the receiving bitrate's quite low

@murillo128
Copy link
Member

could you describe the steps to reproduce the issue an what is the actual vs expected result?

I ran the bwe-debug-demo-latest and it is working fine. One last thing, are you running it on linux or mac?

murillo128 added a commit that referenced this issue Sep 8, 2022
@murillo128
Copy link
Member

I think I have found the issue, please test with the latest version on the github project

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants