
Add RFC: WebRTC Simulcast #55

Open · wants to merge 1 commit into master
Conversation

Sean-Der
@Sean-Der Sean-Der commented Jul 3, 2023

Summary

Add Simulcast support to WebRTC output.

Simulcast is a WebRTC protocol feature that allows an uploader to send multiple layers of one track. Because this is built into the protocol, every WebRTC ingest already understands it. These layers can be different resolutions, bitrates, and even codecs.

Motivation

Live streaming services offer videos at multiple bitrates and resolutions. This is needed to support the wide variety of connections that users will have.
Today, streaming services decode the incoming video, modify it, and then re-encode it to generate these different quality levels. This has some drawbacks that Simulcast will fix or improve.

  • Generation Loss - Decoding and re-encoding videos causes generation loss. Simulcast means encodes come from the source video which will be higher quality.

  • Higher Quality Encodes - Streamers with dedicated hardware can provide higher quality encodes. Streaming services at scale are optimizing for cost.

  • Lower Latency - Removing the additional encoding/decoding allows video to be delivered to users faster.

  • Reduced server complexity - Users find it difficult to set up RTMP->HLS with transcodes. With Simulcast, setting up a streaming server becomes dramatically easier.


Link to RFC

@murillo128

I think we can improve the basic behavior of simulcast.
By default it would be better to always use the top layer as configured in OBS, then downscale the other two simulcast layers by 1.5x and 2x in dimensions and by 2x and 4x in bitrate. So, for example, if we have 1080p at 6 Mbps, we would have 720p at 3 Mbps and 540p at 1.5 Mbps; and if we are using 720p at 3 Mbps, we would have 480p at 1.5 Mbps and 360p at 750 kbps. This is how Chrome worked for several years, until more advanced APIs became available to control the individual layer encoder parameters.
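The downscaling rule described above can be sketched as a small helper. This is a hypothetical illustration (`simulcast_ladder` is not part of the RFC or any OBS API), assuming the 1.5x/2x dimension and 2x/4x bitrate divisors from the comment:

```python
def simulcast_ladder(width, height, bitrate_kbps):
    """Derive two lower simulcast layers from the top layer:
    dimensions scaled down by 1.5x and 2x, bitrate by 2x and 4x."""
    layers = [(width, height, bitrate_kbps)]
    for dim_div, rate_div in ((1.5, 2), (2, 4)):
        # Round dimensions down to even values, as most encoders require.
        w = int(width / dim_div) // 2 * 2
        h = int(height / dim_div) // 2 * 2
        layers.append((w, h, bitrate_kbps // rate_div))
    return layers

# 1080p at 6 Mbps yields 720p at 3 Mbps and 540p at 1.5 Mbps.
print(simulcast_ladder(1920, 1080, 6000))
# [(1920, 1080, 6000), (1280, 720, 3000), (960, 540, 1500)]
```

This matches the examples in the comment: a 720p/3 Mbps top layer yields 480p at 1.5 Mbps and 360p at 750 kbps.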

In the Advanced config, the user should be able to configure the number of simulcast layers and the width/height/fps and bitrate of the lower simulcast layers.

Given that Simulcast is negotiated in the SDP offer/answer, we could even enable simulcast always and, depending on whether the server supports it, start the lower encodings as needed.

Simulcast is a very important feature for us, and if implemented, it would allow us to deprecate our OBS-webrtc fork and focus on contributing to the main OBS instead. So please, just let me know what I can do to meet all the requirements, both in the RFC and in the implementation.

@Warchamp7
Member

A few initial questions:

  • What needs to be able to be configured on a layer for this to be valuable both for users and services? Resolution, bitrate, anything else?

  • Do we need a way to limit or recommend layer settings per users based on service? Or are services expected to serve whatever layers they receive?

  • Each layer will need an additional encoder spun up to handle a different resolution, bitrate, and whatever else we allow configuration for. This is a significant performance cost we will need to make clear to users, for one thing. More importantly, to my knowledge hardware encoders are limited in the number of sessions they can run, and there is no API or method to derive that information. This is a problem that will need to be solved, potentially with work from NVIDIA/AMD.

@murillo128

Thanks for your feedback @Warchamp7!

Regarding your questions:

  • Resolution and bitrate are enough; fps could be marginally useful.

  • Server support for simulcast is negotiated in the SDP. The number of layers can also be negotiated, although typically servers accept whatever the client publishes. In WebRTC browsers the maximum number of simulcast layers is 3, so I would expect some issues when sending more than 3 layers, but nothing the media servers won't be able to adapt to. FWIW, on dolby.io/millicast we accept whatever number of simulcast layers the client offers.

  • As said before, even just starting with 3 layers (which I think should be supported by most GPUs) would be a huge success. If the user wants to send more than 3 and the GPU doesn't support them, we could use software encoding for the lower layers instead, which should not consume as much CPU as the higher layers.

@Warchamp7
Member

  • Server support for simulcast is negotiated in the SDP. The number of layers can also be negotiated, although typically servers accept whatever the client publishes.

As someone a bit less technical, can you elaborate on this? That sounds to me like the info is transmitted upon session start. If that's the case, then are users expected to configure X many layers and simply be hit with an error on session start if it's too many for the selected service?

Ideally a user can select their service/endpoint, and be presented with information on how many layers they can configure, and any restrictions/recommendations the service might have.

I very much don't like the idea of users having to simply set things up and hope it'll work. Worst case scenario we may have to hardcode limits into services.json with our other service recommendations.

  • As said before, even just starting with 3 layers (which I think should be supported by most GPUs) would be a huge success. If the user wants to send more than 3 and the GPU doesn't support them, we could use software encoding for the lower layers instead, which should not consume as much CPU as the higher layers.

My concern is with detecting "doesn't support". Most (all?) NVIDIA consumer cards have a hard limit of 5 simultaneous sessions, but the realistic limit can be lower than that, depending on the demands of the sessions. Similarly with AMF: while there is no session limit, I think you'll struggle to get more than 2 or 3. I do not believe there is a way to detect available sessions or resources; it will simply fail when the user attempts to start their output. Silently falling back to software encoding could lead to an unexpected performance impact every time they begin an output, if fewer sessions/resources are available one time versus another.

When we are only spinning up a single encoder, this is a binary problem: it either works or it doesn't. The introduction of layers means that on any given day and system usage, 3 layers might work sometimes but not others. I want to make sure we are adequately able to communicate issues to users and have proper error handling to solve them.

@murillo128

The simulcast negotiation in the SDP is described in detail here:
https://www.rfc-editor.org/rfc/rfc8853.html

TL;DR: the client sends an offer with a simulcast attribute and the RIDs (encodings) that it wants to send:

a=rid:0 send
a=rid:1 send
a=rid:2 send
a=simulcast:send 0;1;2

and the server accepts them, reversing the send/recv:

a=rid:0 recv
a=rid:1 recv
a=rid:2 recv
a=simulcast:recv 0;1;2

If the server does not accept simulcast, it will not include the simulcast attribute, and the client will just send one encoding as normal.

In theory, the client could also specify the video encoding properties in the offer and the server would accept the ones it wants, but in reality servers always accept everything sent by the client.

Regarding the maximum number of layers, it would not be a problem to send fewer than all the encodings offered by OBS. WebRTC servers are already used to having a dynamic number of inputs, as browsers may drop (stop sending) simulcast layers based on CPU/bandwidth use.
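As a rough sketch of the negotiation described above, a client could decide whether (and with which RIDs) to start the lower encodings by inspecting the answer. This is a hypothetical, simplified helper, assuming the plain-text SDP lines shown above; it ignores RFC 8853's alternative notation, where comma-separated alternatives can appear within each ';'-separated stream:

```python
def accepted_rids(sdp_answer: str):
    """Return the RIDs the server accepted, in order.
    An answer without an a=simulcast:recv line means simulcast
    was rejected and only one encoding should be sent."""
    for line in sdp_answer.splitlines():
        if line.startswith("a=simulcast:recv "):
            # Simulcast streams are separated by ';' per RFC 8853.
            return line.split(" ", 1)[1].split(";")
    return []

answer = """a=rid:0 recv
a=rid:1 recv
a=rid:2 recv
a=simulcast:recv 0;1;2"""
print(accepted_rids(answer))  # ['0', '1', '2']
```

A real implementation would also cross-check the `a=rid:... recv` lines and their restrictions, but the presence or absence of the simulcast attribute is the key signal.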

@Sean-Der
Author

Sean-Der commented Jul 9, 2023

Hey @Warchamp7, I coded up an implementation of this if you want to try it out! Sean-Der/obs-studio#2

It adds a checkbox to enable/disable Simulcast.

You can use it against

You will have a drop-down to switch between your different quality levels. For quicker switching between layers you can set the Custom Encoder Settings to keyint=30 aq-mode=0 subme=0 no-deblock sync-lookahead=3. This should be handled better server side, but I am trying to keep Broadcast Box as simple as possible.


What needs to be able to be configured on a layer for this to be valuable both for users and services? Resolution, bitrate, anything else?

I personally think a simple checkbox is enough for a first version. In the future I would like to see an advanced mode where more can be configured. In the vast majority of cases I think streamers want uniformity.

Do we need a way to limit or recommend layer settings per users based on service? Or are services expected to serve whatever layers they receive?

This is discovered at connect time. My plan is to disconnect/reject users who have configured their client incorrectly. In their stream manager view, they will get a notification explaining why. I want to handle this the same way as a user sending excessive bitrate.

An open-source book on how WebRTC works is available if you are curious about the details: WebRTC for the Curious. If you have any specific questions, I would love to answer them :)

This is a significant performance cost we will need to make clear to users for one thing. This is a problem that will need to be solved, potentially with work from NVIDIA/AMD

Why do you believe this will be a significant performance cost? If you have done conferencing in your browser, you have used Simulcast (Hangouts, Jitsi...). LiveKit wrote an article about how the industry sees it.

On my local machine my CPU usage goes from 5% -> 8% with Simulcast enabled with x264.

This is a significant performance cost we will need to make clear to users for one thing. When we are only spinning up a single encoder, this is a binary problem.

What does OBS do today when encoding/scaling/compositing costs are high? Do we have any automated tools that adjust configurations or help users debug? I don't think Simulcast is a unique situation. The existing situation isn't binary either: the performance of a single encoder is influenced by what you are encoding, how much you are encoding, and the settings you are using.

@voluntas

voluntas commented Jul 13, 2023

Thanks for the great suggestions. We will be working on supporting this feature in our products. Since you are here, please allow me to join the discussion.

Simulcast is a WebRTC (WHIP)-specific feature, so I think some people may feel uncomfortable if it is in the "Streaming" section.

I think it would be easier to understand if a checkbox for Simulcast were provided in the "WHIP" settings section, since it is a setting for whether or not a=simulcast is included in the client's offer.

@Sean-Der
Author

Great suggestion @voluntas!

I have moved it. New builds from my PR now have it on the Stream tab.

@chhofi

chhofi commented Jul 13, 2023

Hey @Sean-Der, I saw your post on LinkedIn and wanted to try this great new implementation. OBS seems to stream to the server, but unfortunately I just see a spinning wheel... Somehow the simulcast got identified, though, because the quality level option gets displayed. Any suggestions on how I can further debug this issue?

@Fenrirthviti
Member

Hey @Sean-Der, I saw your post on LinkedIn and wanted to try this great new implementation. OBS seems to stream to the server, but unfortunately I just see a spinning wheel... Somehow the simulcast got identified, though, because the quality level option gets displayed. Any suggestions on how I can further debug this issue?

This is an RFC, not a place to post for support, nor should this be considered an implementation that is ready for actual testing past the design in OBS at this stage.

Please do not solicit support feedback on this RFC, only design.

@Sean-Der
Author

Hey @chhofi

I would love to help! Mind moving the conversation to Sean-Der/obs-studio#2?

@chhofi

chhofi commented Jul 14, 2023

@Fenrirthviti all right. Thanks for the clarification. @Sean-Der Sure, thx :)

@voluntas

@Sean-Der Perfect! Wonderful.

@murillo128

The UX/UI could be very similar to what Twitch is proposing for its enhanced broadcasting: the user sets a maximum bitrate and a number of reserved encoder instances.

The SDP offer will use the number of reserved encoder instances, and the server will be able to restrict that number in the SDP answer, which becomes the final number of encoder instances to use.

@murillo128

By the way, the screenshot above was taken, without modifications, from this repo: https://github.com/amazon-contributing/upstreaming-to-obs-studio/tree/30.0.2-enhanced-broadcasting-v11
