Alternative using binary instead of json for updates #455

Open
vincentfretin opened this issue Mar 2, 2024 · 3 comments

@vincentfretin
Member

p2pcf

https://webspaces.space by @gfodor uses a fork of Hubs' networked-aframe with a specific P2PCFAdapter.js adapter built on p2pcf, so only a Cloudflare Worker needs to be hosted for the signaling part. This adapter falls into the WebRTC mesh topology category like easyrtc, not SFU like the janus adapter.

Note: the networked-aframe repo in the networked-aframe organization has all the changes from Hubs' fork, plus more examples and fixes for the easyrtc adapter and the networked-video-source component for A-Frame 1.5.0.

Looking at the diff, there are a lot of changes to use flatbuffers instead of JSON for naf updates, if I understand it correctly.
I'm not sure whether one can simply switch an existing naf project that uses custom registered networked schemas to this fork; @gfodor can shed some light on whether this is possible and correct me if I'm saying something technically wrong here.
Anyway, I think it's an interesting code base to learn from, so I'm creating this issue for more visibility.

I guess using binary transfer instead of JSON is better for bandwidth and CPU because there is no JSON serialization/deserialization.
It would be interesting if someone knows of studies or benchmarks on this subject.
For 8 people in peer to peer it probably doesn't make much of a difference. For 30 or more people using an SFU it can probably start to make a difference for the browser on low-end devices. On the server side, lower CPU usage probably also means a lot more rooms can be supported.
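To make the size difference concrete, here is a minimal sketch (hand-packed bytes, not the actual flatbuffers schema from the fork) comparing a JSON position update with a binary equivalent:

```js
// JSON: key names and number strings are re-serialized on every update,
// and the receiver pays a parse cost.
const update = { networkId: 'abc123', position: { x: 1.5, y: 0.0, z: -2.25 } };
const jsonBytes = new TextEncoder().encode(JSON.stringify(update));
console.log(jsonBytes.byteLength); // roughly 60 bytes

// Binary: fixed layout, no key names, decoded with cheap DataView reads.
const buf = new ArrayBuffer(18); // 6-byte id + 3 * float32
new TextEncoder().encodeInto('abc123', new Uint8Array(buf, 0, 6));
const view = new DataView(buf);
view.setFloat32(6, 1.5, true);    // x, little-endian
view.setFloat32(10, 0.0, true);   // y
view.setFloat32(14, -2.25, true); // z
console.log(buf.byteLength); // 18 bytes
```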

wsrtc

Another interesting code base is webaverse's wsrtc, which only uses WebSocket: it transfers audio over it using an Opus codec in WebAssembly, and transfers other data in binary. Below is my analysis from when I studied this code in Aug 2023, but I never did anything with it.

- The index.js file in the wsrtc repo, which runs a simple nodejs server with express and ws, wasn't kept up to date; the webaverse project has its own integration, search for wsrtc in the webaverse app repo.
- This wsrtc code is not documented and is only used in the webaverse project as a dependency.
- I thought at first wsrtc was using WebCodecs with the Opus codec for audio, but that code is actually commented out; instead, an alternative implementation based on a WebWorker with libopus in WebAssembly and an AudioWorklet is used to encode and decode audio chunks with the Opus codec, rather than MediaStreamTrackProcessor/AudioEncoder/AudioDecoder from the WebCodecs API, which is not yet available in all browsers.
- Encoded audio chunks are sent over a WebSocket opened with binaryType arraybuffer (instead of the default blob).
- The protocol implemented here to transfer other types like number, string, and Uint8Array is well thought out: everything is encoded to bytes and transferred over the same WebSocket (a sketch follows this list). More complex state built from Map and Array, for example the listing of players, uses zjs, a fork of yjs optimized for byte transfer.
- This solution should work in all browsers; the only constraint I can see is the use of TextEncoder.encodeInto, which is available since Safari iOS 14.5 and Safari 14.1 on macOS.
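As an illustration of that kind of type-tagged byte protocol, a hypothetical sketch (not wsrtc's actual wire format) could look like this:

```js
// Hypothetical type-tagged encoder in the spirit of wsrtc's protocol:
// a 1-byte tag followed by the payload, sent over one binary WebSocket.
const TAG_NUMBER = 0, TAG_STRING = 1, TAG_BYTES = 2;
const textEncoder = new TextEncoder();

function encodeMessage(value) {
  if (typeof value === 'number') {
    const out = new Uint8Array(1 + 8);
    out[0] = TAG_NUMBER;
    new DataView(out.buffer).setFloat64(1, value, true);
    return out;
  }
  if (typeof value === 'string') {
    // Worst case 3 UTF-8 bytes per UTF-16 code unit; encodeInto writes in place.
    const out = new Uint8Array(1 + value.length * 3);
    out[0] = TAG_STRING;
    const { written } = textEncoder.encodeInto(value, out.subarray(1));
    return out.subarray(0, 1 + written);
  }
  if (value instanceof Uint8Array) {
    const out = new Uint8Array(1 + value.byteLength);
    out[0] = TAG_BYTES;
    out.set(value, 1);
    return out;
  }
  throw new Error('unsupported type');
}

// const ws = new WebSocket('wss://example.com'); // placeholder URL
// ws.binaryType = 'arraybuffer'; // instead of the default 'blob'
// ws.send(encodeMessage('hello'));
```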

The solution here doesn't support screen sharing with a video codec; recreating a video codec in WebAssembly in an efficient way would really be too much work.
I see in the zjs repo that support for images in the encoder/decoder was added, meaning it can send ImageData objects over the WebSocket. An ImageData can be obtained by drawing an HTMLVideoElement to a canvas, so some sort of degraded screen sharing with an image updated every few seconds could be implemented (to be tested); see the sketch below.
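A sketch of that degraded screen-sharing idea, assuming a video element showing the screen capture (sending the frame through zjs is left out):

```js
// Grab an ImageData snapshot of the shared screen every few seconds.
const video = document.querySelector('video'); // hypothetical screen-share preview
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');

setInterval(() => {
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  const frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
  // frame is an ImageData (RGBA bytes in frame.data); hand it to the
  // zjs encoder here to send it over the WebSocket.
}, 3000);
```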

@gfodor
Contributor

gfodor commented Mar 5, 2024

Heya, the reason I converted NAF to use flatbuffers is that, when profiling the Elixir servers for Hubs during large events, the two major CPU bottlenecks on the server were the TLS encryption (unavoidable) and the JSON payload parsing. This codebase is a derivative of work I did previously that was trying to optimize this server-side bottleneck, and I carried it forward to this adapter. Certainly binary serialization leads to more efficient use of the client CPU as well, but I don't think that alone would have been a sufficiently motivating reason to do it for a p2p library.

@gfodor
Contributor

gfodor commented Mar 5, 2024

Wrt using websockets for audio, I would definitely caution against that, as I would imagine the head-of-line blocking on spotty networks would be very problematic for maintaining quality and avoiding stutters. If you wanted to go this route you could leverage WebTransport instead and use unreliable delivery streams.
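For reference, a minimal sketch of unreliable delivery with WebTransport datagrams (the endpoint URL is a placeholder and requires an HTTP/3 server):

```js
const transport = new WebTransport('https://relay.example.com:4433/audio');
await transport.ready;

// Datagrams are unordered and may be dropped, so a late packet never
// blocks newer audio the way a TCP-backed WebSocket stream would.
const writer = transport.datagrams.writable.getWriter();

async function sendAudioChunk(encodedChunk /* Uint8Array from the Opus encoder */) {
  await writer.write(encodedChunk);
}
```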

@vincentfretin
Member Author

Thanks for the historical context. Hubs is still using the Elixir Jason library for JSON parsing today. I recently saw on Twitter that someone optimized JSON parsing in Erlang with better performance than Jason, but I couldn't find the discussion again. Either way, you still end up parsing JSON.

I'm wondering how the Rust serde library used in the janus sfu compares with the Elixir Jason library. Several of us are still using this solution without any additional backend.

For those who don't know, the janus sfu was the previous SFU solution in Hubs, before they switched to the "dialog" service using the nodejs/mediasoup WebRTC stack. Before Hubs switched to mediasoup, it used janus-plugin-sfu and naf-janus-adapter 3.0.x, but with Phoenix Channel/Presence (WebSocket) for the signaling part and the data, by passing a function for adapter.reliableTransport and adapter.unreliableTransport (those can be "websocket", "datachannel", or a function); see the sketch below. So the janus sfu was used only for audio and video, not data. There were changes on naf-janus-adapter master with the syncOccupants api to better handle the signaling part, letting Phoenix be the source of truth for the connected participants and avoiding some ghost-participant issues. I'm not sure those changes ever made it to production; the Hubs team started working on the mediasoup adapter, without the signaling part, soon after.
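A sketch of that transport override, assuming the naf-janus-adapter options described above (sendViaPhoenixChannel is a hypothetical app-level helper, and the callback signature is an assumption):

```js
// Route NAF data through Phoenix instead of the janus data channels.
// reliableTransport/unreliableTransport accept "websocket", "datachannel",
// or a function; the (clientId, dataType, data) signature is assumed here.
NAF.connection.adapter.reliableTransport = (clientId, dataType, data) => {
  sendViaPhoenixChannel('naf', { clientId, dataType, data }); // hypothetical helper
};
NAF.connection.adapter.unreliableTransport = (clientId, dataType, data) => {
  sendViaPhoenixChannel('naf', { clientId, dataType, data });
};
```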

Yes, I agree, WebSocket is not meant to stream media. WebTransport and WebCodecs would be a nice alternative to WebRTC once they are supported in all browsers, including Safari on iOS.
I looked into this for a client that had heavy restrictions on opening ports on their corporate network. In the end they could do WebRTC via a TURN server with a relay-tls ICE candidate. From my understanding, relay-tcp and relay-tls have the head-of-line blocking issue; relay-udp would have been fine, but they blocked all UDP traffic. WebTransport uses UDP because it runs over HTTP/3 and QUIC, if I read correctly, so unless the client allows UDP ports it won't work either.
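For completeness, forcing WebRTC through a TURN server over TLS looks like this (the turns: URL and credentials are placeholders):

```js
const pc = new RTCPeerConnection({
  iceServers: [{
    urls: 'turns:turn.example.com:443?transport=tcp', // TLS on 443, usually allowed
    username: 'user',
    credential: 'secret',
  }],
  // Gather only relay candidates, skipping host/srflx, so all media is
  // forced through the TURN server.
  iceTransportPolicy: 'relay',
});
```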
