You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The YDocWebSocketHandler can be dramatically simplified and reduced in complexity. We may also be able to remove the dependency on pycrdt-websocket.
Context: The Yjs protocol
Here I present a brief overview of the Yjs protocol for context.
Within the Yjs protocol, there are two protocol layers:
Sync protocol layer -- handles synchronization of real-time document contents (i.e. the Ydoc object)
Awareness protocol layer -- handles synchronization of real-time collaborator state (e.g. cursor position)
Since jupyter_collaboration isn't a document peer itself, it doesn't need to use the awareness protocol layer, we just need to broadcast awareness messages to all peers when we receive one.
The sync protocol, in turn, has 3 message types:
Sync step 1 (SS1): A message that both 1) retrieves missing updates from another peer, and 2) instructs the other peer to stream all of their updates to this peer.
Sending a SS1 message to a peer is analogous to subscribing to that peer.
Sync step 2 (SS2): A message containing missing updates for a peer, as a reply to a SS1 message.
Sync update (SU): A message containing a transaction committed by a peer, after a connection was established with a SS1 + SS2 handshake.
Suppose we have the simplest Yjs network with a single node named node A. Suppose node B connects to this network. Here is how the connection should be established according to the sync protocol layer:
Node B sends a SS1 message.
Node A receives the SS1 message, and replies with a SS2 message. Node A then sends a SS1 message as well to subscribe to their updates.
When Node B handles the SS2 message, this completes the subscription of Node B to Node A.
Node B receives the SS1 message, and replies with a SS2 message.
Node A receives the SS2 message.
When Node A handles the SS2 message, this completes the subscription of Node A to Node B.
After the Yjs connection is established through this process, Node A and Node B then stream SU messages to one another to synchronize document state.
Motivation and impact
The minimum requirements of YDocWebSocketHandler are fairly concise:
Load the Ydoc in memory based on some ID in the URL path (currently room ID)
On receiving SS1 messages:
Reply with a SS2 message, then send a SS1 message.
On receiving SS2 or SU messages:
Apply it to the Ydoc, then broadcast to all peers except the sender
On receiving awareness messages:
Inspect the awareness message to know which clients have joined / left the network (current implementation).
Broadcast to all peers except the sender.
However, the current handler implementation has some issues:
The handling of these Yjs messages is split across two repos and mostly defined in pycrdt-websocket.
The handling of these Yjs messages can mostly be done in the Tornado handler class directly.
jupyter_collaboration relies on a YRoom abstraction provided by pycrdt-websocket, which can be simplified as well.
The code relies on a lot of async abstractions (some of them provided by anyio), but the minimum requirements of the Yjs protocol do not seem to require them.
Unifying the source code in a single repo makes releasing this software much easier. This, along with reducing the complexity of the source, will allow for others to contribute to this project more easily. :hugging_face:
Proposal
I will open a draft PR that proposes a simplified implementation. I will focus on the following areas:
Remove the dependency on pycrdt-websocket, merging existing logic into the YDocWebSocketHandler directly
Simplify async code
Suggest a more intuitive interface for these classes
The text was updated successfully, but these errors were encountered:
Since jupyter_collaboration isn't a document peer itself, it doesn't need to use the awareness protocol layer, we just need to broadcast awareness messages to all peers when we receive one.
I disagree, the backend is just another CRDT peer. But it's also an authority, and it should not blindly broadcast awareness messages. Imagine a client that changes their name in the awareness, for instance a student using the teacher's name, this should not be allowed. The backend has the authority to not accept these awareness messages if the awareness name doesn't match the authenticated user.
There is some logic that I plan to activate in the future.
On receiving SS2 or SU messages:
Apply it to the Ydoc, then broadcast to all peers except the sender
Wrong, updates must be echoed to the sender as well.
On receiving awareness messages:
Inspect the awareness message to know which clients have joined / left the network (current implementation).
Broadcast to all peers except the sender.
Same comment.
Unifying the source code in a single repo makes releasing this software much easier.
pycrdt-websocket is and should remain Jupyter-agnostic. It is the Python equivalent to y-websocket. I've not encountered any issue related to releasing.
I will open a draft PR that proposes a simplified implementation. I will focus on the following areas:
Remove the dependency on pycrdt-websocket, merging existing logic into the YDocWebSocketHandler directly
I suggest not spending time on this, as I will likely not accept this change.
Brief description
The
YDocWebSocketHandler
can be dramatically simplified and reduced in complexity. We may also be able to remove the dependency onpycrdt-websocket
.Context: The Yjs protocol
Here I present a brief overview of the Yjs protocol for context.
Within the Yjs protocol, there are two protocol layers:
Since
jupyter_collaboration
isn't a document peer itself, it doesn't need to use the awareness protocol layer, we just need to broadcast awareness messages to all peers when we receive one.The sync protocol, in turn, has 3 message types:
Suppose we have the simplest Yjs network with a single node named node A. Suppose node B connects to this network. Here is how the connection should be established according to the sync protocol layer:
After the Yjs connection is established through this process, Node A and Node B then stream SU messages to one another to synchronize document state.
Motivation and impact
The minimum requirements of
YDocWebSocketHandler
are fairly concise:However, the current handler implementation has some issues:
pycrdt-websocket
.jupyter_collaboration
relies on a YRoom abstraction provided bypycrdt-websocket
, which can be simplified as well.anyio
), but the minimum requirements of the Yjs protocol do not seem to require them.Unifying the source code in a single repo makes releasing this software much easier. This, along with reducing the complexity of the source, will allow for others to contribute to this project more easily. :hugging_face:
Proposal
I will open a draft PR that proposes a simplified implementation. I will focus on the following areas:
pycrdt-websocket
, merging existing logic into theYDocWebSocketHandler
directlyThe text was updated successfully, but these errors were encountered: