New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ember server fix frames #7059
base: series/0.23
Are you sure you want to change the base?
Ember server fix frames #7059
Conversation
Thanks, this is great! Any chance we can add some test coverage for at least one of these fixes? 🙂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @armanbilge, a dedicated test for the fixed case would be highly appreciated.
@danicheg @armanbilge I've added some tests |
def writeFrame(frame: WebSocketFrame): F[Unit] = | ||
frameToBytes(frame).traverse_(c => timeoutMaybe(socket.write(c), idleTimeout)) | ||
|
||
def writeFrame(frame: WebSocketFrame): F[Unit] = { | ||
val bytes = frameToBytes(frame) | ||
timeoutMaybe(socket.write(bytes), idleTimeout) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sometimes frames got corrupted when writing to a socket. This is because the frame headers and payload are not atomically written to the socket and the order can be changed.
Sorry, I'm having a hard time understanding this. How can the order be changed? Sequential calls to write
should be sent sequentially on the socket. If that's not the case, I suspect that there is a very serious bug with the JDK/OS.
Can you provide a test that reproduces this issue? I tried running the test you added "do not corrupt frames if handle them concurrently"
several times with the old code but it always passed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, do you mean that concurrent writes can be interleaved? I think this is different from an ordering problem: the order is correct, but concurrent writes are being mixed up with each other.
http4s/core/shared/src/main/scala/org/http4s/websocket/FrameTranscoder.scala
Lines 141 to 144 in 43efe6d
Array[ByteBuffer](buff) | |
} else { | |
buff.flip | |
Array[ByteBuffer](buff, in.data.toByteBuffer) |
Now I see what you mean about the headers and the payload being separated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we just change the frame transcoder to copy the data instead the same, singular byte buffer, and return that? in.data.toByteBuffer
is probably copying anyway, so might as well copy it into a single buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@armanbilge Yes, you totally right - headers and payload for several frames can be mixed with each other.
For example:
Right case: [header1->payload1, header2->payload2]
Wrong case: [header1->header2, payload1->payload2]
Sure, we can change FrameTranscoder and return one byte buffer for frame in frameToBuffer
.
I will add a commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation @acteek!
Sorry, I'm actually still confused in what situation this can happen.
To create a WebSocket application, you use the WebSocketBuilder
. Specifically, to send frames to the client you run a Stream
. Because a Stream
is sequential, there should never be concurrent writes to the socket.
http4s/ember-server/jvm/src/test/scala/org/http4s/ember/server/EmberServerWebSocketSuite.scala
Line 66 in 44facfa
wsBuilder.build(sendReceive) |
I think this also explains why the test does not reproduce the issue.
Sometimes frames got corrupted when writing to a socket.
How did you discover this? It would be very helpful if we could see the original reproducer, so that we can solve the right problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@acteek oh, are you using Ember Server on Node.js?
Stream is sequential until we do not run async effect to write to something through IO. After that, we are not able to guarantee element sequential.
This sounds like a bug then that maybe we need to fix in FS2. Writing to a socket should always be sequential.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, are you using Ember Server on Node.js?
Oh, Sorry, I use a JVM server. But I think the main cause is the same - socket.write
is an async function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@acteek got it, thanks for clarifying that!
But I think the main cause is the same -
socket.write
is an async function.
I'm not sure that's right—even though write is async, we backpressure until the write completes. But if that is right, then this is a serious issue in our Socket
layer, because we rely on this assumption not just in websockets, but also in the rest of Ember, and also Skunk, and any other libraries using FS2 socket. So everything would break and we need to fix the root issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @armanbilge.
Do you have a plan for merging this solution for broken frames? Or are we gonna try to fix it on FS2 side? I'd like to avoid local fixies for my project and use it as a library solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @acteek, sorry for letting this hang. As I described above, the symptoms you are describing sounds like a serious problem that would affect all projects using FS2 sockets. But without a reproducer of the issue it's not clear to me what the actual problem is. Is there any way you would be able to help with that?
Fixes #6639 Fixes #5128