ember h2 client hangs after long running connection dies #7359

Open
nefilim opened this issue Jan 14, 2024 · 2 comments
Labels
bug Determined to be a bug in http4s module:ember-client module:ember-core

Comments

nefilim commented Jan 14, 2024

17:30:00.011 [io-compute-3] ERROR o.h.e.c.EmberClientBuilderCompanionPlatform  - ReadLoop has errored
java.io.IOException: Connection reset
	at java.base/sun.nio.ch.UnixAsynchronousSocketChannelImpl.finishRead(UnixAsynchronousSocketChannelImpl.java:425)
	at java.base/sun.nio.ch.UnixAsynchronousSocketChannelImpl.finish(UnixAsynchronousSocketChannelImpl.java:195)
	at java.base/sun.nio.ch.UnixAsynchronousSocketChannelImpl.onEvent(UnixAsynchronousSocketChannelImpl.java:217)
	at java.base/sun.nio.ch.EPollPort$EventHandlerTask.run(EPollPort.java:306)
	at java.base/sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:113)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
	at delay @ fs2.io.net.SocketCompanionPlatform$AsyncSocket.$anonfun$readChunk$1(SocketPlatform.scala:120)
	at async @ fs2.io.net.SocketCompanionPlatform$AsyncSocket.readChunk(SocketPlatform.scala:114)
	at flatMap @ fs2.io.net.SocketCompanionPlatform$BufferedReads.$anonfun$read$1(SocketPlatform.scala:82)
	at delay @ fs2.io.net.SocketCompanionPlatform$BufferedReads.withReadBuffer(SocketPlatform.scala:52)
org.http4s.ember.core.h2.H2Connection$KillWithoutMessage
java.io.IOException: Connection reset
	at java.base/sun.nio.ch.UnixAsynchronousSocketChannelImpl.finishRead(UnixAsynchronousSocketChannelImpl.java:425)
	at java.base/sun.nio.ch.UnixAsynchronousSocketChannelImpl.finish(UnixAsynchronousSocketChannelImpl.java:195)
	at java.base/sun.nio.ch.UnixAsynchronousSocketChannelImpl.onEvent(UnixAsynchronousSocketChannelImpl.java:217)
	at java.base/sun.nio.ch.EPollPort$EventHandlerTask.run(EPollPort.java:306)
	at java.base/sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:113)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
	at delay @ fs2.io.net.SocketCompanionPlatform$AsyncSocket.$anonfun$readChunk$1(SocketPlatform.scala:120)
	at async @ fs2.io.net.SocketCompanionPlatform$AsyncSocket.readChunk(SocketPlatform.scala:114)
	at flatMap @ fs2.io.net.SocketCompanionPlatform$BufferedReads.$anonfun$read$1(SocketPlatform.scala:82)
	at delay @ fs2.io.net.SocketCompanionPlatform$BufferedReads.withReadBuffer(SocketPlatform.scala:52)

I have the following scenario:

A long-running h2 connection to a service, opened with the following request:

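import org.http4s.{Header, Headers, HttpVersion, MediaType, Method, Request, Uri}
import org.http4s.headers.Accept
import org.typelevel.ci._ // provides the ci"..." interpolator

// GET the server-sent event stream over HTTP/2
val req = Request[F](
  Method.GET,
  Uri.unsafeFromString(s"https://${config.host}/eventstream"),
  httpVersion = HttpVersion.`HTTP/2`,
  headers = Headers(
    Header.Raw(ci"hue-application-key", config.apiKey),
    Accept(MediaType.`text/event-stream`),
  )
)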

Events are processed in an fs2 Stream. A second fs2 Stream, fed from a Queue, sends commands to the same service:

Stream
  .fromQueueUnterminated(bridgeQueues.commandQueue)
  .parEvalMap(2) { 
    rateLimited(semaphore, command => client.publishCommand(command).flatMap {
      case r =>
        logger.debug(s"publish Command response: $r")
    })
  }

Something goes wrong with the network attached to the host running this client: I can see that a persistent connection (not http4s) to another service is detected as failed and recreated.
However, the above exception is only thrown once a command is submitted to the queue and an attempt is made to send it, which could be an hour after the other persistent connection was recreated. Once http4s logs this exception, the connection is dead (netstat shows nothing), the ember client is unresponsive, and no further SSE events are received until the service (the JVM process) is restarted.
I've tried multiple disparate approaches but the behaviour is always the same.

@armanbilge armanbilge added bug Determined to be a bug in http4s module:ember-client module:ember-core labels Jan 14, 2024

nefilim commented Jan 14, 2024

I tried to distill an example:

https://gist.github.com/nefilim/b47805255431b00866c07f3375fa36ff

I tried to simulate the problem by running this on a laptop and then putting the laptop to sleep. Unfortunately it's not quite the same problem: the readLoop does error, but so does the writeLoop, and those errors do not get propagated.

The connection receives a GoAway, which is propagated in the error channel and causes the stream to be rebuilt, the client to be recreated and the connection to be re-established, which is the desired behaviour.

It seems that in my application the readLoop errors, leaving the client in a bad state, but no further errors are propagated to allow cleanup/recreation. I'm not quite sure how to simulate that.
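
For reference, the rebuild-on-error behaviour described above can be sketched roughly as follows. This is not the code from the gist: `RebuildSketch`, `resilient`, `connect`, `events` and `backoff` are hypothetical names, with `connect` standing in for whatever builds the ember client and `events` for whatever reads the SSE stream from it. It only helps when an error actually reaches the stream, which is exactly what does not seem to happen in the failing case.

```scala
import cats.effect.{IO, Resource}
import fs2.Stream
import scala.concurrent.duration._

object RebuildSketch {
  // `connect` and `events` are hypothetical placeholders, not http4s APIs:
  // `connect` acquires a fresh client/connection, `events` reads from it.
  def resilient[C, A](
      connect: Resource[IO, C],
      events: C => Stream[IO, A],
      backoff: FiniteDuration
  ): Stream[IO, A] =
    Stream
      .resource(connect)
      .flatMap(events)
      .handleErrorWith { _ =>
        // On any error: wait, then start over with a freshly acquired resource.
        Stream.sleep_[IO](backoff) ++ resilient(connect, events, backoff)
      }
}
```

If the underlying connection fails silently, as in the stack trace at the top of this issue, the handler above never runs and the stream simply stops producing events.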

nefilim commented Jan 18, 2024

When this happens I have two fibers waiting in the same place 🤔

cats.effect.IOFiber@23480178 WAITING
 ├ rethrow$extension @ fs2.Compiler$Target.$anonfun$compile$1(Compiler.scala:157)
 ├ flatMap @ org.http4s.ember.core.h2.H2Stream.getResponse(H2Stream.scala:413)
 ├ main$ @ org.home4s.server.Main$.main(Main.scala:28)
 ╰ main$ @ org.home4s.server.Main$.main(Main.scala:28)
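
The dump shows fibers parked in `H2Stream.getResponse`. One way to turn such an indefinite wait into a visible error is to bound each request with a timeout. The following is a minimal sketch, assuming the waiting request is cancelable (not verified for ember's h2 client); `TimeoutSketch`, `bounded`, `send` and `limit` are hypothetical names, with `send` standing in for a single call such as `client.publishCommand(command)`.

```scala
import cats.effect.IO
import scala.concurrent.duration._

object TimeoutSketch {
  // Runs `send`, failing with a java.util.concurrent.TimeoutException
  // if no result arrives within `limit`. The outcome is returned as an
  // Either so the caller can decide whether to rebuild the client.
  def bounded[A](send: IO[A], limit: FiniteDuration): IO[Either[Throwable, A]] =
    send.timeout(limit).attempt
}
```

A `Left` holding a `TimeoutException` could then be used as the signal to tear down and recreate the client, instead of waiting for an error that never arrives.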
