Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to submit a listener notification task. Event loop shut down? #5795

Closed
hyeongguen-song opened this issue Apr 18, 2024 · 3 comments
Closed
Labels
Milestone

Comments

@hyeongguen-song
Copy link

Expected behavior
Should not occur

Actual behavior
Occurred

stack trace
java.util.concurrent.RejectedExecutionException: event executor terminated
	at io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:934)
	at io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:351)
	at io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:344)
	at io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:836)
	at io.netty.util.concurrent.SingleThreadEventExecutor.execute0(SingleThreadEventExecutor.java:827)
	at io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:817)
	at io.netty.util.concurrent.DefaultPromise.safeExecute(DefaultPromise.java:862)
	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:500)
	at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636)
	at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:625)
	at io.netty.util.concurrent.DefaultPromise.setSuccess(DefaultPromise.java:97)
	at io.netty.channel.group.DefaultChannelGroupFuture.setSuccess0(DefaultChannelGroupFuture.java:200)
	at io.netty.channel.group.DefaultChannelGroupFuture.access$400(DefaultChannelGroupFuture.java:41)
	at io.netty.channel.group.DefaultChannelGroupFuture$1.operationComplete(DefaultChannelGroupFuture.java:75)
	at io.netty.channel.group.DefaultChannelGroupFuture$1.operationComplete(DefaultChannelGroupFuture.java:48)
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590)
	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:557)
	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492)
	at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636)
	at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:625)
	at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:105)
	at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetSuccess(AbstractChannel.java:990)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:756)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:731)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:620)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.close(DefaultChannelPipeline.java:1352)
	at io.netty.channel.AbstractChannelHandlerContext.invokeClose(AbstractChannelHandlerContext.java:755)
	at io.netty.channel.AbstractChannelHandlerContext.access$1200(AbstractChannelHandlerContext.java:61)
	at io.netty.channel.AbstractChannelHandlerContext$11.run(AbstractChannelHandlerContext.java:738)
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
	at io.netty.util.concurrent.PromiseTask.run(PromiseTask.java:106)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:1583)

Steps to reproduce or test case
Sorry, I don't know well.

Redis version
Elasticache Redis 7.1.0

Redisson version
3.28.0

Redisson configuration

    @Bean
    fun redissonClient(): RedissonClient {
        val config =
            Config().apply {
                useMasterSlaveServers()
                    .setTimeout(redisCommandTimeout.toInt())
                    .setConnectTimeout(redisConnectTimeout.toInt())
                    .setMasterAddress("redis://${redisReplicaProperties.master.host}:${redisReplicaProperties.master.port}")
                    .apply {
                        redisReplicaProperties.slaves.forEach { slave ->
                            addSlaveAddress("redis://${slave.host}:${slave.port}")
                        }
                    }
            }
        return Redisson.create(config)
    }
@MartinEkInsurely
Copy link
Contributor

We get the same issue on 3.29.0

From what I understand it comes from tasks being completed after EventExecutor shutdown and not being able to register execution completed events to their listeners.

When looking into it I found that there is a quietPeriod and a timeout for shutdowns, with hard coded configuration of 2 and 10 seconds respectively. From my understanding of the code this config is intended to let executing task complete before shutting down.

Further I found that the MasterSlaveConnectionManager effectively drops this config (sets it to 0) before calling shutdownGracefully() on the group.

timeoutInNanos is set to 0 in a couple of places through lines like this one

timeoutInNanos = Math.max(0, timeoutInNanos - System.nanoTime() - startTime);

My guess is that the intended code is

timeoutInNanos = Math.max(0, timeoutInNanos - (System.nanoTime() - startTime));

Instead of subtracting 2 times the current system nano time from the timeoutInNanos variable.

With timeoutInNanos set to 0, the following line also sets the quietPeriod to 0.

Something like this could be added to RedissonTest if my understanding is correct that this is broken. (The test fails on current master but succeeds with the change proposed above)

@Test
    public void testShutdownQuietPeriod() {
        long quietPeriod = TimeUnit.SECONDS.toMillis(1);
        long timeOut = quietPeriod + TimeUnit.SECONDS.toMillis(1);
        RedissonClient r = createInstance();
        long startTime = System.currentTimeMillis();
        r.shutdown(quietPeriod, timeOut, TimeUnit.MILLISECONDS);
        long shutdownTime = System.currentTimeMillis() - startTime;

        Assertions.assertTrue(shutdownTime > quietPeriod);
    }

(If I would have known how to and/or would have had the appropriate access, I would have opened a PR with the suggested change)

@mrniko mrniko added this to the 3.30.0 milestone May 14, 2024
@mrniko mrniko added the bug label May 14, 2024
@mrniko
Copy link
Member

mrniko commented May 14, 2024

@MartinEkInsurely

Thanks for pointing out. It's a bug. You can follow the instruction described in this article for creating a PR https://opensource.com/article/19/7/create-pull-request-github

@mrniko
Copy link
Member

mrniko commented May 15, 2024

Merged

@mrniko mrniko closed this as completed May 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

3 participants