Race condition in shutdown logic #7321

Open
tkaitchuck opened this issue Nov 21, 2023 · 0 comments

Sometimes errors are generated during shutdown. For example:

Nov 20, 2023 4:29:32 PM com.google.common.util.concurrent.ListenerCallQueue$PerListenerQueue run
SEVERE: Exception while executing callback: io.pravega.common.concurrent.Services$ShutdownListener@5986eba3 failed({from = STOPPING, cause = java.util.concurrent.TimeoutException: Timeout expired while waiting for the Service to shut down.})
java.util.concurrent.RejectedExecutionException: Task ThreadPoolScheduledExecutorService.ScheduledRunnable(id=83474, isDelayed=false, scheduledTimeNanos=0, task=java.util.concurrent.Executors$RunnableAdapter@2d979643[Wrapped task = io.pravega.segmentstore.server.reading.StreamSegmentReadIndex$$Lambda$1832/0x000000084074c440@184194bd], future=java.util.concurrent.CompletableFuture@73e93d13[Not completed]) rejected from java.util.concurrent.ThreadPoolExecutor@f79a760[Shutting down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 25259]
	at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055)
	at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825)
	at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355)
	at io.pravega.common.concurrent.ThreadPoolScheduledExecutorService.execute(ThreadPoolScheduledExecutorService.java:247)
	at io.pravega.segmentstore.server.reading.StreamSegmentReadIndex.close(StreamSegmentReadIndex.java:176)
	at io.pravega.segmentstore.server.reading.ContainerReadIndex.closeIndex(ContainerReadIndex.java:425)
	at io.pravega.segmentstore.server.reading.ContainerReadIndex.lambda$0(ContainerReadIndex.java:434)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
	at io.pravega.segmentstore.server.reading.ContainerReadIndex.closeAllIndices(ContainerReadIndex.java:434)
	at io.pravega.segmentstore.server.reading.ContainerReadIndex.close(ContainerReadIndex.java:115)
	at io.pravega.segmentstore.server.containers.StreamSegmentContainer.close(StreamSegmentContainer.java:294)
	at io.pravega.segmentstore.server.store.StreamSegmentContainerRegistry.unregisterContainer(StreamSegmentContainerRegistry.java:203)
	at io.pravega.segmentstore.server.store.StreamSegmentContainerRegistry.handleContainerFailure(StreamSegmentContainerRegistry.java:194)
	at io.pravega.segmentstore.server.store.StreamSegmentContainerRegistry.lambda$2(StreamSegmentContainerRegistry.java:187)
	at io.pravega.common.concurrent.Services$ShutdownListener.failed(Services.java:138)
	at com.google.common.util.concurrent.AbstractService$5.call(AbstractService.java:562)
	at com.google.common.util.concurrent.AbstractService$5.call(AbstractService.java:559)
	at com.google.common.util.concurrent.ListenerCallQueue$PerListenerQueue.run(ListenerCallQueue.java:205)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at io.pravega.common.concurrent.ThreadPoolScheduledExecutorService$ScheduledRunnable.run(ThreadPoolScheduledExecutorService.java:197)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)

This is caused by StreamSegmentContainer having two shutdown paths. One is the close method, which is ultimately invoked when ServiceBuilder's close closes the containerRegistry. The other is an "on stop" listener that StreamSegmentContainerRegistry registers, which also shuts down the container.
Thus the container can be shut down from two different threads in parallel. Depending on which one wins, the lagging thread may encounter errors because components have already been shut down.
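
One common way to make a double shutdown harmless is an idempotent close guarded by an AtomicBoolean. The following is only a minimal sketch of that general technique, not the Pravega code or a proposed patch: GuardedContainer, cleanupRuns and closeFromTwoThreads are hypothetical names, and I have not checked whether the real classes already use a similar guard. The two threads stand in for the explicit close path and the "on stop" listener path.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class IdempotentCloseSketch {

    /** Hypothetical stand-in for a component that can be closed from two paths. */
    public static final class GuardedContainer implements AutoCloseable {
        private final ExecutorService executor = Executors.newSingleThreadExecutor();
        private final AtomicBoolean closed = new AtomicBoolean(false);
        private final AtomicInteger cleanupRuns = new AtomicInteger();

        @Override
        public void close() {
            // Only the first caller wins the compareAndSet; any later caller returns
            // immediately instead of touching components that are already shut down.
            if (!closed.compareAndSet(false, true)) {
                return;
            }
            cleanupRuns.incrementAndGet();
            executor.shutdown();
        }

        public int cleanupRuns() {
            return cleanupRuns.get();
        }
    }

    /** Closes one container from two threads and reports how often cleanup ran. */
    public static int closeFromTwoThreads() {
        GuardedContainer container = new GuardedContainer();
        Thread explicitClose = new Thread(container::close, "explicit-close");
        Thread stopListener = new Thread(container::close, "on-stop-listener");
        explicitClose.start();
        stopListener.start();
        try {
            explicitClose.join();
            stopListener.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for close", e);
        }
        return container.cleanupRuns();
    }

    public static void main(String[] args) {
        System.out.println("cleanup ran " + closeFromTwoThreads() + " time(s)");
    }
}
```

Note that with this guard the losing caller returns before the winner has finished cleaning up, so it does not by itself help if an executor shared with the container is shut down elsewhere while close is still running, which the stack trace above suggests may be part of the problem.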
