Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ByteBuf leak in LocalNetworkClient #1686

Open
zzmao opened this issue Nov 5, 2020 · 1 comment
Open

ByteBuf leak in LocalNetworkClient #1686

zzmao opened this issue Nov 5, 2020 · 1 comment
Assignees
Labels

Comments

@zzmao
Copy link
Contributor

zzmao commented Nov 5, 2020

It looks like LocalNetworkClient has some memory leak. @lightningrob @justinlin-linkedin

It can be constantly reproduced by testStitchGetUpdateDelete test.

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/zemao/.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-log4j12/1.7.25/110cefe2df103412849d72ef7a67e4e91e4266b4/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/zemao/.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-log4j12/1.7.5/6edffc576ce104ec769d954618764f39f0f0f10d/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[2020-11-04 22:30:43,925] INFO The metric scope of adaptive tracker is Datacenter (com.github.ambry.router.NonBlockingRouterMetrics)
[2020-11-04 22:30:43,997] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:44,003] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,009] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,009] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:44,010] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,010] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,017] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:44,017] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,017] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,017] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:44,017] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,017] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,320] INFO Creating cloud blob store for partition 7 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:44,854] INFO Creating cloud blob store for partition 3 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:44,863] INFO Creating cloud blob store for partition 2 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:44,992] INFO Creating cloud blob store for partition 1 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,001] INFO Creating cloud blob store for partition 0 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,005] INFO Creating cloud blob store for partition 8 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,103] INFO Creating cloud blob store for partition 6 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,111] INFO Creating cloud blob store for partition 5 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,146] INFO Closing the router (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,147] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,149] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,149] INFO shutting down (com.github.ambry.protocol.RequestHandlerPool)
[2020-11-04 22:30:45,150] INFO shut down completely (com.github.ambry.protocol.RequestHandlerPool)
[2020-11-04 22:30:45,250] INFO The metric scope of adaptive tracker is Datacenter (com.github.ambry.router.NonBlockingRouterMetrics)
[2020-11-04 22:30:45,276] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,276] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,277] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,277] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,277] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,278] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,279] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,279] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,280] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,280] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,280] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,280] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,287] INFO Creating cloud blob store for partition 1 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,298] INFO Creating cloud blob store for partition 3 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,307] INFO Creating cloud blob store for partition 7 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,318] INFO Creating cloud blob store for partition 2 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,363] INFO Creating cloud blob store for partition 6 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,368] INFO Creating cloud blob store for partition 5 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,422] ERROR LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records: 
Created at:
	io.netty.buffer.AbstractByteBufAllocator.compositeHeapBuffer(AbstractByteBufAllocator.java:213)
	com.github.ambry.protocol.PutRequest.prepareBuffer(PutRequest.java:210)
	com.github.ambry.protocol.PutRequest.writeTo(PutRequest.java:229)
	com.github.ambry.network.LocalRequestResponseChannel.byteBufFromPayload(LocalRequestResponseChannel.java:125)
	com.github.ambry.network.LocalNetworkClient.sendAndPoll(LocalNetworkClient.java:83)
	com.github.ambry.network.CompositeNetworkClient.lambda$null$1(CompositeNetworkClient.java:97)
	java.util.concurrent.FutureTask.run(FutureTask.java:266)
	java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	java.lang.Thread.run(Thread.java:745) (io.netty.util.ResourceLeakDetector)
[2020-11-04 22:30:45,422] ERROR LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records: 
Created at:
	io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:385)
	io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
	io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
	io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:139)
	com.github.ambry.protocol.RequestOrResponse.prepareBuffer(RequestOrResponse.java:90)
	com.github.ambry.protocol.GetRequest.prepareBuffer(GetRequest.java:94)
	com.github.ambry.protocol.RequestOrResponse.writeTo(RequestOrResponse.java:98)
	com.github.ambry.network.LocalRequestResponseChannel.byteBufFromPayload(LocalRequestResponseChannel.java:125)
	com.github.ambry.network.LocalNetworkClient.sendAndPoll(LocalNetworkClient.java:83)
	com.github.ambry.network.CompositeNetworkClient.lambda$null$1(CompositeNetworkClient.java:97)
	java.util.concurrent.FutureTask.run(FutureTask.java:266)
	java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	java.lang.Thread.run(Thread.java:745) (io.netty.util.ResourceLeakDetector)
[2020-11-04 22:30:45,423] ERROR LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records: 
Created at:
	io.netty.buffer.AbstractByteBufAllocator.compositeDirectBuffer(AbstractByteBufAllocator.java:223)
	io.netty.buffer.AbstractByteBufAllocator.compositeDirectBuffer(AbstractByteBufAllocator.java:218)
	com.github.ambry.protocol.GetResponse.prepareBuffer(GetResponse.java:142)
	com.github.ambry.protocol.GetResponse.writeTo(GetResponse.java:156)
	com.github.ambry.network.LocalRequestResponseChannel.byteBufFromPayload(LocalRequestResponseChannel.java:125)
	com.github.ambry.network.LocalRequestResponseChannel.sendResponse(LocalRequestResponseChannel.java:71)
	com.github.ambry.protocol.AmbryRequests.sendGetResponse(AmbryRequests.java:774)
	com.github.ambry.protocol.AmbryRequests.handleGetRequest(AmbryRequests.java:386)
	com.github.ambry.protocol.AmbryRequests.handleRequests(AmbryRequests.java:122)
	com.github.ambry.protocol.RequestHandler.run(RequestHandler.java:47)
	java.lang.Thread.run(Thread.java:745) (io.netty.util.ResourceLeakDetector)
[2020-11-04 22:30:45,451] INFO Creating cloud blob store for partition 8 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,455] INFO Creating cloud blob store for partition 4 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,500] INFO Closing the router (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,501] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,503] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,504] INFO shutting down (com.github.ambry.protocol.RequestHandlerPool)
[2020-11-04 22:30:45,506] INFO shut down completely (com.github.ambry.protocol.RequestHandlerPool)
[2020-11-04 22:30:45,590] INFO The metric scope of adaptive tracker is Datacenter (com.github.ambry.router.NonBlockingRouterMetrics)
[2020-11-04 22:30:45,595] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,595] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,596] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,597] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,597] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,598] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,600] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,601] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,602] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,602] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,603] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,603] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,101] INFO Creating cloud blob store for partition 0 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,112] INFO Creating cloud blob store for partition 1 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,121] INFO Creating cloud blob store for partition 2 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,205] INFO Creating cloud blob store for partition 7 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,209] INFO Creating cloud blob store for partition 8 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,262] INFO Creating cloud blob store for partition 5 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,278] ERROR LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records: 
Created at:
	io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:385)
	io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
	io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
	io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:139)
	com.github.ambry.protocol.RequestOrResponse.prepareBuffer(RequestOrResponse.java:90)
	com.github.ambry.protocol.DeleteRequest.prepareBuffer(DeleteRequest.java:75)
	com.github.ambry.protocol.RequestOrResponse.writeTo(RequestOrResponse.java:98)
	com.github.ambry.network.LocalRequestResponseChannel.byteBufFromPayload(LocalRequestResponseChannel.java:125)
	com.github.ambry.network.LocalNetworkClient.sendAndPoll(LocalNetworkClient.java:83)
	com.github.ambry.network.CompositeNetworkClient.lambda$null$1(CompositeNetworkClient.java:97)
	java.util.concurrent.FutureTask.run(FutureTask.java:266)
	java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	java.lang.Thread.run(Thread.java:745) (io.netty.util.ResourceLeakDetector)
[2020-11-04 22:30:46,318] INFO Closing the router (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,318] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,319] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,321] INFO shutting down (com.github.ambry.protocol.RequestHandlerPool)
[2020-11-04 22:30:46,322] INFO shut down completely (com.github.ambry.protocol.RequestHandlerPool)
[2020-11-04 22:30:46,403] INFO The metric scope of adaptive tracker is Datacenter (com.github.ambry.router.NonBlockingRouterMetrics)
[2020-11-04 22:30:46,407] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,408] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,408] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,409] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,409] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,409] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,411] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,412] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,412] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,412] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,412] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,412] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,423] INFO Creating cloud blob store for partition 7 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,434] INFO Creating cloud blob store for partition 6 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,442] INFO Creating cloud blob store for partition 1 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,450] INFO Creating cloud blob store for partition 2 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,513] INFO Creating cloud blob store for partition 4 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,552] INFO Creating cloud blob store for partition 8 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,558] INFO Creating cloud blob store for partition 5 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,623] INFO Closing the router (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,623] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,624] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,624] INFO shutting down (com.github.ambry.protocol.RequestHandlerPool)
[2020-11-04 22:30:46,625] INFO shut down completely (com.github.ambry.protocol.RequestHandlerPool)

Process finished with exit code 0

@lightningrob lightningrob self-assigned this Dec 21, 2020
@lightningrob
Copy link
Contributor

I'm able to reproduce this by running the CloudRouterTest, but I don't yet know the reason for the leak. The ByteBuf that is allocated in RequestOrResponse::prepareBuffer ought to be released in the NonBlockingRouter::run loop on this line:
responseInfoList.forEach(ResponseInfo::release);
I will need to investigate if there are code paths where this does not happen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants