Deadlock with multiple simultaneous requests from the same client #72

Closed

kevpar opened this issue Nov 26, 2020 · 7 comments

Comments

@kevpar
Member

kevpar commented Nov 26, 2020

There is a possible deadlock in the TTRPC server/client interactions when there are multiple simultaneous requests from the same client connection. This causes both the server and client handler goroutines to deadlock.

I've repro'd this on both Linux (with unix sockets as the transport) and Windows (with both unix sockets and named pipes as the transport). It repros more easily when the transport has less buffering, and when there are more goroutines sending requests concurrently from the client.

I intend to look into how this can be fixed, but filing an issue for awareness and in case someone else wants to tackle it in the meantime. :)

Stacks

Server

goroutine 138 [IO wait]:
internal/poll.runtime_pollWait(0x7f63e0790f08, 0x77, 0xffffffffffffffff)
	/go1.14.3/src/runtime/netpoll.go:203 +0x55
internal/poll.(*pollDesc).wait(0xc00018c198, 0x77, 0x31000, 0x31012, 0xffffffffffffffff)
	/go1.14.3/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitWrite(...)
	/go1.14.3/src/internal/poll/fd_poll_runtime.go:96
internal/poll.(*FD).Write(0xc00018c180, 0xc0003e2ff6, 0x31012, 0x31012, 0x0, 0x0, 0x0)
	/go1.14.3/src/internal/poll/fd_unix.go:276 +0x290
net.(*netFD).Write(0xc00018c180, 0xc0003e2ff6, 0x31012, 0x31012, 0x1000, 0x0, 0x0)
	/go1.14.3/src/net/fd_unix.go:220 +0x4f
net.(*conn).Write(0xc000010018, 0xc0003e2ff6, 0x31012, 0x31012, 0x0, 0x0, 0x0)
	/go1.14.3/src/net/net.go:196 +0x8e
bufio.(*Writer).Write(0xc000090000, 0xc0003e2ff6, 0x31012, 0x31012, 0xa, 0xf00032008, 0x2)
	/go1.14.3/src/bufio/bufio.go:623 +0x13b
github.com/containerd/ttrpc.(*channel).send(0xc000090100, 0x20000000f, 0xc0003e2000, 0x32008, 0x32008, 0x0, 0x0)
	/go/pkg/mod/github.com/containerd/ttrpc@v1.0.2/channel.go:127 +0xbc
github.com/containerd/ttrpc.(*serverConn).run(0xc000094230, 0x6bf120, 0xc0000a2010)
	/go/pkg/mod/github.com/containerd/ttrpc@v1.0.2/server.go:459 +0x64b
created by github.com/containerd/ttrpc.(*Server).Serve
	/go/pkg/mod/github.com/containerd/ttrpc@v1.0.2/server.go:127 +0x288

goroutine 139 [select]:
github.com/containerd/ttrpc.(*serverConn).run.func1(0xc0001000c0, 0xc000094230, 0xc000100180, 0xc000090100, 0xc000100120, 0xc000076540)
	/go/pkg/mod/github.com/containerd/ttrpc@v1.0.2/server.go:404 +0x69b
created by github.com/containerd/ttrpc.(*serverConn).run
	/go/pkg/mod/github.com/containerd/ttrpc@v1.0.2/server.go:332 +0x2c0

Client

goroutine 19 [IO wait]:
internal/poll.runtime_pollWait(0x7fd97990df18, 0x77, 0xffffffffffffffff)
	/go1.14.3/src/runtime/netpoll.go:203 +0x55
internal/poll.(*pollDesc).wait(0xc0000f6098, 0x77, 0x31000, 0x31021, 0xffffffffffffffff)
	/go1.14.3/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitWrite(...)
	/go1.14.3/src/internal/poll/fd_poll_runtime.go:96
internal/poll.(*FD).Write(0xc0000f6080, 0xc0009ecff6, 0x31021, 0x31021, 0x0, 0x0, 0x0)
	/go1.14.3/src/internal/poll/fd_unix.go:276 +0x290
net.(*netFD).Write(0xc0000f6080, 0xc0009ecff6, 0x31021, 0x31021, 0x1000, 0x0, 0x0)
	/go1.14.3/src/net/fd_unix.go:220 +0x4f
net.(*conn).Write(0xc0000a8028, 0xc0009ecff6, 0x31021, 0x31021, 0x0, 0x0, 0x0)
	/go1.14.3/src/net/net.go:196 +0x8e
bufio.(*Writer).Write(0xc0000f8040, 0xc0009ecff6, 0x31021, 0x31021, 0xa, 0x1500032017, 0x1)
	/go1.14.3/src/bufio/bufio.go:623 +0x13b
github.com/containerd/ttrpc.(*channel).send(0xc0000f8080, 0x100000015, 0xc0009ec000, 0x32017, 0x32017, 0x0, 0x0)
	/go/pkg/mod/github.com/containerd/ttrpc@v1.0.2/channel.go:127 +0xbc
github.com/containerd/ttrpc.(*Client).send(0xc0000f6280, 0x100000015, 0x63b500, 0xc000022480, 0x1, 0x0)
	/go/pkg/mod/github.com/containerd/ttrpc@v1.0.2/client.go:324 +0x86
github.com/containerd/ttrpc.(*Client).run(0xc0000f6280)
	/go/pkg/mod/github.com/containerd/ttrpc@v1.0.2/client.go:273 +0x5ab
created by github.com/containerd/ttrpc.NewClient
	/go/pkg/mod/github.com/containerd/ttrpc@v1.0.2/client.go:94 +0x1bd

goroutine 6 [select]:
github.com/containerd/ttrpc.(*receiver).run(0xc00000e080, 0x6bf0e0, 0xc0000f8000, 0xc0000f8080)
	/go/pkg/mod/github.com/containerd/ttrpc@v1.0.2/client.go:222 +0x241
created by github.com/containerd/ttrpc.(*Client).run
	/go/pkg/mod/github.com/containerd/ttrpc@v1.0.2/client.go:262 +0x1fa

Analysis

Basically, the server has a "receiver" goroutine that receives new requests from the transport, and sends a message via channel to the "worker" goroutine. The "worker" goroutine has a loop and a select to handle either a new request message, or a response that needs to be sent to the client. When the deadlock occurs, the server is stuck blocking on a response write to the transport from the "worker" goroutine, while the "receiver" goroutine is stuck trying to send a request message to the "worker" goroutine.

The client side is basically the inverse of this, where the "receiver" goroutine is stuck trying to send a response message received on the transport to the "worker" goroutine via channel. The "worker" goroutine is likewise stuck trying to send a new request to the server via the transport.

This looks like it should only occur when the connection is busy enough that the transport buffer is filled, as otherwise the server and client writes to the transport would simply be fulfilled by the buffer, and would not block waiting for a reader on the other end.

The interesting places in the code where the 4 goroutines are stuck are linked below:
Server receiver sending message to worker: https://github.com/containerd/ttrpc/blob/v1.0.2/server.go#L404
Server worker writing response to transport: https://github.com/containerd/ttrpc/blob/v1.0.2/server.go#L459
Client receiver sending message to worker: https://github.com/containerd/ttrpc/blob/v1.0.2/client.go#L222
Client worker writing request to transport: https://github.com/containerd/ttrpc/blob/v1.0.2/client.go#L273
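
To make the structure concrete, here is a minimal sketch of the server-side pattern described above. It is not the actual ttrpc code (framing, contexts, and error handling are omitted); it only shows the shape of a receiver goroutine coupled to a worker over an unbuffered channel:

```go
package ttrpcsketch

import "net"

type request struct{ payload []byte }

// serveConn sketches the coupled receiver/worker shape: the receiver feeds
// requests to the worker over an unbuffered channel, and the same worker also
// performs blocking writes to the connection. If the peer stops draining the
// connection, the Write blocks, the channel send blocks, and nothing reads
// from the connection anymore.
func serveConn(conn net.Conn) {
	requests := make(chan request) // unbuffered: a rendezvous with the worker
	responses := make(chan []byte)

	// Receiver: reads data off the wire and hands it to the worker.
	go func() {
		buf := make([]byte, 64*1024)
		for {
			n, err := conn.Read(buf)
			if err != nil {
				return
			}
			// Blocks while the worker is stuck writing a response below.
			requests <- request{payload: append([]byte(nil), buf[:n]...)}
		}
	}()

	// Worker: dispatches requests and writes responses back to the client.
	for {
		select {
		case req := <-requests:
			go func(r request) {
				// A handler would run here; just echo the payload back.
				responses <- r.payload
			}(req)
		case resp := <-responses:
			// Blocks once the transport buffer is full and the client is
			// itself stuck writing a request instead of reading responses.
			if _, err := conn.Write(resp); err != nil {
				return
			}
		}
	}
}
```

The client side has the same shape with the roles reversed, which is why both ends can end up blocked on a write at the same time.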

Sample

I have a repro program here. This program can be run as either a server (go run . server) or a client (go run . client). The server implements a very simple TTRPC server that listens for connections. The client spawns multiple goroutines that constantly send requests to the server and print their ID each time they get a response. Each request/response has a bunch of junk data added to the message to try to avoid the effects of buffering on the underlying transport. When run, you will generally see a little bit of output from the client, but it will stop once the deadlock occurs. You can also hit enter on either the server or client to make them dump their current goroutine stacks to a file.
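
As a rough illustration of the client side of that repro, here is a hedged sketch; the echoClient interface, payload size, and goroutine count are hypothetical stand-ins for whatever the actual repro program defines:

```go
package reprosketch

import (
	"context"
	"fmt"
	"sync"
)

// echoClient stands in for a generated ttrpc client stub; the real repro uses
// an actual ttrpc service, whose definition is not shown here.
type echoClient interface {
	Echo(ctx context.Context, payload []byte) ([]byte, error)
}

// spam launches n goroutines that call the server in a tight loop with a large
// payload, printing their ID on every response. Output stopping entirely is
// the signal that the deadlock has occurred.
func spam(ctx context.Context, c echoClient, n int) {
	junk := make([]byte, 200*1024) // junk data to defeat transport buffering
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for {
				if _, err := c.Echo(ctx, junk); err != nil {
					fmt.Printf("goroutine %d: %v\n", id, err)
					return
				}
				fmt.Println(id)
			}
		}(i)
	}
	wg.Wait()
}
```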

@kevpar
Member Author

kevpar commented Nov 26, 2020

We encountered this with our containerd shim on Windows, where we use a single shim for every container in a pod. It was hit when publishing multiple events back to containerd via TTRPC simultaneously.

kevpar added a commit to kevpar/cri that referenced this issue Dec 7, 2020
This pulls in a new version of github.com/containerd/ttrpc from a fork
to fix the deadlock issue in containerd/ttrpc#72. Will revert back to
the upstream ttrpc vendor once the fix is merged (containerd/ttrpc#73).

Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
kevpar added a commit to kevpar/cri that referenced this issue Dec 7, 2020
This pulls in a new version of github.com/containerd/ttrpc from a fork
to fix the deadlock issue in containerd/ttrpc#72. Will revert back to
the upstream ttrpc vendor once the fix is merged (containerd/ttrpc#73).

This fix also included some vendoring cleanup from running "vndr".

Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
kevpar added a commit to kevpar/ttrpc that referenced this issue Oct 13, 2021
Changes the TTRPC client logic so that sending and receiving with the
server are in completely independent goroutines, with shared state
guarded by a mutex. Previously, sending/receiving were tied together by
reliance on a coordinator goroutine. This led to issues where if the
server was not reading from the connection, the client could get stuck
sending a request, causing the client to not read responses from the
server. See [1] for more details.

The new design sets up separate sending/receiving goroutines. These
share state in the form of the set of active calls that have been made
to the server. This state is encapsulated in the callMap type and access
is guarded by a mutex.

The main event loop in `run` previously handled a lot of state
management for the client. Now that most state is tracked by the
callMap, it mostly exists to notice when the client is closed and take
appropriate action to clean up.

[1] containerd#72

Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
kevpar added a commit to kevpar/ttrpc that referenced this issue Oct 14, 2021
Changes the TTRPC client logic so that sending and receiving with the
server are in completely independent goroutines, with shared state
guarded by a mutex. Previously, sending/receiving were tied together by
reliance on a coordinator goroutine. This led to issues where if the
server was not reading from the connection, the client could get stuck
sending a request, causing the client to not read responses from the
server. See [1] for more details.

The new design sets up separate sending/receiving goroutines. These
share state in the form of the set of active calls that have been made
to the server. This state is encapsulated in the callMap type and access
is guarded by a mutex.

The main event loop in `run` previously handled a lot of state
management for the client. Now that most state is tracked by the
callMap, it mostly exists to notice when the client is closed and take
appropriate action to clean up.

Also did some minor code cleanup. For instance, the code was previously
written to support multiple receiver goroutines, though this was not
actually used. I've removed this for now, since the code is simpler this
way, and it's easy to add back if we actually need it in the future.

[1] containerd#72

Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
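
The commit message above describes a mutex-guarded map of in-flight calls shared between the sending and receiving goroutines. The sketch below only illustrates that idea and does not mirror the real callMap type in the ttrpc client:

```go
package callmapsketch

import (
	"errors"
	"sync"
)

type response struct {
	payload []byte
	err     error
}

// activeCall tracks one outstanding request waiting for its response. The done
// channel must be created with capacity 1 so deliveries never block.
type activeCall struct {
	done chan response
}

// callMap is the shared state between the sender and receiver goroutines.
type callMap struct {
	mu     sync.Mutex
	calls  map[uint32]*activeCall
	closed bool
}

func newCallMap() *callMap {
	return &callMap{calls: make(map[uint32]*activeCall)}
}

// set registers a new in-flight call before its request is written.
func (m *callMap) set(id uint32, c *activeCall) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.closed {
		return errors.New("client closed")
	}
	m.calls[id] = c
	return nil
}

// get removes and returns the call matching a received response.
func (m *callMap) get(id uint32) (*activeCall, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	c, ok := m.calls[id]
	if ok {
		delete(m.calls, id)
	}
	return c, ok
}

// closeWithError fails every outstanding call, e.g. when the connection drops.
func (m *callMap) closeWithError(err error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	for id, c := range m.calls {
		c.done <- response{err: err}
		delete(m.calls, id)
	}
	m.closed = true
}
```
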
kzys added a commit to kzys/firecracker-containerd that referenced this issue Apr 29, 2022
TestMultipleVMs_Isolated is still unstable (see firecracker-microvm#581). Apparently having
multiple simultaneous requests from the same client is known to be
problematic (see containerd/ttrpc#72).

This commit workarounds the issue by making a client per VM.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
@kzys
Member

kzys commented Apr 30, 2022

Even after #94, sending multiple simultaneous requests from the same client seems problematic. I will port @kevpar's repro (or a reduced version of firecracker-containerd's problematic test) to this repository.

@dcantah
Member

dcantah commented Jan 29, 2024

What's the status on this?

@jsturtevant

jsturtevant commented Mar 12, 2024

Was this resolved with #94?

I see @kzys's comment now 🤦 #72 (comment)

@kevpar
Member Author

kevpar commented May 11, 2024

I recently had reason to dig into this again. Originally, the deadlock issue was fixed via #94.

  • The server used a central goroutine which dispatched requests to handlers, and sent responses back to the client. There was a separate goroutine which read requests from the pipe and sent them to the central goroutine. This meant that if the client did not read its responses from the pipe, the server would block and not read new requests. This was reasonable as a way to put back-pressure on a client that was not behaving properly.
  • The client was fixed in #94 (client: Handle sending/receiving in separate goroutines) so that it still read and handled responses even when writing new requests to the pipe was blocked.

Together, these two behaviors work fine, and there are no deadlocks (that I know of).

Then, everything changed when #107 was merged.

  • Now, the server reads requests and dispatches them to handlers in one goroutine, and hands responses off to another goroutine to send them back to the client. This means that if the client does not read its responses from the pipe, the server will keep accepting new requests. My view is that the back-pressure approach is better, but that's tangential to this issue.
  • The change to introduce streaming into the client effectively reverted the fix in #94 (client: Handle sending/receiving in separate goroutines). Now, we are back to a client implementation where blocking on writing requests to the pipe can prevent the client from continuing to read responses from the pipe.

With #107, there are no deadlocks as long as both client and server are using post-#107 versions.

However, if the server is pre-#107 and the client is post-#107 (or pre-#94), there is once again a chance of deadlock. More broadly, a robust client should not rely on specific server implementation details to avoid deadlocking. We must ensure the client can never block receiving responses if request writing is blocked on the pipe.

The specifics leading to the deadlock are as follows: There is shared state in the client between sending requests and receiving responses. This state is streams, a map of stream ID to stream object. Access to this state is guarded by the mutex streamLock. Access to write a request to the pipe is likewise guarded by the mutex sendLock. When sending a request, the following events take place:

  1. First streamLock is taken so that a stream object can be set up for the request.
  2. Then, sendLock is taken.
  3. streamLock is released.
  4. Finally, the request is written to the pipe.

https://github.com/containerd/ttrpc/blob/main/client.go#L405-L407

The reason for the deadlock is the ordering of steps 2 and 3 above. Because sendLock is taken before streamLock is released, you can have the following situation:

  1. Goroutine 1 is sending a request: it holds sendLock, has released streamLock, and is blocked writing the request to the pipe.
  2. Goroutine 2 starts sending another request: it takes streamLock, then blocks waiting for sendLock, which Goroutine 1 still holds.
  3. The client's receiver goroutine needs streamLock to match incoming responses to their streams, so it also blocks, and the client stops reading responses from the pipe.

At this point, the only way they can make progress is for Goroutine 1's write to the pipe to complete. But as we said above, it's reasonable for the server to not keep reading requests if the client is not reading responses quickly enough!

Luckily, the fix is straightforward. We just need to flip steps 2 and 3, so that we release streamLock before we take sendLock.
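
To make the ordering concrete, here is a simplified sketch of both versions. The lock names follow the client code (streamLock guards the streams map, sendLock guards writes to the connection), but everything else is a stand-in rather than the real implementation:

```go
package locksketch

import "sync"

type stream struct{}

type client struct {
	streamLock sync.Mutex
	streams    map[uint32]*stream
	sendLock   sync.Mutex
	// connection and framing omitted
}

func newClient() *client {
	return &client{streams: make(map[uint32]*stream)}
}

// sendDeadlockProne mirrors the problematic ordering: sendLock is acquired
// while streamLock is still held, so a goroutine blocked on the pipe write
// (holding sendLock) leaves any other sender stuck holding streamLock, which
// in turn stops the receiver goroutine from processing responses.
func (c *client) sendDeadlockProne(id uint32, frame []byte) error {
	c.streamLock.Lock()
	c.streams[id] = &stream{} // 1. set up the stream object
	c.sendLock.Lock()         // 2. take sendLock while still holding streamLock
	c.streamLock.Unlock()     // 3. only now release streamLock
	defer c.sendLock.Unlock()
	return c.write(frame) // 4. blocking write to the pipe
}

// sendFixed flips steps 2 and 3: streamLock is released before sendLock is
// taken, so a blocked write can no longer pin the streams map.
func (c *client) sendFixed(id uint32, frame []byte) error {
	c.streamLock.Lock()
	c.streams[id] = &stream{} // 1. set up the stream object
	c.streamLock.Unlock()     // 2. release streamLock first
	c.sendLock.Lock()         // 3. then take sendLock
	defer c.sendLock.Unlock()
	return c.write(frame) // 4. blocking write to the pipe
}

// write stands in for writing a framed message to the connection.
func (c *client) write(frame []byte) error {
	_ = frame
	return nil
}
```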

I have repro'd the hang with a simple ttrpc client and server. The client spins up 100 goroutines that constantly spam the server with requests. After a few seconds of running, I can see it hang. I have set the buffer size for the pipe to 0 to make the repro easier, but it would still be possible to hit with a larger buffer size (it may just take a higher volume of requests or larger payloads).
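
For anyone trying to reproduce this on Linux with a unix socket transport, one way to shrink transport buffering (analogous to, but not the same as, setting the pipe buffer size to 0 above) is to request a tiny socket send buffer. The socket path here is a hypothetical example, and the kernel still enforces a minimum buffer size:

```go
package reprosketch

import "net"

// dialSmallBuffer dials the repro server and asks the kernel for a very small
// send buffer so writes block sooner, making the hang easier to hit. The
// kernel enforces a minimum, so buffering is reduced rather than eliminated.
func dialSmallBuffer(sockPath string) (net.Conn, error) {
	conn, err := net.Dial("unix", sockPath)
	if err != nil {
		return nil, err
	}
	if uc, ok := conn.(*net.UnixConn); ok {
		if err := uc.SetWriteBuffer(1); err != nil {
			conn.Close()
			return nil, err
		}
	}
	return conn, nil
}
```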

@kiashok FYI

@kevpar
Member Author

kevpar commented May 11, 2024

I have published #168 with a fix for this.

@kevpar
Member Author

kevpar commented May 15, 2024

This is now fixed in v1.2.4.

@kevpar kevpar closed this as completed May 15, 2024