Particular configuration can cause mismatched responses #1113

patrickwhite256 · 2019-08-01T22:13:43Z

Using the ClusterClient with ReadOnly set and MaxRetries > 0 can leave the client in a seriously broken state.

In short, if a timeout or other retryable error occurs, the connection can buffer responses incorrectly, and you can end up in a state where the following code:

msg, err := client.Echo("message").Result()
log.Println("msg:", msg) // should print "msg: message"
log.Println("err:", err) // should print: "err: <nil>"

can in fact print "msg: OK" or other incorrect values.

This can lead to states where the client unknowingly returns totally incorrect but valid-looking responses with no error:

Within Redis server: <AAAA: "hello", BBBB: "world">
msg, err = client.Get("AAAA").Result() // "OK", <nil>
msg, err = client.Get("BBBB").Result() // "hello", <nil>
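The off-by-one pattern above can be reproduced with a toy model. This is only a minimal sketch and does not use go-redis: `fakeConn`, `send`, `read`, and `simulate` are hypothetical names, and it assumes the unread reply is the "OK" from a READONLY command whose read timed out during connection initialization.

```go
package main

import "fmt"

// fakeConn is a hypothetical stand-in for one pooled connection.
// Replies come back in the order commands were sent, and the client
// pairs each command with a reply purely by position.
type fakeConn struct {
	store   map[string]string
	pending []string // replies written by the server but not yet read
}

// send makes the fake server queue a reply for the command.
func (c *fakeConn) send(cmd string, args ...string) {
	switch cmd {
	case "READONLY":
		c.pending = append(c.pending, "OK")
	case "GET":
		c.pending = append(c.pending, c.store[args[0]])
	}
}

// read returns the oldest unread reply, whatever command produced it.
func (c *fakeConn) read() string {
	reply := c.pending[0]
	c.pending = c.pending[1:]
	return reply
}

// simulate returns what two GET calls observe after a lost init reply.
func simulate() (first, second string) {
	c := &fakeConn{store: map[string]string{"AAAA": "hello", "BBBB": "world"}}

	// Assumed trigger: READONLY is sent, the read times out, and the
	// connection is reused with the "OK" reply still unread.
	c.send("READONLY")

	c.send("GET", "AAAA")
	first = c.read() // actually the reply to READONLY

	c.send("GET", "BBBB")
	second = c.read() // actually the reply to GET AAAA
	return first, second
}

func main() {
	first, second := simulate()
	fmt.Printf("GET AAAA -> %q\n", first)
	fmt.Printf("GET BBBB -> %q\n", second)
}
```

Every later command on such a connection stays shifted by one reply, which is why the results look valid and carry no error.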

I've created https://github.com/patrickwhite256/goredis-bug-demo which contains a fake redis "server" and a script that can trip the bug. Because it's not a real Redis server, it's not proof positive of the bug, so I have also included a detailed explanation of what exactly needs to happen to trip the bug all the way down the stack.

I've opened #1112 to address this.


vmihailenco · 2019-08-02T08:56:57Z

Hi,

I have not done as thorough a check as you, so please forgive me if I have missed something.

But SingleConnPool is only created in a single place, to initialize the connection. If there are any errors, the connection is removed from the pool. After net.Conn is initialized, the SingleConnPool is discarded and is not used anywhere else.

So you are probably confusing newConn and baseClient.newConn.

I also briefly looked at the program, and there is no way a program that executes a single command and exits can reproduce any bugs in buffering or the connection pool. There should not be any buffers or connections left the next time you run it.

Does that make sense?

patrickwhite256 · 2019-08-02T17:10:54Z

Have you run the demo, or just looked? If you haven't tried it, I strongly encourage doing so.

That single place where SingleConnPool is used, while initializing the connection, is where the bug occurs - this is why the bug only happens when ReadOnly is enabled - the initialization is skipped otherwise. From the perspective of _getConn, there are no errors, so the connection is not removed from the pool. The README in the repository I linked has a nearly line-by-line explanation.

vmihailenco · 2019-08-04T11:42:59Z

Sorry, I've only looked at the program and the description. By the way, thanks: you've done a lot of work and I appreciate it.

I've slightly changed your fix by providing a more robust SingleConnPool implementation from https://github.com/go-pg/pg/blob/master/internal/pool/pool_single.go. Now your example reliably fails, but ideally we should also retry by opening a new connection. I will see what I can do later...

patrickwhite256 · 2019-08-05T19:41:07Z

Thanks for working with me on this. It would be helpful if you could release a new version / tag.

vmihailenco · 2019-08-08T12:00:09Z

Okay, I've tagged v7.0.0-beta, which you should be able to use with Go Modules and semantic import versioning: import "github.com/go-redis/redis/v7". See https://github.com/golang/go/wiki/Modules if you are not familiar with Go Modules.

rcurrier666 · 2019-08-12T18:01:25Z

Any chance we can get this fix backported to v6? Since we are seeing the problem in production, we can't deploy a beta version of the package. Plus, we are in the early days of converting to Go modules.

vmihailenco · 2019-08-13T14:25:48Z

Also tagged v6.15.4 with the fix for this issue.

patrickwhite256 mentioned this issue Aug 1, 2019

Close single conn connection pool #1112

Merged

vmihailenco mentioned this issue Aug 3, 2019

Add proper SingleConnPool implementation #1114

Merged

vmihailenco mentioned this issue Aug 8, 2019

Feature/retry init conn #1116

Merged

vmihailenco closed this as completed in #1116 Aug 8, 2019

vmihailenco mentioned this issue Aug 13, 2019

Port SingleConnPool fix to v6 #1124

Merged
