Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for context-aware connection #342

Closed
smotes opened this issue May 25, 2018 · 18 comments
Closed

Support for context-aware connection #342

smotes opened this issue May 25, 2018 · 18 comments

Comments

@smotes
Copy link

smotes commented May 25, 2018

Similar to the ConnWithTimeout interface that implements the Do/Receive methods with an extra time.Duration parameter, I think it would be useful to add in a ConnWithContext interface that takes an extra context.Context parameter. Many other Golang libraries are starting to adopt context.Context usage for cancellation since it was put into the standard library. Additionally, this proposed change shouldn't make any breaking changes to this package's public API. I am happy to implement this feature if I get the green light from whoever maintains this package - I just wanted to start up a discussion first per https://github.com/gomodule/redigo/blob/master/.github/CONTRIBUTING.md.

@smotes
Copy link
Author

smotes commented May 28, 2018

I plan to use it for handling timeouts when hitting the network with the client (in this case, when using Do).

@smotes
Copy link
Author

smotes commented May 28, 2018

I was not aware the connection was closed when a timeout error occurred, but I am seeing that locally now after some testing. Is that intended behavior in this package? Imho it seems a bit much to have to create a new connection each time one of its operations times out. And yes, the snippet you posted above is basically what I had in mind with the code additions.

@smotes
Copy link
Author

smotes commented May 28, 2018

I agree - in general it seems like the Redis client shouldn't close by default on cancellations/timeouts, but rather leave that to the calling code to handle the situation as it sees fit. It might be worth it to update the DoWithTimeout code too for the sake of consistency, assuming the context code makes it in.

Edit: The backwards compatibility of this change kind of worries me though, especially considering this package has quite a lot of users.

@smotes
Copy link
Author

smotes commented May 28, 2018

The https://github.com/juhanoi/radix client actually has a strict parameter it passes into its internal network read function which acts as a switch to determine if it should consider timeouts as critical errors or not (see https://github.com/mediocregopher/radix.v2/blob/master/redis/client.go#L178). It closes the connection when a timeout occurs and strict == true, but it seems to be disabled by default on the client's public API. It also exposes a public IsTimeout function to check an error value (see https://github.com/mediocregopher/radix.v2/blob/master/redis/resp.go#L826).

The client in the radix package is not thread-safe however, and so is intended to only be used within a single goroutine. You have to use the pool subpackage to get a new client instance from a limited pool to handle things otherwise. I figured this might be worth mentioning in case it's related to the connection sync topic you mentioned above, as I know this package handles the client thread safety via a single TCP connection guarded via mutex.

@garyburd
Copy link
Member

garyburd commented Jun 1, 2018

What is the problem you are trying to solve?

You said that you want handle timeouts. Read and write timeouts can be specified when dialing a connection and the application can override the read timeout on a per-call basis. Explain why this does not meet your needs.

It may be that adding a context argument is part of fixing the problem, but let's start with a discussion of the problem.

@smotes
Copy link
Author

smotes commented Jun 1, 2018

The problem I'm trying to solve is cancellation, where timeouts are just a special case. The context.Context interface exposes a Done() <-chan struct{} method to check when cancellation has occurred and Err() error to check the related error, but that error value doesn't necessarily have to be from a timeout (for instance, it could be via context.WithCancel after catching a SIGINT from the OS and gracefully stopping a goroutine).

@garyburd
Copy link
Member

garyburd commented Jun 1, 2018

Describe the specific problem you are trying to fix in your application. I know what context does. There's no need to repeat a description here.

@smotes
Copy link
Author

smotes commented Jun 1, 2018

I'm working with a multi-stage pipeline where stage 3 involves using Redis as a cache to detect duplicate values, but is dependent on other stages still running to continue. If another stage fails at some point during runtime, I'd like to be able to initiate early cancellation on all other stages without waiting for each of their long-running tasks to end.

@garyburd
Copy link
Member

garyburd commented Jun 1, 2018

Redis does not provide a way to cancel a command. Which Redis command is taking so long that it's a concern? Are you using one of the blocking commands?

@smotes
Copy link
Author

smotes commented Jun 1, 2018

No blocking commands, just basic stuff like GET and SETEX. It's more a matter of uniformity for me - where all other parts of my application take context.Context for cancellations, and I thought it would be a nice addition to this package's API.

@smotes smotes closed this as completed Jun 1, 2018
@smotes smotes reopened this Jun 1, 2018
@smotes
Copy link
Author

smotes commented Jun 1, 2018

Misclicked the close button when replying, sorry.

I am also setting timeouts as we discussed earlier, but currently I can already do that separately by just using the DoWithTimeout function. Rather, I was making a further argument for context.Context as it could be more generalized in my program's layout, since not all cancellations are timeouts.

@garyburd
Copy link
Member

garyburd commented Jun 1, 2018

Is there a Redis command that you need to cancel?

@garyburd
Copy link
Member

garyburd commented Jun 1, 2018

@redmondnoodle I do not recommend your proposed DoWithContext function. If the context timeout is less than the time needed to execute a command, then the connection times out and is no longer useable. If the context timeout is much greater than the expected time to execute the command, then it takes longer to detect that the server is down or the network connection is broken.

@garyburd
Copy link
Member

garyburd commented Jun 4, 2018

I am closing this issue because there's not a strong argument for a new feature. Although the application stages may be long running, individual calls to GET and SETEX are not. The application can terminate a stage without involving this package. The dial timeout options handle timeouts for GET and SETEX commands.

There's no need to change the API when there are no features to add.

@garyburd garyburd closed this as completed Jun 4, 2018
@pennersr
Copy link

@garyburd I have this use case: a goroutine is spawned waiting for something to happen by means of a "BRPOP key 0" (0 -- so no timeout). Now, as @smotes mentions above as well, I need to do a graceful shutdown after receiving SIGINT. The usual way of doing this is to use a context. Without having a context in place, how can the goroutine doing the BRPOP be properly signalled to stop what it is doing?

@garyburd
Copy link
Member

@pennersr See #351.

@pennersr
Copy link

@garyburd While the DoWithCancel() from #351 could be used to implement my use case, that approach does result in each of my Do() commands to have an extra overhead/latency for the additional Redis commands to obtain the client ID -- something that would not be necessary if there was "native" support for cancelling any ongoing command in a Redis connection.

FWIW, I have now implemented my use case by rather bluntly issuing a conn.Close() when the cancel is triggered. This aborts the ongoing Do("BRPOP"), albeit with an error:

2018/06/26 21:49:48 Error BRPOP:read tcp 172.19.0.6:58856->172.19.0.3:6379: use of closed network connection

@garyburd
Copy link
Member

Extra roundtrips to the server can be eliminated by caching the result of the CLIENT ID command in the low-level connection. I updated 351 to mention this.

Native support cannot eliminate the overhead of coordinating with a goroutine to wait on the context done channel. I proposed DoWithCancel() as a separate helper so that applications incur this extra overhead only where needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants