pubsublite: Documentation Clarification #3814
Hi Stephen, Thanks for reporting this. We will improve the documentation and samples to clarify what this means and how to handle it. The publisher implementation within the library attempts to retry publishes upon recoverable errors, so if it terminates that means a fatal error must have occurred. We expect this to be rare in operation. In this case, a new publisher object must be created via |
Thanks for the response. Just asking the below mostly so you can cover some likely questions in your docs. So, to be clear, it's not really " Once the
In my other issue #3813 those errors are coming through this flow.
Clearly the simple createPubSubLitePublisher() call is not sufficient, but it sounds like you are also saying that not every "err" that is not nil on this path is fatal, yes? How can we differentiate?
Sorry, I can see how my last comment was misleading. The library determines what is an unrecoverable error and we do not encourage user code to decide which is fatal. When the publisher decides to terminate, human intervention is normally necessary. Having said that, there are a couple of cases that user code can handle automatically:
For case 2, would it be more helpful to return a defined error, i.e. we add a
The messages are buffered internally within the |
We use PubSub and PubSubLite for tracking analytics events. As such, from our perspective our messages really originate at this service. Each message is enriched with metadata at this service, the events themselves originate outside our services, etc. If the assumption is that some other service "persists" them, I think that assumption is too broad.
We would prefer to pipe messages to another service during downtime, and quickly, e.g. to Fluentd off the machine. Until then they are only persisted in local memory, and since we've already put them into the PubSubLite client buffers, the logical move is to turn around and "extract" them from those buffers, push them into the failover service (GCS files, Fluentd, etc.), and have a service get them back out when the backend recovers. For non-fatal issues like the backend service being down, I'd rather have a maximum amount of memory to buffer in and a method to drain that buffer into a failover service. Given the rate of messages, I would like that service to be something that can go to disk, like Fluentd. Then, when the backend returns, Fluentd can be drained back into the live service. If we can rely on a capability to monitor and drain the buffers when in some state, then all of the other capabilities in GCS become available for failure and retry.
I'm spitballing a bit on the actual implementation, but I think a method to find the messages that have failed, remove them from retry and push them somewhere else, all within the context of the service, makes a lot of sense, especially in cases where the messages originate at the client, which would likely be many IoT and similar types of services. In these cases message durability is important; getting the message to PubSubLite eventually is the key.
Given our message volumes (we generate many gigs per minute), a method that lets a service ultimately push to disk is almost certainly needed. Otherwise the Golang service will OOM pretty readily unless gobs of memory are set aside to account for these relatively rare events. The typical escape valve there would be Google-FluentD or similar.
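To make the "max memory plus drain" idea above concrete, here is a minimal sketch of it in Go. Everything in it is hypothetical and invented for illustration: `sink`, `spillBuffer`, `memSink` and their methods are not part of the pubsublite library, and the sketch deliberately ignores concurrency.

```go
package main

// sink is a hypothetical failover destination (a local file, Fluentd, GCS, ...).
type sink interface {
	Write(msg []byte) error
}

// spillBuffer holds at most maxBytes of messages in memory. Anything beyond
// that budget is written to the failover sink instead of growing the heap.
type spillBuffer struct {
	maxBytes int
	used     int
	pending  [][]byte
	failover sink
}

// Add keeps msg in memory if it fits within the budget, otherwise it spills
// msg to the sink. It reports whether the message was spilled.
func (b *spillBuffer) Add(msg []byte) (spilled bool, err error) {
	if b.used+len(msg) > b.maxBytes {
		return true, b.failover.Write(msg)
	}
	b.pending = append(b.pending, msg)
	b.used += len(msg)
	return false, nil
}

// Drain moves every in-memory message to the sink, oldest first, for example
// after the publisher has terminated. On a sink error the remaining messages
// stay in memory so nothing is dropped.
func (b *spillBuffer) Drain() error {
	for len(b.pending) > 0 {
		if err := b.failover.Write(b.pending[0]); err != nil {
			return err
		}
		b.used -= len(b.pending[0])
		b.pending = b.pending[1:]
	}
	return nil
}

// memSink is an in-memory sink used only to demonstrate the flow.
type memSink struct{ got [][]byte }

func (s *memSink) Write(msg []byte) error {
	s.got = append(s.got, msg)
	return nil
}
```

In a real service the sink would write to disk or to a forwarder, and a separate job would replay its contents once the backend is reachable again.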
Thank you for explaining your use case. We can certainly make service unavailable a defined error to make it easier to detect and handle. As for providing access to the unsent, buffered messages, we will discuss this feature request internally and get back to you.
Hi Stephen, An update on this issue - I have released v0.8 of the pubsublite library. Once pkg.go.dev/cloud.google.com/go/pubsublite updates, it will have some documentation clarifications. To address backend unavailability, we have introduced a We decided not to return the unsent, buffered messages and will leave it up to clients to track and handle. For example, perhaps you can wrap the I will close this issue, but feel free to file new issues for additional feature requests and unclear documentation that you find. Thank you! |
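One possible shape for client-side tracking is sketched below. It is an illustration under assumptions, not the library's API: `result`, `asyncPublisher`, `tracker`, `fixedResult` and `fakePublisher` are hypothetical types that only model "publish is asynchronous and its outcome arrives later".

```go
package main

import (
	"errors"
	"sort"
	"sync"
)

// result is a hypothetical handle for an asynchronous publish.
// Get blocks until the outcome is known.
type result interface{ Get() error }

// asyncPublisher is a hypothetical minimal publisher, not the pubsublite API.
type asyncPublisher interface {
	Publish(msg string) result
}

// tracker wraps a publisher and remembers every message until it is confirmed.
type tracker struct {
	pub      asyncPublisher
	mu       sync.Mutex
	nextID   int
	inFlight map[int]string
}

func newTracker(p asyncPublisher) *tracker {
	return &tracker{pub: p, inFlight: make(map[int]string)}
}

// Publish records msg, sends it, and forgets it once the publish succeeds.
// The returned channel is closed after the outcome has been processed.
func (t *tracker) Publish(msg string) <-chan struct{} {
	t.mu.Lock()
	id := t.nextID
	t.nextID++
	t.inFlight[id] = msg
	t.mu.Unlock()

	res := t.pub.Publish(msg)
	done := make(chan struct{})
	go func() {
		defer close(done)
		if res.Get() == nil {
			t.mu.Lock()
			delete(t.inFlight, id)
			t.mu.Unlock()
		}
	}()
	return done
}

// Unsent returns the messages that were never confirmed, in publish order.
func (t *tracker) Unsent() []string {
	t.mu.Lock()
	defer t.mu.Unlock()
	ids := make([]int, 0, len(t.inFlight))
	for id := range t.inFlight {
		ids = append(ids, id)
	}
	sort.Ints(ids)
	out := make([]string, 0, len(ids))
	for _, id := range ids {
		out = append(out, t.inFlight[id])
	}
	return out
}

// fixedResult and fakePublisher are test doubles: messages listed in fail
// are rejected, everything else succeeds immediately.
type fixedResult struct{ err error }

func (r fixedResult) Get() error { return r.err }

type fakePublisher struct{ fail map[string]bool }

func (f fakePublisher) Publish(msg string) result {
	if f.fail[msg] {
		return fixedResult{errors.New("rejected")}
	}
	return fixedResult{}
}
```

After a publisher terminates, whatever `Unsent` returns is what the application would hand to its failover path or republish through a new publisher. At the volumes discussed in this thread, the in-memory map would itself need a bound.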
The PubSubLite documentation lacks a touch of clarity around "recreating a publisher".
What exactly does it mean: "A new publisher must be created to republish failed messages." Is that just replace the object like this being called again? Does everything just hook up once that is done?
Is there a "reset" or "resend old" command on the object? Does a new secondary publisher need to be made and hooked up to the "bucket" of unsent messages? Is it possible to get an example of what this really means for an operational app?
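To illustrate the kind of example being asked for, here is one reading of "a new publisher must be created", sketched with hypothetical types. `publisher`, `errTerminated`, `resilient` and `flaky` are invented for this illustration and are not the pubsublite API; in particular, the real client publishes asynchronously, which this synchronous model glosses over.

```go
package main

import "errors"

// publisher is a hypothetical, minimal view of a publisher client. It only
// models "publish may fail permanently".
type publisher interface {
	Publish(msg string) error
}

// errTerminated is a hypothetical marker for "this publisher has stopped for good".
var errTerminated = errors.New("publisher terminated")

// resilient owns the current publisher and knows how to build a replacement.
type resilient struct {
	newPublisher func() (publisher, error) // factory, e.g. a createPubSubLitePublisher-style helper
	current      publisher
	recreated    int
}

// Publish sends msg. If the current publisher has terminated, it is replaced
// with a fresh one and the same message is sent again, once.
func (r *resilient) Publish(msg string) error {
	if r.current == nil {
		p, err := r.newPublisher()
		if err != nil {
			return err
		}
		r.current = p
	}
	err := r.current.Publish(msg)
	if !errors.Is(err, errTerminated) {
		return err
	}
	p, nerr := r.newPublisher()
	if nerr != nil {
		return nerr
	}
	r.current = p
	r.recreated++
	return r.current.Publish(msg)
}

// flaky is a test double: it accepts limit messages, then reports termination.
type flaky struct {
	limit int
	sent  []string
}

func (f *flaky) Publish(msg string) error {
	if len(f.sent) >= f.limit {
		return errTerminated
	}
	f.sent = append(f.sent, msg)
	return nil
}
```

Note what the sketch does not do: nothing buffered inside the old publisher is carried over to the new one. Only the message the caller still holds is resent, which is exactly the gap this issue asks the documentation to spell out.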