spanner-client: Retry PDML on "Received unexpected EOS on DATA frame from server" #5209
Comments
(Not sure whether to count this as a bug or a feature request, but it probably doesn't matter much.) @skuruppu I've assigned this to you to start with, given that you've already looked at the Java - but I'm happy to take it on if you'd prefer. (I wouldn't be able to get to it for a few weeks though.)
For long-running PDML queries (>= 30mins), there's a possibility that the gRPC connection is terminated with an error "Received unexpected EOS on DATA frame from server". We now retry the same transaction on this error. Fixes googleapis#5209
I should keep the discussion in the issue instead of the PR. @thiagotnunes let me know if you want to test this PR against your test dataset or whether I should do it. I won't be able to this week but happy to try next week. In case you want to do the test, you can use the Docker container by following the instructions here.
@skuruppu I can do the testing for this. Will let you know once it completes.
@skuruppu the test completed successfully. Thanks for the fix.
* fix: retry PDML on EOS on DATA error
  For long-running PDML queries (>= 30mins), there's a possibility that the gRPC connection is terminated with an error "Received unexpected EOS on DATA frame from server". We now retry the same transaction on this error.
  Fixes #5209
* test: added unit test to verify retry on error
This bug is related to the Spanner client library.
For long-lived transactions (>= 30 minutes), as can happen with large PDML changes, the gRPC connection may be terminated with the error "Received unexpected EOS on DATA frame from server".
In this case, we need to retry the transaction, either from the resume token received while reading the stream or from scratch. This ensures that the PDML transaction continues to execute until it succeeds or a hard timeout is reached.
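The retry behaviour described above could be sketched roughly as follows. This is only an illustration in Python, not the code of any Spanner client library: `StreamError`, `execute_stream`, `run_pdml_with_retry`, the `(resume_token, row_count)` chunk shape, and the injectable `clock` are all hypothetical names made up for the example. It assumes the stream yields a resume token with some chunks and that passing `None` means "start from scratch".

```python
import time
from typing import Callable, Iterator, Optional, Tuple

EOS_MESSAGE = "Received unexpected EOS on DATA frame from server"


class StreamError(Exception):
    """Hypothetical stand-in for the gRPC error surfaced by a client."""


# Hypothetical chunk shape: (resume token or None, row count reported so far).
Chunk = Tuple[Optional[bytes], int]


def is_retryable_eos(error: Exception) -> bool:
    # Assumption: the error is recognised by its message text.
    return EOS_MESSAGE in str(error)


def run_pdml_with_retry(
    execute_stream: Callable[[Optional[bytes]], Iterator[Chunk]],
    hard_timeout_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Run a streaming PDML call, retrying on the EOS error until the deadline."""
    deadline = clock() + hard_timeout_seconds
    resume_token: Optional[bytes] = None  # None means "start from scratch"
    row_count = 0
    while True:
        try:
            for token, count in execute_stream(resume_token):
                row_count = count
                if token:
                    # Remember the latest token so a retry can resume from it.
                    resume_token = token
            return row_count
        except StreamError as error:
            # Give up on unrelated errors, or once the hard timeout has passed.
            if not is_retryable_eos(error) or clock() >= deadline:
                raise
```

If the stream fails before any token has been seen, `resume_token` is still `None`, so the next attempt restarts from scratch. Otherwise it resumes from the last token received.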
We have already implemented this change in the Java client library; for more information, see this PR: googleapis/java-spanner#360.
To test the fix, we can use a large Spanner database. Please speak to @thiagotnunes for more details.