
fix: correct lastChunk retry logic in BlobWriteChannel #918

Merged
merged 3 commits into master from resumable-upload-retry-fix on Jul 13, 2021

Conversation

BenWhitehead
Collaborator

Add a new method, StorageRpc#queryResumableUpload, which allows getting a shallow StorageObject for a resumable upload session that is complete.

Update BlobWriteChannel to use StorageRpc#queryResumableUpload instead of StorageRpc#get when validating the upload size of an object once it determines the upload is complete and it is on the last chunk.

If a BlobWriteChannel is opened with a conditional like IfGenerationMatch, it is not possible to simply get the object, as the object can drift generationally while the resumable upload is being performed.

Related to #839
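
For context, a minimal sketch of how such a status query is issued against the JSON API resumable session URI, using google-http-client types. This is an illustration under assumed names, not the code in this PR:

import com.google.api.client.http.EmptyContent;
import com.google.api.client.http.GenericUrl;
import com.google.api.client.http.HttpRequest;
import com.google.api.client.http.HttpResponse;
import com.google.api.client.http.javanet.NetHttpTransport;
import java.io.IOException;

// Hypothetical helper, not part of this PR.
class ResumableSessionStatus {
  static HttpResponse query(String sessionUri) throws IOException {
    HttpRequest request =
        new NetHttpTransport()
            .createRequestFactory()
            .buildPutRequest(new GenericUrl(sessionUri), new EmptyContent());
    // "bytes */*" asks the service to report the session state without sending any data.
    request.getHeaders().setContentRange("bytes */*");
    // Keep 308 (incomplete) from being surfaced as an exception; the caller inspects the code.
    request.setThrowExceptionOnExecuteError(false);
    // 200/201: upload is complete and the body carries the object resource.
    // 308: upload is incomplete; the Range header reports how many bytes are persisted.
    return request.execute();
  }
}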

@BenWhitehead BenWhitehead requested review from JesseLovelace and a team July 9, 2021 16:44
@product-auto-label product-auto-label bot added the api: storage Issues related to the googleapis/java-storage API. label Jul 9, 2021
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Jul 9, 2021
@BenWhitehead BenWhitehead requested a review from a team as a code owner July 9, 2021 16:51
// String s = response.parseAsString();
// throw new RuntimeException("kaboobm");
} else {
throw buildStorageException(response.getStatusCode(), response.getStatusMessage());
Contributor

If I understand correctly, this means that the caller would have made a request to do an upload, but they get an error from the request to check the upload status, which is a separate RPC. This might be confusing: is there a way to wrap this in an error that corresponds to the method they originally called?

Also, if the response is a 308, I would assume that instead of failing we should retry from the last offset, no?

(Ignore me if I'm just missing how this works).

Collaborator Author

Normally, when a resumable upload completes, the shallow Object is returned in the response. If that response is not received for any reason (usually a network failure), this method allows querying for that shallow Object. It is used as part of the retry evaluation logic to determine whether the final chunk must be retransmitted. The method only returns cleanly if the resumable upload returns 200; if a 308 is returned because the upload is incomplete, a StorageException is thrown here to signal that fact. In the case of BlobWriteChannel, which uses this method, the exception is caught by the retry handler and evaluated there.

I've added javadoc to the method on the interface with an explanation of why it's there.
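
To make the 200 vs. 308 behavior concrete, here is a simplified sketch (illustrative names, not the exact code in this PR) of how the status response maps to either the shallow StorageObject or a StorageException for the retry handler to evaluate:

import com.google.api.client.http.HttpResponse;
import com.google.api.services.storage.model.StorageObject;
import com.google.cloud.storage.StorageException;
import java.io.IOException;

// Hypothetical interpretation of the status-query response described above.
class ResumableSessionResult {
  static StorageObject interpret(HttpResponse response) throws IOException {
    int code = response.getStatusCode();
    if (code == 200) {
      // Upload complete: the body is the object resource (assumes the request was
      // built with a JSON ObjectParser so parseAs can deserialize it).
      return response.parseAs(StorageObject.class);
    }
    // 308 (incomplete) or anything else: signal via StorageException so the caller's
    // retry handler, e.g. in BlobWriteChannel, can evaluate it.
    throw new StorageException(code, response.getStatusMessage());
  }
}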

Contributor

Cool, the javadoc really helps, thank you.

So, in the 308 case: would we then (if in BlobWriteChannel) make another identical PUT request to query the correct offset to retry at? If so, is there a way to preserve the information from the 308 response here instead?

Collaborator Author

In the case of BlobWriteChannel, it performs a pre-check to try to determine the remote offset. If the remote offset indicates that the data has been written, but for some reason the object metadata wasn't received as part of the final PUT, it will then use this method to try to get that object metadata.

Collaborator Author

If a 308 happens, that would land us in the normal retry flow, where right now we unfortunately don't have the ability to rewind and retry a chunk to try to recover.

In practice, with BlobWriteChannel any call to this new method is guarded by a check to see whether the resumable session is complete (in fact, I would have preferred to make this check an internal implementation detail, but unfortunately BlobWriteChannel does not have a way to access the HTTP client directly in order to make this call). So, for all intents and purposes, we sidestep the need to be able to rewind a chunk.
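
Roughly, the guard described above looks like the following sketch; the interface and method names are illustrative stand-ins rather than the real StorageRpc/BlobWriteChannel code:

import com.google.api.services.storage.model.StorageObject;
import com.google.cloud.storage.StorageException;

// Hypothetical, simplified shape of the last-chunk recovery path.
class LastChunkRecovery {

  // Minimal stand-ins for the two RPCs involved.
  interface Rpc {
    long getCurrentUploadOffset(String uploadId);        // pre-check of the remote offset
    StorageObject queryResumableUpload(String uploadId); // only meaningful once complete
  }

  static StorageObject recover(Rpc rpc, String uploadId, long expectedSize) {
    long remoteOffset = rpc.getCurrentUploadOffset(uploadId);
    if (remoteOffset == expectedSize) {
      // All bytes are persisted but the final PUT's response (with the object metadata)
      // was lost; fetch the shallow StorageObject for the completed session instead.
      return rpc.queryResumableUpload(uploadId);
    }
    // Session is incomplete; fall back to the normal retry flow.
    throw new StorageException(308, "Resumable session incomplete at offset " + remoteOffset);
  }
}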

@BenWhitehead BenWhitehead merged commit ab0228c into googleapis:master Jul 13, 2021
@BenWhitehead BenWhitehead deleted the resumable-upload-retry-fix branch July 13, 2021 17:00
BenWhitehead added a commit that referenced this pull request Sep 28, 2021
* feat: Remove client side validation for lifecycle conditions (#816)

* Remove client side validation for lifecycle conditions

* fix lint and suggest updating

(cherry picked from commit 5ec84cc)

* fix: update BucketInfo translation code to properly handle lifecycle rules (#852)

Fixes #850

(cherry picked from commit 3b1df1d)

* fix: improve error detection and reporting for BlobWriteChannel retry state (#846)

Add new checks to ensure a more informative error than NullPointerException is thrown if the StorageObject or its size cannot be resolved on the last chunk.

Fixes #839

(cherry picked from commit d0f2184)

* fix: correct lastChunk retry logic in BlobWriteChannel (#918)

Add new method StorageRpc#queryResumableUpload which allows getting a shallow StorageObject for a resumable upload session which is complete.

Update BlobWriteChannel to use StorageRpc#queryResumableUpload instead of StorageRpc#get when attempting to validate the upload size of an object when it determines the upload is complete and is on the last chunk.

If a BlobWriteChannel is opened with a conditional like IfGenerationMatch it is not possible to simply get the object, as the object can drift generationally while the resumable upload is being performed.

Related to #839

(cherry picked from commit ab0228c)

* test: remove error string matching (#861)

It looks like the text for this error on the backend has changed
(sometimes) from "Precondition Failed" to "At least one of the
pre-conditions you specified did not hold". I don't think it's
really necessary to check the exact message in any case given
that we do check for a code of 412, which implies a precondition
failure. I added a check of the error Reason instead, which is more
standardized.

Fixes #853

(cherry picked from commit 146a3d3)

Co-authored-by: JesseLovelace <43148100+JesseLovelace@users.noreply.github.com>
Co-authored-by: Chris Cotter <cjcotter@google.com>