Support ETag headers #451
Comments
My team is nearly at the point where we need this. We would love to volunteer a contribution if we could get a design/owner partner on the Google side. Is that something you all would consider? TIA @tseaver
Thanks for the proposal @daniellehanks (and sorry for the slow response). We don't support these via most of the other GCS client libraries and I am following up with the service team to determine if there's a good reason for that. Will update you soon when I have more information. |
@daniellehanks Can you clarify the usecase, here? Do you want to pass an etag (a list, for What would you expect to happen in the case that there was no different ETag? |
Can it just behave exactly like the meta/generation parameters? Those raise As for my specific use case, it would likely look something like:
Without the etag functionality:
As I went through the exercise of writing this prototype, it looks like calling |
Thanks for the example, and apologies for the slow response. As you note from your example, this is only going to save you network calls if you are able to store the ETag client-side and will have it on hand when you attempt to download the file. Will this be possible for your use case? Otherwise, I don't see how this will be useful at all -- obviously the ETag (and generation number) you get from Assuming this is the case, I discussed with the team and we are open to accepting a PR that adds an Alternatively, you can also potentially store the generation number locally when the object is first created or accessed, and follow the same pattern with that using the existing |
I'm not storing the etags in my backend. The client would be caching them and using them to make conditional requests to my API. My backend is basically just a middle man abstracting the storage layer (and the GCS-specific concept of generations in favor of etags). Is my understanding correct that there is no way to get the etag from GCS via a single call to one of the If my above understanding is correct (that the GCS APIs, not just the client libraries, don't support getting the file content and metadata in a single call), then seems like implementing the etag functionality doesn't actually buy me anything in terms of number of network calls. I guess I had just assumed (incorrectly) that the metadata on the blob would be populated on a call to |
One more question to clarify: Is a conditional call to download that results in a 304 or 412 still billed as a class B operation? If they are not billed, then adding the etag functionality could save us a lot of billed operations. |
Ah, so, if I understand correctly:
Is that accurate? I think I was also confused about the blob data getting updated from the As far as your billing question, I would have to do some research to follow up on this, but I would suspect that you are still charged for the class B operation (but not for the data egress that a successful request would entail obviously). |
Yes, your summary of what I am doing is correct. I found on the billing page that 307, 4XX, and 5XX are generally not billed, but that doesn't include 304 so I would assume same as you that the operation is billed but no network egress. I wasn't able to quickly curl GCS in between meetings today because auth is hard :). Glad you were able to verify the metadata is in the headers on alt=media requests. Can I amend that to my feature request? :D. If etag (and whatever else is available in the headers that already has an existing property on the blob class that can be easily mapped) could be populated on the download call, and conditional etag support were added to the download methods, that would reduce a call in the case of a not-match as well as probably simplify the code a decent amount. If it's not too much work (like ~<500 LOC or a weekend), I'd love to contribute. |
I think there is merit to just updating the metadata on download all by itself. Getting metadata via a separate network request is prone to concurrent modification errors/race conditions. I.e. if I have to download the file and get the etag as two separate requests, I'm not guaranteed that the etag actually belongs to the file contents I got as the file could have been overwritten between calls. |
Sounds good! Yeah I think it would make sense to start with a PR to add the metadata available from headers on download to the blob. Then we can do a second PR with When I do a GET in the JSON API with
Side note, I recommend https://developers.google.com/oauthplayground/ for trying out requests, it handles the auth complications for you which is very nice! |
FWIW, |
Ahh, thanks for the pointer @tseaver. So actually this is pretty straightforward-- we should extract all of generation, metageneration, and etag there as well. |
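For illustration, here is a rough sketch of what extracting those three values from a download response's headers could look like. It is not the library's actual code: the function name `metadata_from_headers` is hypothetical, and the header names `X-Goog-Generation` and `X-Goog-Metageneration` are assumed to be the ones GCS returns on `alt=media` responses.

```python
from typing import Dict, Mapping, Union


def metadata_from_headers(headers: Mapping[str, str]) -> Dict[str, Union[str, int]]:
    """Collect etag, generation, and metageneration from response headers.

    Hypothetical helper; the x-goog-* header names are an assumption.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    props: Dict[str, Union[str, int]] = {}
    if "etag" in lowered:
        # Keep the ETag verbatim (quotes included) so it can be echoed back.
        props["etag"] = lowered["etag"]

    # Generation numbers arrive as decimal strings; store them as ints.
    for header, prop in (
        ("x-goog-generation", "generation"),
        ("x-goog-metageneration", "metageneration"),
    ):
        if header in lowered:
            props[prop] = int(lowered[header])
    return props
```

Because the values come from the same response as the content, they describe exactly the bytes that were downloaded, which avoids the race between a separate metadata request and the download.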
@tritone Yup. |
@tritone, @daniellehanks Note that an |
@tritone I don't want to use generation as that is GCS specific. Like I said, my backend is just a middle man and my mobile client is the one caching the result and I want that to be implementation detail (GCS) agnostic, i.e. use standard etag for caching. Starting with the PR to pipe through the headers on download calls sounds good. I'll try to find some time to work on that this week or next. |
@tseaver the generation match parameters only take a single value, not a sequence. Seems like the two should behave consistently? |
@daniellehanks RFC 7232 specifies the semantics of the
It looks like the if-match/if-not-match conditions for generations in the JSON API are implemented as query parameters and only seem to allow a single value (per the documentation). For etags I will align with the spec and accept a sequence. Any pointers on where to dive into the code to add the etag conditions? Or should I just search for generation and add parallel handling for etags?
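A minimal sketch of accepting either a single etag or a sequence and turning it into request headers might look like the following. The function and parameter names (`etag_condition_headers`, `if_etag_match`, `if_etag_not_match`) are assumptions chosen to parallel the existing generation parameters, not a confirmed API.

```python
from typing import Dict, Optional, Sequence, Union

ETags = Union[str, Sequence[str]]


def etag_condition_headers(
    if_etag_match: Optional[ETags] = None,
    if_etag_not_match: Optional[ETags] = None,
) -> Dict[str, str]:
    """Build If-Match / If-None-Match headers from one or more etags.

    Hypothetical helper. Values are passed through verbatim, so callers
    supply entity tags exactly as the server returned them (quotes included).
    """
    headers: Dict[str, str] = {}
    for name, value in (
        ("If-Match", if_etag_match),
        ("If-None-Match", if_etag_not_match),
    ):
        if value is None:
            continue
        # A bare string is treated as a one-element list.
        if isinstance(value, str):
            value = [value]
        # RFC 7232 lets a single header carry a comma-separated list of tags.
        headers[name] = ", ".join(value)
    return headers
```

Unlike the generation conditions, these travel as headers rather than query parameters, so they would need to be merged into whatever headers the request already carries.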
Please correct me if I'm wrong, but it appears the upload APIs don't support if-match/if-none-match headers. Since the scope of this request was for download functionality, this doesn't inhibit my use case. Just something to be aware of that etags will not fully parallel generations. |
Draft PR #489 for conditional reads on etag. I just need to add the system test(s) and then I will remove draft status. |
@daniellehanks, @tritone I split out the missing header->property population as #490. |
Support conditional requests based on ETag for read operations (`reload`, `exists`, `download_*`). My own testing seems to indicate that the JSON API does not support ETag If-Match/If-None-Match headers on modify requests (`patch`, `delete`, etc.). Please correct me if I am mistaken. This is part two of #451. Part one is in #488. Fixes #451 🦕
Split from this issue.
Requesting support for `If-Match`/`If-None-Match` headers for GCS via the Python client library.

My use case:
My API is a thin layer over GCS (for storing profile images), and I only want to actually read the image data and serve it to the client if the actual content is different from their cache. They will specify an `If-None-Match` header to my API (as they won't know anything about GCS or generation numbers, kind of the point), so I just need to essentially pass this along to GCS. I don't want to either do the extra round tripping of reading the current file to check the ETag and then use the generation number from that in a subsequent request, or not use the client library.
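To make the desired behaviour concrete, here is a small sketch of the `If-None-Match` check the storage layer would be asked to perform on the API's behalf, following the weak-comparison rule in RFC 7232. The function names are hypothetical, and the comma split is a simplification that assumes the tags themselves contain no commas.

```python
from typing import Optional


def _opaque(tag: str) -> str:
    """Drop the weak-validator prefix (W/) so tags compare weakly."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def status_for_conditional_get(if_none_match: Optional[str], current_etag: str) -> int:
    """Return 304 if the client's cached copy is current, else 200.

    Hypothetical helper illustrating If-None-Match evaluation for a GET.
    """
    if not if_none_match:
        # No condition supplied: always send the full representation.
        return 200
    if if_none_match.strip() == "*":
        # "*" matches any existing representation.
        return 304
    # Simplified parsing: assumes no commas inside the quoted tags.
    cached = {_opaque(tag) for tag in if_none_match.split(",")}
    return 304 if _opaque(current_etag) in cached else 200
```

With conditional support in the client library, the API layer would forward the client's header and translate a not-modified result into its own 304, without ever reading the image bytes.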