Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add support for Etag headers on reads #489

Merged
merged 17 commits into from Jul 8, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
48 changes: 44 additions & 4 deletions docs/generation_metageneration.rst
@@ -1,17 +1,31 @@
Conditional Requests Via Generation / Metageneration Preconditions
==================================================================
Conditional Requests Via ETag / Generation / Metageneration Preconditions
=========================================================================

Preconditions tell Cloud Storage to only perform a request if the
:ref:`generation <concept-generation>` or
:ref:`ETag <concept-etag>`, :ref:`generation <concept-generation>`, or
:ref:`metageneration <concept-metageneration>` number of the affected object
meets your precondition criteria. These checks of the generation and
meets your precondition criteria. These checks of the ETag, generation, and
metageneration numbers ensure that the object is in the expected state,
allowing you to perform safe read-modify-write updates and conditional
operations on objects

Concepts
--------

.. _concept-etag:

ETag
::::::::::::::

An ETag is returned as part of the response header whenever a resource is
returned, as well as included in the resource itself. Users should make no
assumptions about the value used in an ETag except that it changes whenever the
underlying data changes, per the
`specification <https://tools.ietf.org/html/rfc7232#section-2.3>`_

The ``ETag`` attribute is set by the GCS back-end, and is read-only in the
client library.

.. _concept-metageneration:

Metageneration
Expand Down Expand Up @@ -59,6 +73,32 @@ See also
Conditional Parameters
----------------------

.. _using-if-etag-match:

Using ``if_etag_match``
:::::::::::::::::::::::::::::

Passing the ``if_etag_match`` parameter to a method which retrieves a
blob resource (e.g.,
:meth:`Blob.reload <google.cloud.storage.blob.Blob.reload>`)
makes the operation conditional on whether the blob's current ``ETag`` matches
the given value. This parameter is not supported for modification (e.g.,
:meth:`Blob.update <google.cloud.storage.blob.Blob.update>`).


.. _using-if-etag-not-match:

Using ``if_etag_not_match``
:::::::::::::::::::::::::::::

Passing the ``if_etag_not_match`` parameter to a method which retrieves a
blob resource (e.g.,
:meth:`Blob.reload <google.cloud.storage.blob.Blob.reload>`)
makes the operation conditional on whether the blob's current ``ETag`` matches
the given value. This parameter is not supported for modification (e.g.,
:meth:`Blob.update <google.cloud.storage.blob.Blob.update>`).


.. _using-if-generation-match:

Using ``if_generation_match``
Expand Down
42 changes: 39 additions & 3 deletions google/cloud/storage/_helpers.py
Expand Up @@ -22,6 +22,7 @@
from datetime import datetime
import os

from six import string_types
from six.moves.urllib.parse import urlsplit
from google import resumable_media
from google.cloud.storage.constants import _DEFAULT_TIMEOUT
Expand All @@ -34,6 +35,12 @@

_DEFAULT_STORAGE_HOST = u"https://storage.googleapis.com"

# etag match parameters in snake case and equivalent header
_ETAG_MATCH_PARAMETERS = (
("if_etag_match", "If-Match"),
("if_etag_not_match", "If-None-Match"),
tseaver marked this conversation as resolved.
Show resolved Hide resolved
)

# generation match parameters in camel and snake cases
_GENERATION_MATCH_PARAMETERS = (
("if_generation_match", "ifGenerationMatch"),
Expand Down Expand Up @@ -147,6 +154,8 @@ def reload(
self,
client=None,
projection="noAcl",
if_etag_match=None,
daniellehanks marked this conversation as resolved.
Show resolved Hide resolved
if_etag_not_match=None,
if_generation_match=None,
if_generation_not_match=None,
if_metageneration_match=None,
Expand All @@ -168,6 +177,12 @@ def reload(
Defaults to ``'noAcl'``. Specifies the set of
properties to return.

:type if_etag_match: Union[str, Set[str]]
:param if_etag_match: (Optional) See :ref:`using-if-etag-match`

:type if_etag_not_match: Union[str, Set[str]])
:param if_etag_not_match: (Optional) See :ref:`using-if-etag-not-match`

:type if_generation_match: long
:param if_generation_match:
(Optional) See :ref:`using-if-generation-match`
Expand Down Expand Up @@ -205,10 +220,14 @@ def reload(
if_metageneration_match=if_metageneration_match,
if_metageneration_not_match=if_metageneration_not_match,
)
headers = self._encryption_headers()
_add_etag_match_headers(
headers, if_etag_match=if_etag_match, if_etag_not_match=if_etag_not_match
)
api_response = client._get_resource(
self.path,
query_params=query_params,
headers=self._encryption_headers(),
headers=headers,
timeout=timeout,
retry=retry,
_target_object=self,
Expand Down Expand Up @@ -384,8 +403,7 @@ def update(


def _scalar_property(fieldname):
"""Create a property descriptor around the :class:`_PropertyMixin` helpers.
"""
"""Create a property descriptor around the :class:`_PropertyMixin` helpers."""

def _getter(self):
"""Scalar property getter."""
Expand Down Expand Up @@ -449,6 +467,24 @@ def _convert_to_timestamp(value):
return mtime


def _add_etag_match_headers(headers, **match_parameters):
"""Add generation match parameters into the given parameters list.

:type headers: dict
:param headers: Headers dict.

:type match_parameters: dict
:param match_parameters: if*etag*match parameters to add.
"""
for snakecase_name, header_name in _ETAG_MATCH_PARAMETERS:
value = match_parameters.get(snakecase_name)

if value is not None:
if isinstance(value, string_types):
value = [value]
headers[header_name] = ", ".join(value)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me (though I'm not a spec expert) that the GCS API isn't following spec putting quotes around the etag values. E.g. If-None-Match: COKaz4vVzfECEAE= works as intended but If-None-Match: "COKaz4vVzfECEAE=" does not. That said, the response header is also not quoted as it apparently should be etag: COKaz4vVzfECEAE= vs expected etag: "COKaz4vVzfECEAE=". Referencing this and this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, maybe its just the display layers or something (using Chrome dev tools and curl). Wikipedia does the same thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

microsoft.com has it quoted 馃

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFC 7232 is the actual authoritative spec for conditional HTTP requests: If-Match and If-None-Match. The Collected ABNF appendix does indeed seem to require the double quote.

The GCS docs for the ETag header show the quotes as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, all the examples in the spec are also quoted. Fun times. Definitely out of scope for this PR. Just wanted to call it out. I would think the etag itself should have the quotes (i.e. blob.etag = '"Csdlei="'), so then this code would still function correctly. Of course getting etags quoted would be a far larger change spanning the backend, possibly ESF, and probably a fair amount of Hyrum's law.

It just stood out to me because I ran into a similar problem implementing the UI for BigQuery, which also doesn't follow the etag spec. In the end it meant we couldn't cache tables in the UI. But we also couldn't change the API without risking breaking clients. I'll leave it up to you if you want to file an internal ticket for that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Effectively we're stuck with the as-implemented back-end. @frankyn, @andrewsg, @tswast I'll leave it to y'all to open a ticket on the back-end (if one isn't there already) for the out-of-spec behavior.



def _add_generation_match_parameters(parameters, **match_parameters):
"""Add generation match parameters into the given parameters list.

Expand Down
78 changes: 78 additions & 0 deletions google/cloud/storage/blob.py
Expand Up @@ -59,6 +59,7 @@
from google.cloud._helpers import _rfc3339_nanos_to_datetime
from google.cloud._helpers import _to_bytes
from google.cloud.exceptions import NotFound
from google.cloud.storage._helpers import _add_etag_match_headers
from google.cloud.storage._helpers import _add_generation_match_parameters
from google.cloud.storage._helpers import _PropertyMixin
from google.cloud.storage._helpers import _scalar_property
Expand Down Expand Up @@ -634,6 +635,8 @@ def generate_signed_url(
def exists(
self,
client=None,
if_etag_match=None,
if_etag_not_match=None,
if_generation_match=None,
if_generation_not_match=None,
if_metageneration_match=None,
Expand All @@ -651,6 +654,14 @@ def exists(
(Optional) The client to use. If not passed, falls back to the
``client`` stored on the blob's bucket.

:type if_etag_match: Union[str, Set[str]]
:param if_etag_match:
(Optional) See :ref:`using-if-etag-match`

:type if_etag_not_match: Union[str, Set[str]]
:param if_etag_not_match:
(Optional) See :ref:`using-if-etag-not-match`

:type if_generation_match: long
:param if_generation_match:
(Optional) See :ref:`using-if-generation-match`
Expand Down Expand Up @@ -692,12 +703,19 @@ def exists(
if_metageneration_match=if_metageneration_match,
if_metageneration_not_match=if_metageneration_not_match,
)

headers = {}
_add_etag_match_headers(
headers, if_etag_match=if_etag_match, if_etag_not_match=if_etag_not_match
)

try:
# We intentionally pass `_target_object=None` since fields=name
# would limit the local properties.
client._get_resource(
self.path,
query_params=query_params,
headers=headers,
timeout=timeout,
retry=retry,
_target_object=None,
Expand Down Expand Up @@ -1002,6 +1020,8 @@ def download_to_file(
start=None,
end=None,
raw_download=False,
if_etag_match=None,
if_etag_not_match=None,
if_generation_match=None,
if_generation_not_match=None,
if_metageneration_match=None,
Expand Down Expand Up @@ -1057,6 +1077,14 @@ def download_to_file(
:param raw_download:
(Optional) If true, download the object without any expansion.

:type if_etag_match: Union[str, Set[str]]
:param if_etag_match:
(Optional) See :ref:`using-if-etag-match`

:type if_etag_not_match: Union[str, Set[str]]
:param if_etag_not_match:
(Optional) See :ref:`using-if-etag-not-match`

:type if_generation_match: long
:param if_generation_match:
(Optional) See :ref:`using-if-generation-match`
Expand Down Expand Up @@ -1121,6 +1149,8 @@ def download_to_file(
start=start,
end=end,
raw_download=raw_download,
if_etag_match=if_etag_match,
if_etag_not_match=if_etag_not_match,
if_generation_match=if_generation_match,
if_generation_not_match=if_generation_not_match,
if_metageneration_match=if_metageneration_match,
Expand All @@ -1137,6 +1167,8 @@ def download_to_filename(
start=None,
end=None,
raw_download=False,
if_etag_match=None,
if_etag_not_match=None,
if_generation_match=None,
if_generation_not_match=None,
if_metageneration_match=None,
Expand Down Expand Up @@ -1168,6 +1200,14 @@ def download_to_filename(
:param raw_download:
(Optional) If true, download the object without any expansion.

:type if_etag_match: Union[str, Set[str]]
:param if_etag_match:
(Optional) See :ref:`using-if-etag-match`

:type if_etag_not_match: Union[str, Set[str]]
:param if_etag_not_match:
(Optional) See :ref:`using-if-etag-not-match`

:type if_generation_match: long
:param if_generation_match:
(Optional) See :ref:`using-if-generation-match`
Expand Down Expand Up @@ -1233,6 +1273,8 @@ def download_to_filename(
start=start,
end=end,
raw_download=raw_download,
if_etag_match=if_etag_match,
if_etag_not_match=if_etag_not_match,
if_generation_match=if_generation_match,
if_generation_not_match=if_generation_not_match,
if_metageneration_match=if_metageneration_match,
Expand Down Expand Up @@ -1260,6 +1302,8 @@ def download_as_bytes(
start=None,
end=None,
raw_download=False,
if_etag_match=None,
if_etag_not_match=None,
if_generation_match=None,
if_generation_not_match=None,
if_metageneration_match=None,
Expand Down Expand Up @@ -1288,6 +1332,14 @@ def download_as_bytes(
:param raw_download:
(Optional) If true, download the object without any expansion.

:type if_etag_match: Union[str, Set[str]]
:param if_etag_match:
(Optional) See :ref:`using-if-etag-match`

:type if_etag_not_match: Union[str, Set[str]]
:param if_etag_not_match:
(Optional) See :ref:`using-if-etag-not-match`

:type if_generation_match: long
:param if_generation_match:
(Optional) See :ref:`using-if-generation-match`
Expand Down Expand Up @@ -1355,6 +1407,8 @@ def download_as_bytes(
start=start,
end=end,
raw_download=raw_download,
if_etag_match=if_etag_match,
if_etag_not_match=if_etag_not_match,
if_generation_match=if_generation_match,
if_generation_not_match=if_generation_not_match,
if_metageneration_match=if_metageneration_match,
Expand All @@ -1371,6 +1425,8 @@ def download_as_string(
start=None,
end=None,
raw_download=False,
if_etag_match=None,
if_etag_not_match=None,
if_generation_match=None,
if_generation_not_match=None,
if_metageneration_match=None,
Expand Down Expand Up @@ -1401,6 +1457,14 @@ def download_as_string(
:param raw_download:
(Optional) If true, download the object without any expansion.

:type if_etag_match: Union[str, Set[str]]
:param if_etag_match:
(Optional) See :ref:`using-if-etag-match`

:type if_etag_not_match: Union[str, Set[str]]
:param if_etag_not_match:
(Optional) See :ref:`using-if-etag-not-match`

:type if_generation_match: long
:param if_generation_match:
(Optional) See :ref:`using-if-generation-match`
Expand Down Expand Up @@ -1460,6 +1524,8 @@ def download_as_string(
start=start,
end=end,
raw_download=raw_download,
if_etag_match=if_etag_match,
if_etag_not_match=if_etag_not_match,
if_generation_match=if_generation_match,
if_generation_not_match=if_generation_not_match,
if_metageneration_match=if_metageneration_match,
Expand All @@ -1475,6 +1541,8 @@ def download_as_text(
end=None,
raw_download=False,
encoding=None,
if_etag_match=None,
if_etag_not_match=None,
if_generation_match=None,
if_generation_not_match=None,
if_metageneration_match=None,
Expand Down Expand Up @@ -1507,6 +1575,14 @@ def download_as_text(
downloaded bytes. Defaults to the ``charset`` param of
attr:`content_type`, or else to "utf-8".

:type if_etag_match: Union[str, Set[str]]
:param if_etag_match:
(Optional) See :ref:`using-if-etag-match`

:type if_etag_not_match: Union[str, Set[str]]
:param if_etag_not_match:
(Optional) See :ref:`using-if-etag-not-match`

:type if_generation_match: long
:param if_generation_match:
(Optional) See :ref:`using-if-generation-match`
Expand Down Expand Up @@ -1558,6 +1634,8 @@ def download_as_text(
start=start,
end=end,
raw_download=raw_download,
if_etag_match=if_etag_match,
if_etag_not_match=if_etag_not_match,
if_generation_match=if_generation_match,
if_generation_not_match=if_generation_not_match,
if_metageneration_match=if_metageneration_match,
Expand Down