
fix: replace default retry for upload operations #480

Merged
merged 11 commits on Jun 30, 2021
51 changes: 25 additions & 26 deletions google/cloud/storage/blob.py
Expand Up @@ -82,7 +82,6 @@
from google.cloud.storage.retry import DEFAULT_RETRY
from google.cloud.storage.retry import DEFAULT_RETRY_IF_ETAG_IN_JSON
from google.cloud.storage.retry import DEFAULT_RETRY_IF_GENERATION_SPECIFIED
from google.cloud.storage.retry import DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED
from google.cloud.storage.fileio import BlobReader
from google.cloud.storage.fileio import BlobWriter

Expand Down Expand Up @@ -1703,10 +1702,10 @@ def _do_multipart_upload(
:type num_retries: int
:param num_retries:
Number of upload retries. By default, only uploads with
if_metageneration_match set will be retried, as uploads without the
if_generation_match set will be retried, as uploads without the
argument are not guaranteed to be idempotent. Setting num_retries
will override this default behavior and guarantee retries even when
if_metageneration_match is not set. (Deprecated: This argument
if_generation_match is not set. (Deprecated: This argument
will be removed in a future release.)

:type predefined_acl: str
Expand Down Expand Up @@ -1750,7 +1749,7 @@ def _do_multipart_upload(

This private method does not accept ConditionalRetryPolicy values
because the information necessary to evaluate the policy is instead
evaluated in client.download_blob_to_file().
evaluated in blob._do_upload().
Contributor
Oh, it looks like we're evaluating it BOTH in blob._do_upload() and in client.download_blob_to_file(). This is harmless, which is why the tests didn't catch it, but we only need one of those.

I don't have a strong opinion on whether we do this in _do_upload or download_blob_to_file. Let's keep this docstring change and additionally we can remove the conditional retry policy code in client.download_blob_to_file, either in another PR, or in this one if you prefer.

Contributor Author

IIUC, I believe we'll need to retain both ConditionalRetryPolicy evaluations, as one is for upload operations and the other is for downloads. I revised the docstrings here to match where we're doing the evaluations for upload operations, blob._do_upload().
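For context, here is a minimal self-contained sketch of how a conditional retry policy resolves to a concrete retry only when the guarding argument is present. It is loosely modeled on `google.cloud.storage.retry.ConditionalRetryPolicy`; the names and signatures below are illustrative assumptions, not the library's exact API.

```python
# Sketch of a conditional retry policy: the wrapped retry is applied only
# when a predicate over selected call kwargs holds.  Illustrative names.

class ConditionalRetryPolicy:
    """Wrap a retry object and apply it only when a predicate holds."""

    def __init__(self, retry, predicate, required_kwargs):
        self.retry = retry
        self.predicate = predicate
        self.required_kwargs = required_kwargs

    def get_retry_policy_if_conditions_met(self, **kwargs):
        # Yield the wrapped retry only when every required kwarg satisfies
        # the predicate; otherwise the call runs without retries.
        if self.predicate(*(kwargs.get(name) for name in self.required_kwargs)):
            return self.retry
        return None


DEFAULT_RETRY = object()  # stand-in for google.api_core.retry.Retry

# Uploads are only safe to retry when if_generation_match pins the object
# generation, which makes the request idempotent.
policy = ConditionalRetryPolicy(
    DEFAULT_RETRY, lambda gen: gen is not None, ["if_generation_match"]
)

print(policy.get_retry_policy_if_conditions_met(if_generation_match=123456) is DEFAULT_RETRY)
print(policy.get_retry_policy_if_conditions_met(if_generation_match=None))
```

With `if_generation_match` set the policy hands back the wrapped retry; without it, the upload proceeds unretried, which is the behavior this PR extends to upload paths.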


See the retry.py source code and docstrings in this package
(google.cloud.storage.retry) for information on retry types and how
Expand Down Expand Up @@ -1877,10 +1876,10 @@ def _initiate_resumable_upload(
:type num_retries: int
:param num_retries:
Number of upload retries. By default, only uploads with
if_metageneration_match set will be retried, as uploads without the
if_generation_match set will be retried, as uploads without the
argument are not guaranteed to be idempotent. Setting num_retries
will override this default behavior and guarantee retries even when
if_metageneration_match is not set. (Deprecated: This argument
if_generation_match is not set. (Deprecated: This argument
will be removed in a future release.)

:type extra_headers: dict
Expand Down Expand Up @@ -1936,7 +1935,7 @@ def _initiate_resumable_upload(

This private method does not accept ConditionalRetryPolicy values
because the information necessary to evaluate the policy is instead
evaluated in client.download_blob_to_file().
evaluated in blob._do_upload().

See the retry.py source code and docstrings in this package
(google.cloud.storage.retry) for information on retry types and how
Expand Down Expand Up @@ -2070,10 +2069,10 @@ def _do_resumable_upload(
:type num_retries: int
:param num_retries:
Number of upload retries. By default, only uploads with
if_metageneration_match set will be retried, as uploads without the
if_generation_match set will be retried, as uploads without the
argument are not guaranteed to be idempotent. Setting num_retries
will override this default behavior and guarantee retries even when
if_metageneration_match is not set. (Deprecated: This argument
if_generation_match is not set. (Deprecated: This argument
will be removed in a future release.)

:type predefined_acl: str
Expand Down Expand Up @@ -2119,7 +2118,7 @@ def _do_resumable_upload(

This private method does not accept ConditionalRetryPolicy values
because the information necessary to evaluate the policy is instead
evaluated in client.download_blob_to_file().
evaluated in blob._do_upload().

See the retry.py source code and docstrings in this package
(google.cloud.storage.retry) for information on retry types and how
Expand Down Expand Up @@ -2204,10 +2203,10 @@ def _do_upload(
:type num_retries: int
:param num_retries:
Number of upload retries. By default, only uploads with
if_metageneration_match set will be retried, as uploads without the
if_generation_match set will be retried, as uploads without the
argument are not guaranteed to be idempotent. Setting num_retries
will override this default behavior and guarantee retries even when
if_metageneration_match is not set. (Deprecated: This argument
if_generation_match is not set. (Deprecated: This argument
will be removed in a future release.)

:type predefined_acl: str
Expand Down Expand Up @@ -2258,7 +2257,7 @@ def _do_upload(
This class exists to provide safe defaults for RPC calls that are
not technically safe to retry normally (due to potential data
duplication or other side-effects) but become safe to retry if a
condition such as if_metageneration_match is set.
condition such as if_generation_match is set.

See the retry.py source code and docstrings in this package
(google.cloud.storage.retry) for information on retry types and how
Expand Down Expand Up @@ -2337,7 +2336,7 @@ def upload_from_file(
if_metageneration_not_match=None,
timeout=_DEFAULT_TIMEOUT,
checksum=None,
retry=DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED,
retry=DEFAULT_RETRY_IF_GENERATION_SPECIFIED,
):
"""Upload the contents of this blob from a file-like object.

Expand Down Expand Up @@ -2397,10 +2396,10 @@ def upload_from_file(
:type num_retries: int
:param num_retries:
Number of upload retries. By default, only uploads with
if_metageneration_match set will be retried, as uploads without the
if_generation_match set will be retried, as uploads without the
argument are not guaranteed to be idempotent. Setting num_retries
will override this default behavior and guarantee retries even when
if_metageneration_match is not set. (Deprecated: This argument
if_generation_match is not set. (Deprecated: This argument
will be removed in a future release.)

:type client: :class:`~google.cloud.storage.client.Client`
Expand Down Expand Up @@ -2456,7 +2455,7 @@ def upload_from_file(
This class exists to provide safe defaults for RPC calls that are
not technically safe to retry normally (due to potential data
duplication or other side-effects) but become safe to retry if a
condition such as if_metageneration_match is set.
condition such as if_generation_match is set.

See the retry.py source code and docstrings in this package
(google.cloud.storage.retry) for information on retry types and how
Expand All @@ -2479,7 +2478,7 @@ def upload_from_file(
# num_retries and retry are mutually exclusive. If num_retries is
# set and retry is exactly the default, then nullify retry for
# backwards compatibility.
if retry is DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED:
if retry is DEFAULT_RETRY_IF_GENERATION_SPECIFIED:
retry = None

_maybe_rewind(file_obj, rewind=rewind)
Expand Down Expand Up @@ -2518,7 +2517,7 @@ def upload_from_filename(
if_metageneration_not_match=None,
timeout=_DEFAULT_TIMEOUT,
checksum=None,
retry=DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED,
retry=DEFAULT_RETRY_IF_GENERATION_SPECIFIED,
):
"""Upload this blob's contents from the content of a named file.

Expand Down Expand Up @@ -2558,10 +2557,10 @@ def upload_from_filename(
:type num_retries: int
:param num_retries:
Number of upload retries. By default, only uploads with
if_metageneration_match set will be retried, as uploads without the
if_generation_match set will be retried, as uploads without the
argument are not guaranteed to be idempotent. Setting num_retries
will override this default behavior and guarantee retries even when
if_metageneration_match is not set. (Deprecated: This argument
if_generation_match is not set. (Deprecated: This argument
will be removed in a future release.)

:type predefined_acl: str
Expand Down Expand Up @@ -2612,7 +2611,7 @@ def upload_from_filename(
This class exists to provide safe defaults for RPC calls that are
not technically safe to retry normally (due to potential data
duplication or other side-effects) but become safe to retry if a
condition such as if_metageneration_match is set.
condition such as if_generation_match is set.

See the retry.py source code and docstrings in this package
(google.cloud.storage.retry) for information on retry types and how
Expand Down Expand Up @@ -2656,7 +2655,7 @@ def upload_from_string(
if_metageneration_not_match=None,
timeout=_DEFAULT_TIMEOUT,
checksum=None,
retry=DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED,
retry=DEFAULT_RETRY_IF_GENERATION_SPECIFIED,
):
"""Upload contents of this blob from the provided string.

Expand Down Expand Up @@ -2687,10 +2686,10 @@ def upload_from_string(
:type num_retries: int
:param num_retries:
Number of upload retries. By default, only uploads with
if_metageneration_match set will be retried, as uploads without the
if_generation_match set will be retried, as uploads without the
argument are not guaranteed to be idempotent. Setting num_retries
will override this default behavior and guarantee retries even when
if_metageneration_match is not set. (Deprecated: This argument
if_generation_match is not set. (Deprecated: This argument
will be removed in a future release.)

:type client: :class:`~google.cloud.storage.client.Client`
Expand Down Expand Up @@ -2746,7 +2745,7 @@ def upload_from_string(
This class exists to provide safe defaults for RPC calls that are
not technically safe to retry normally (due to potential data
duplication or other side-effects) but become safe to retry if a
condition such as if_metageneration_match is set.
condition such as if_generation_match is set.

See the retry.py source code and docstrings in this package
(google.cloud.storage.retry) for information on retry types and how
Expand Down
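As background for why the default switched from `if_metageneration_match` to `if_generation_match`: every successful upload creates a new object generation (metageneration only tracks metadata edits), so a generation precondition uniquely pins the version being replaced and makes a retried duplicate fail cleanly instead of double-writing. A self-contained toy store, not the library, illustrating the idea:

```python
# Toy object store showing why a generation precondition makes uploads
# idempotent: a duplicate retry of the same request hits a precondition
# failure instead of silently writing twice.

class FakeBucket:
    def __init__(self):
        self.data = None
        self.generation = 0  # 0 means "object does not exist yet"

    def upload(self, payload, if_generation_match=None):
        if if_generation_match is not None and if_generation_match != self.generation:
            raise RuntimeError("precondition failed (412)")
        self.data = payload
        self.generation += 1  # every successful upload bumps the generation
        return self.generation

bucket = FakeBucket()
gen = bucket.upload(b"v1", if_generation_match=0)  # creates generation 1

# A retried duplicate of the same request now fails the precondition,
# which is exactly what makes the retry safe to send.
try:
    bucket.upload(b"v1", if_generation_match=0)
except RuntimeError as exc:
    print(exc)  # precondition failed (412)
```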
6 changes: 3 additions & 3 deletions google/cloud/storage/fileio.py
Expand Up @@ -18,7 +18,7 @@
from google.api_core.exceptions import RequestRangeNotSatisfiable
from google.cloud.storage._helpers import _NUM_RETRIES_MESSAGE
from google.cloud.storage.retry import DEFAULT_RETRY
from google.cloud.storage.retry import DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED
from google.cloud.storage.retry import DEFAULT_RETRY_IF_GENERATION_SPECIFIED
from google.cloud.storage.retry import ConditionalRetryPolicy


Expand Down Expand Up @@ -278,7 +278,7 @@ def __init__(
blob,
chunk_size=None,
text_mode=False,
retry=DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED,
retry=DEFAULT_RETRY_IF_GENERATION_SPECIFIED,
**upload_kwargs
):
for kwarg in upload_kwargs:
Expand Down Expand Up @@ -346,7 +346,7 @@ def _initiate_upload(self):
# num_retries and retry are mutually exclusive. If num_retries is
# set and retry is exactly the default, then nullify retry for
# backwards compatibility.
if retry is DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED:
if retry is DEFAULT_RETRY_IF_GENERATION_SPECIFIED:
retry = None

# Handle ConditionalRetryPolicy.
Expand Down
15 changes: 6 additions & 9 deletions tests/unit/test_blob.py
Expand Up @@ -29,7 +29,6 @@
from google.cloud.storage.retry import DEFAULT_RETRY
from google.cloud.storage.retry import DEFAULT_RETRY_IF_ETAG_IN_JSON
from google.cloud.storage.retry import DEFAULT_RETRY_IF_GENERATION_SPECIFIED
from google.cloud.storage.retry import DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED


def _make_credentials():
Expand Down Expand Up @@ -2853,8 +2852,8 @@ def _do_upload_helper(
**timeout_kwarg
)

if retry is DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED:
retry = DEFAULT_RETRY if if_metageneration_match else None
if retry is DEFAULT_RETRY_IF_GENERATION_SPECIFIED:
retry = DEFAULT_RETRY if if_generation_match else None

self.assertIs(created_json, mock.sentinel.json)
response.json.assert_called_once_with()
Expand Down Expand Up @@ -2925,11 +2924,11 @@ def test__do_upload_with_num_retries(self):

def test__do_upload_with_conditional_retry_success(self):
self._do_upload_helper(
retry=DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED, if_metageneration_match=1
retry=DEFAULT_RETRY_IF_GENERATION_SPECIFIED, if_generation_match=123456
)

def test__do_upload_with_conditional_retry_failure(self):
self._do_upload_helper(retry=DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED)
self._do_upload_helper(retry=DEFAULT_RETRY_IF_GENERATION_SPECIFIED)

def _upload_from_file_helper(self, side_effect=None, **kwargs):
from google.cloud._helpers import UTC
Expand All @@ -2955,7 +2954,7 @@ def _upload_from_file_helper(self, side_effect=None, **kwargs):
if_metageneration_not_match = kwargs.get("if_metageneration_not_match", None)
num_retries = kwargs.get("num_retries", None)
default_retry = (
DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED if not num_retries else None
DEFAULT_RETRY_IF_GENERATION_SPECIFIED if not num_retries else None
)
retry = kwargs.get("retry", default_retry)
ret_val = blob.upload_from_file(
Expand Down Expand Up @@ -3062,9 +3061,7 @@ def _do_upload_mock_call_helper(

expected_timeout = self._get_default_timeout() if timeout is None else timeout
if not retry:
retry = (
DEFAULT_RETRY_IF_METAGENERATION_SPECIFIED if not num_retries else None
)
retry = DEFAULT_RETRY_IF_GENERATION_SPECIFIED if not num_retries else None
self.assertEqual(
kwargs, {"timeout": expected_timeout, "checksum": None, "retry": retry}
)
Expand Down
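The helpers above assert on the kwargs forwarded to the mocked inner upload. The same pattern in miniature, with illustrative names rather than the real test suite's fixtures:

```python
# Miniature version of the assertion pattern used in these tests: patch the
# inner upload, call the public method, then inspect the forwarded retry kwarg.
from unittest import mock

SENTINEL = object()  # stand-in for DEFAULT_RETRY_IF_GENERATION_SPECIFIED

class Blob:
    def _do_upload(self, **kwargs):
        pass  # replaced by a mock in the test

    def upload(self, retry=SENTINEL, num_retries=None):
        # Same back-compat rule as the diff: num_retries nullifies the default.
        if num_retries is not None and retry is SENTINEL:
            retry = None
        self._do_upload(retry=retry, num_retries=num_retries)

blob = Blob()
with mock.patch.object(blob, "_do_upload") as patched:
    blob.upload(num_retries=2)
patched.assert_called_once_with(retry=None, num_retries=2)

with mock.patch.object(blob, "_do_upload") as patched:
    blob.upload()
patched.assert_called_once_with(retry=SENTINEL, num_retries=None)
```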
4 changes: 2 additions & 2 deletions tests/unit/test_fileio.py
Expand Up @@ -395,7 +395,7 @@ def test_conditional_retry_pass(self):
blob,
chunk_size=chunk_size,
content_type=PLAIN_CONTENT_TYPE,
if_metageneration_match=1,
if_generation_match=123456,
)

# The transmit_next_chunk method must actually consume bytes from the
Expand All @@ -421,7 +421,7 @@ def test_conditional_retry_pass(self):
None, # num_retries
chunk_size=chunk_size,
retry=DEFAULT_RETRY,
if_metageneration_match=1,
if_generation_match=123456,
)
upload.transmit_next_chunk.assert_called_with(transport)
self.assertEqual(upload.transmit_next_chunk.call_count, 4)
Expand Down