Caching Introduced by Pull #374 Absorbs Errors and Impedes Error Retry #523

Closed
githubwua opened this issue Feb 15, 2021 · 3 comments · Fixed by #532
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. type: docs Improvement to the documentation for an API.


The following commit introduced request caching.

85bf2bc

The caching catches and absorbs errors, which prevents the retry library from seeing them.

As a result, failed requests are no longer caught and retried.

Environment details

Last Known good version 2.3.1

google-cloud-bigquery==2.3.1

First version that broke retry

google-cloud-bigquery==2.4.0

Steps to reproduce

  1. Run repro.py below with google-cloud-bigquery==2.3.1
    Result: Error is retried until timeout at 60 sec

  2. Run repro.py below with google-cloud-bigquery==2.4.0
    Result: Error is not being retried. Script exits early.

Code example

from google import api_core
from google.cloud import bigquery
from google.api_core import retry

if_transient_error = retry.if_exception_type(Exception,)

RETRY_INITIAL = 1
RETRY_MAXIMUM = 10
RETRY_MULTIPLIER = 2
RETRY_DEADLINE = 60

my_retry = retry.Retry(
    predicate=if_transient_error,
    initial=RETRY_INITIAL,
    maximum=RETRY_MAXIMUM,
    multiplier=RETRY_MULTIPLIER,
    deadline=RETRY_DEADLINE
)

# retry test
print("bq version:", bigquery.__version__)
print("api-core version:", api_core.__version__)

client = bigquery.Client()
sql = "hoge"
query_job = client.query(sql, retry=my_retry)
for row in query_job.result(retry=my_retry):
    print(row)

google-api-core==1.23.0

# Last Known good version 2.3.1
#google-cloud-bigquery==2.3.1

# First version that broke retry
google-cloud-bigquery==2.4.0
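
For reference, here is a rough sketch of the nominal wait schedule implied by the retry settings in repro.py (initial=1, maximum=10, multiplier=2, deadline=60). The helper `nominal_backoff` is hypothetical and not part of google-api-core; it ignores the time each request takes and any randomization the library applies to the delays, so treat the numbers as an approximation only.

```python
def nominal_backoff(initial, maximum, multiplier, deadline):
    """Yield nominal sleep intervals until their sum would exceed the deadline.

    Hypothetical helper: models plain exponential backoff with a cap,
    without jitter and without accounting for request latency.
    """
    delay = initial
    elapsed = 0
    while elapsed + delay <= deadline:
        yield delay
        elapsed += delay
        # Grow the delay geometrically, but never beyond the configured cap.
        delay = min(delay * multiplier, maximum)


schedule = list(nominal_backoff(initial=1, maximum=10, multiplier=2, deadline=60))
print("nominal delays:", schedule)
print("total wait:", sum(schedule))
```

Under these assumptions the script in 2.3.1 would make a handful of attempts before the 60 second deadline, which matches the "retried until timeout" result described above.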

Stack trace

# Error is not being retried starting from google-cloud-bigquery==2.4.0

$ python3 repro.py 
bq version: 2.4.0
api-core version: 1.23.0
Traceback (most recent call last):
  File "repro.py", line 31, in <module>
    for row in query_job.result(retry=my_retry):
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/job/query.py", line 1160, in result
    super(QueryJob, self).result(retry=retry, timeout=timeout)
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/job/base.py", line 631, in result
    return super(_AsyncJob, self).result(timeout=timeout, **kwargs)
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/future/polling.py", line 134, in result
    raise self._exception
google.api_core.exceptions.BadRequest: 400 Syntax error: Expected end of input but got identifier "hoge" at [1:1]


# In google-cloud-bigquery==2.3.1, error is retried as configured

$ python3 repro.py 
bq version: 2.3.1
api-core version: 1.23.0
Traceback (most recent call last):
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/retry.py", line 184, in retry_target
    return target()
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/_http.py", line 438, in api_request
    raise exceptions.from_http_response(response)
google.api_core.exceptions.BadRequest: 400 GET https://bigquery.googleapis.com/bigquery/v2/projects/wua-repro/queries/1ca1cb14-f9a5-4298-9058-91ab784be278?maxResults=0&location=US&prettyPrint=false: Syntax error: Expected end of input but got identifier "hoge" at [1:1]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/future/polling.py", line 107, in _blocking_poll
    retry_(self._done_or_raise)(**kwargs)
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/retry.py", line 281, in retry_wrapped_func
    return retry_target(
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/retry.py", line 184, in retry_target
    return target()
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/future/polling.py", line 85, in _done_or_raise
    if not self.done(**kwargs):
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/job/query.py", line 1022, in done
    self._query_results = self._client._get_query_results(
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 1557, in _get_query_results
    resource = self._call_api(
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 636, in _call_api
    return call()
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/retry.py", line 281, in retry_wrapped_func
    return retry_target(
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/retry.py", line 199, in retry_target
    six.raise_from(
  File "<string>", line 3, in raise_from
google.api_core.exceptions.RetryError: Deadline of 60.0s exceeded while calling functools.partial(functools.partial(<bound method JSONConnection.api_request of <google.cloud.bigquery._http.Connection object at 0x7f9b79f60e50>>, method='GET', path='/projects/wua-repro/queries/1ca1cb14-f9a5-4298-9058-91ab784be278', query_params={'maxResults': 0, 'location': 'US'}, timeout=None)), last exception: 400 GET https://bigquery.googleapis.com/bigquery/v2/projects/wua-repro/queries/1ca1cb14-f9a5-4298-9058-91ab784be278?maxResults=0&location=US&prettyPrint=false: Syntax error: Expected end of input but got identifier "hoge" at [1:1]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "repro.py", line 31, in <module>
    for row in query_job.result(retry=my_retry):
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/job/query.py", line 1146, in result
    super(QueryJob, self).result(retry=retry, timeout=timeout)
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/job/base.py", line 631, in result
    return super(_AsyncJob, self).result(timeout=timeout, **kwargs)
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/future/polling.py", line 129, in result
    self._blocking_poll(timeout=timeout, **kwargs)
  File "/tmp/.venv/lib/python3.8/site-packages/google/cloud/bigquery/job/query.py", line 1042, in _blocking_poll
    super(QueryJob, self)._blocking_poll(timeout=timeout, **kwargs)
  File "/tmp/.venv/lib/python3.8/site-packages/google/api_core/future/polling.py", line 109, in _blocking_poll
    raise concurrent.futures.TimeoutError(
concurrent.futures._base.TimeoutError: Operation did not complete within the designated timeout.

Note: retry works fine if we roll back the change in google/cloud/bigquery/job/query.py

Can we either fix google/cloud/bigquery/job/query.py or roll it back to the previous version?


plamut commented Feb 16, 2021

I can reproduce the behavior as reported: switching from v2.3.1 to v2.4.0 (or later) does make a difference. I can also confirm that the behavior changed with #374.

(I used google-api-core==1.24.1, but the result is the same.)

Interestingly, #374 was later reverted in #400, but the retry issue remained.


tswast commented Feb 19, 2021

This is intended behavior. When a job fails, we have always raised an exception in result(). Retrying result() does not retry the query, so the extra API calls we made up to and including v2.3.1 were unnecessary. Once a job's state reaches DONE, it will not change.
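
If the goal is to re-run a failed query, one possible approach is to submit a new job on each attempt rather than passing a retry to result(). The sketch below only illustrates that pattern with stand-in objects: `JobFailed`, `FakeJob`, `run_with_resubmit` and `is_retryable` are hypothetical names, not part of google-cloud-bigquery. With the real client, `submit_job` would presumably be something like `lambda: client.query(sql)`, and the caught exception type would be whatever result() raises.

```python
import time


class JobFailed(Exception):
    """Stand-in for the exception that result() raises for a failed job."""


class FakeJob:
    """Toy job: result() either returns rows or raises the stored error."""

    def __init__(self, error=None, rows=()):
        self._error = error
        self._rows = list(rows)

    def result(self):
        if self._error is not None:
            raise self._error
        return self._rows


def run_with_resubmit(submit_job, is_retryable, max_attempts=3, delay=0.0):
    """Submit a fresh job per attempt and return the first successful result."""
    for attempt in range(1, max_attempts + 1):
        job = submit_job()  # a new job each time; a finished job never changes
        try:
            return job.result()
        except JobFailed as exc:
            # Give up on the last attempt or on errors that cannot succeed later.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            time.sleep(delay)


# Simulated backend: the first job fails transiently, the second succeeds.
outcomes = iter([
    FakeJob(error=JobFailed("backend unavailable")),
    FakeJob(rows=[("ok",)]),
])
submitted = []


def submit_job():
    job = next(outcomes)
    submitted.append(job)
    return job


rows = run_with_resubmit(
    submit_job, is_retryable=lambda exc: "unavailable" in str(exc)
)
print(rows, "after", len(submitted), "submissions")
```

Note that a syntax error such as the one in repro.py would fail identically on every submission, so a predicate like `is_retryable` should exclude it.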

Will keep this issue open to update the documentation strings with this information.


plamut commented Feb 19, 2021

@githubwua OK, so here's the thing: after digging in, this actually appears to be an intentional behavior change, one that was retained even after the revert.

The reason is that when the job is DONE, it is done for good and will not change anymore, so any further retries are redundant. In v2.3.1 and earlier, the job that failed with a syntax error was not considered done yet, as there were no query results available, but having the client keep retrying in that situation turned out to be sub-optimal.
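
To illustrate the point about DONE being final, here is a toy model (not the library's actual code; `FakeJob` and its `backend_states` argument are made up for this sketch). Once the job has reached DONE, done() stops calling the backend, and result() raises the same stored error every time without any further requests.

```python
class FakeJob:
    """Toy job whose state is fetched from a simulated backend."""

    def __init__(self, backend_states):
        # Each entry is a (state, error) pair returned by one simulated API call.
        self._backend_states = iter(backend_states)
        self.state = None
        self.error = None
        self.api_calls = 0

    def reload(self):
        """Simulate one API call that refreshes the job's state."""
        self.api_calls += 1
        self.state, self.error = next(self._backend_states)

    def done(self):
        # A DONE job is terminal, so there is no reason to ask the backend again.
        if self.state != "DONE":
            self.reload()
        return self.state == "DONE"

    def result(self, max_polls=10):
        for _ in range(max_polls):
            if self.done():
                if self.error is not None:
                    raise RuntimeError(self.error)
                return "rows"
        raise TimeoutError("job did not finish")


job = FakeJob([("RUNNING", None), ("DONE", "Syntax error")])
errors = []
for _ in range(3):
    try:
        job.result()
    except RuntimeError as exc:
        errors.append(str(exc))

print(errors, "API calls:", job.api_calls)
```

In this model, calling result() again (or wrapping it in a retry) cannot produce a different outcome, which is why the retry passed to result() no longer appears to do anything for a failed job.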

Edit: Tim beat me to it. Indeed, we need to clarify this in the docs, thus I'm re-classifying this.
