Intermittent RefreshError with Internal Server Error on metadata service #1285

Open
lawrenceong opened this issue May 4, 2023 · 3 comments

Comments

@lawrenceong

Recently, we have been getting intermittent RefreshError exceptions from Python applications that use Google Cloud services. The following is a stack trace:

class: <class 'google.auth.exceptions.RefreshError'>
message: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/SA_NAME@PROJECT.iam.gserviceaccount.com/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdevstorage.full_control%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdevstorage.read_only%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdevstorage.read_write from the Google Compute Engine metadata service. Status: 500 Response:\nb'Internal Server Error\\n'", <google.auth.transport.requests._Response object at 0x7f4b8a2320b0>)

traceback:

...<snip>...
  File "/app/src/utils/cloud_storage.py", line 13, in __init__
    self.bucket: Bucket = self.client.get_bucket(self.bucket_name)
  File "/app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/cloud/storage/client.py", line 772, in get_bucket
    bucket.reload(
  File "/app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/cloud/storage/bucket.py", line 1086, in reload
    super(Bucket, self).reload(
  File "/app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/cloud/storage/_helpers.py", line 246, in reload
    api_response = client._get_resource(
  File "/app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/cloud/storage/client.py", line 377, in _get_resource
    return self._connection.api_request(
  File "/app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/cloud/storage/_http.py", line 72, in api_request
    return call()
  File "/app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/api_core/retry.py", line 349, in retry_wrapped_func
    return retry_target(
  File "/app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/api_core/retry.py", line 191, in retry_target
    return target()
  File "/app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/cloud/_http/__init__.py", line 482, in api_request
    response = self._make_request(
  File "/app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/cloud/_http/__init__.py", line 341, in _make_request
    return self._do_request(
  File "/app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/cloud/_http/__init__.py", line 379, in _do_request
    return self.http.request(
  File "/app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/auth/transport/requests.py", line 545, in request
    self.credentials.before_request(auth_request, method, url, request_headers)
  File "/app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/auth/credentials.py", line 135, in before_request
    self.refresh(request)
  File "/app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/auth/compute_engine/credentials.py", line 117, in refresh
    six.raise_from(new_exc, caught_exc)
  File "<string>", line 3, in raise_from
Frame before_request in /app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/auth/credentials.py at line 135
    self                 = <google.auth....x7f4b8a233e50>
    request              = functools.par...>, timeout=60)
    method               = 'GET'
    url                  = 'https://stor...tyPrint=false'
    headers              = {'Accept-Encoding': 'gzip', 'User-Agent': 'gcloud-pytho....0 gccl/2.8.0', 'X-Goog-API-Client': 'gcloud-pytho...-7aa99e2dd464'}
Frame refresh in /app/.cache/pypoetry/virtualenvs/appName/lib/python3.10/site-packages/google/auth/compute_engine/credentials.py at line 117
    self                 = <google.auth....x7f4b8a233e50>
    request              = functools.par...>, timeout=60)
    scopes               = ('https://www.....full_control', 'https://www....age.read_only', 'https://www....ge.read_write')
    new_exc              = RefreshError(...f4b8a2320b0>))
Frame raise_from in <string> at line 5
    value                = None
    from_value           = TransportErro...7f4b8a2320b0>)

The Internal Server Error seems to happen only on Python-based instances. To work around these errors, we added a tenacity retry to the function. Sample retry:

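    from google.auth.exceptions import GoogleAuthError
    from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

    # Retry up to 3 times on any google-auth error, waiting 4 to 10 seconds between attempts.
    @retry(
        retry=retry_if_exception_type(GoogleAuthError),
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=4, max=10),
    )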
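
For readers who do not use tenacity, a similar retry policy can be approximated with the standard library alone. This is only a sketch under stated assumptions, not tenacity's implementation: `TransientAuthError`, `backoff_delay`, and `retry_on` are hypothetical names, and the exact wait schedule tenacity computes may differ from the clamped doubling used here.

```python
import time
from functools import wraps


class TransientAuthError(Exception):
    """Hypothetical stand-in for google.auth.exceptions.GoogleAuthError."""


def backoff_delay(attempt, multiplier=1.0, min_wait=4.0, max_wait=10.0):
    """Return multiplier * 2**attempt seconds, clamped to [min_wait, max_wait]."""
    return max(min_wait, min(max_wait, multiplier * 2 ** attempt))


def retry_on(exc_type, attempts=3, sleep=time.sleep):
    """Retry the wrapped call on exc_type; re-raise after the last attempt."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exc_type:
                    if attempt == attempts:
                        raise  # out of attempts: surface the original error
                    sleep(backoff_delay(attempt))
        return wrapper
    return decorator
```

In an application, `exc_type` would presumably be `google.auth.exceptions.GoogleAuthError` and the decorated function the one that calls `client.get_bucket`. The `sleep` parameter is injectable so the policy can be exercised in tests without real waiting.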

We also use Node.js to access Google Cloud Storage and do not see anything similar there.

Is there any reason why we would get "intermittent" Internal Server Error when trying to refresh a service account's token?

Environment details

  • OS: GKE - containerd - v1.25.7-gke.1000 - with python:3.10-slim image - based on debian bullseye
  • Python version: 3.10.11
  • pip version: 23.1
  • google-auth version: 2.17.3

Steps to reproduce

Initialise the bucket in code; the issue occurs intermittently (around once a week). Code sample where the issue happens at pod startup:

from google.cloud import storage

BUCKET_NAME = ".............."

client = storage.Client()
bucket = client.get_bucket(BUCKET_NAME)
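
To check whether the 500 originates in the metadata server rather than in google-auth, the token request from the stack trace can be rebuilt with the standard library. This is a hedged sketch: `build_token_request` and `probe_token_endpoint` are hypothetical helper names, `default` is used as the service account alias, and the probe only works from inside a GCE or GKE workload. It returns the HTTP status only and does not print the token.

```python
import urllib.error
import urllib.parse
import urllib.request

METADATA_ROOT = "http://metadata.google.internal/computeMetadata/v1"


def build_token_request(service_account="default", scopes=()):
    """Build the GET request for the metadata server's token endpoint."""
    url = f"{METADATA_ROOT}/instance/service-accounts/{service_account}/token"
    if scopes:
        # Scopes are comma-joined and percent-encoded, as in the failing URL.
        url += "?" + urllib.parse.urlencode({"scopes": ",".join(scopes)})
    # The metadata server requires this header on every request.
    return urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})


def probe_token_endpoint(request, timeout=5.0):
    """Send one token request and return its HTTP status code."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 500 when the metadata server fails
```

Running `probe_token_endpoint(build_token_request())` in a loop from the affected pod, and logging any non-200 status with a timestamp, would show whether the failures occur independently of the Python client libraries.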

@clundin25
Contributor

Hi @lawrenceong,

We are working on adding these retries into the client layer. Once that is complete these errors will be retried automatically.

Your workaround will work until then.

To keep a single source of retries, we are not planning to add any more retries to this codebase.

Thanks!

@lawrenceong
Author

We are using Pub/Sub as well and encountered a similar error. It does not seem to affect usability, so we did not have to add a retry. However, we get the following in the logs, which triggers alarms via Error Reporting:

Traceback (most recent call last):
  File "/app/.cache/pypoetry/virtualenvs/APP_NAME/lib/python3.10/site-packages/grpc/_plugin_wrapping.py", line 95, in __call__
    self._metadata_plugin(
  File "/app/.cache/pypoetry/virtualenvs/APP_NAME/lib/python3.10/site-packages/google/auth/transport/grpc.py", line 101, in __call__
    callback(self._get_authorization_headers(context), None)
  File "/app/.cache/pypoetry/virtualenvs/APP_NAME/lib/python3.10/site-packages/google/auth/transport/grpc.py", line 87, in _get_authorization_headers
    self._credentials.before_request(
  File "/app/.cache/pypoetry/virtualenvs/APP_NAME/lib/python3.10/site-packages/google/auth/credentials.py", line 135, in before_request
    self.refresh(request)
  File "/app/.cache/pypoetry/virtualenvs/APP_NAME/lib/python3.10/site-packages/google/auth/compute_engine/credentials.py", line 117, in refresh
    six.raise_from(new_exc, caught_exc)
  File "<string>", line 3, in raise_from
google.auth.exceptions.RefreshError: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/SA_NAME@PROJECT.iam.gserviceaccount.com/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform from the Google Compute Engine metadata service. Status: 500 Response:\nb'Internal Server Error\\n'", <google.auth.transport.requests._Response object at 0x7fa29a86ed40>)
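
If the call recovers on its own and only the alarm is a problem, one possible mitigation is to relabel these log records before they reach the handlers. The sketch below is an assumption-laden illustration: `TransientMetadataErrorFilter` is a hypothetical class, the logger name `grpc._plugin_wrapping` is inferred from the module path in the traceback, and whether Error Reporting honours the lowered severity depends on how logs are shipped.

```python
import logging


class TransientMetadataErrorFilter(logging.Filter):
    """Relabel records about metadata-server 500s from ERROR to WARNING."""

    # Both substrings appear in the RefreshError message shown in the traceback.
    MARKERS = ("Compute Engine metadata service", "Status: 500")

    def filter(self, record):
        text = record.getMessage()
        if record.exc_info and record.exc_info[1] is not None:
            text += " " + str(record.exc_info[1])
        if record.levelno >= logging.ERROR and all(m in text for m in self.MARKERS):
            record.levelno = logging.WARNING
            record.levelname = logging.getLevelName(logging.WARNING)
        return True  # never drop the record, only relabel it


# Assumed logger name; adjust if the records come from a different logger.
logging.getLogger("grpc._plugin_wrapping").addFilter(TransientMetadataErrorFilter())
```

The filter keeps every record, so the failures stay visible in the logs; it hides nothing if the 500s become frequent enough to affect requests.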

@mdzigurski

I also see many google.auth.exceptions.RefreshError exceptions and 500 Internal Server Error responses in the logs. Is there any way to prevent them?
