Potential Bug: Uncaught exception when calling oauth2.id_token.fetch_id_token #663

Closed
yaseenlotfi opened this issue Jan 10, 2021 · 6 comments
Assignees
Labels
priority: p2 Moderately-important priority. Fix may not be included in next release. 🚨 This issue needs some love. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@yaseenlotfi

fetch_id_token fails to retrieve an ID token because its workflow aborts prematurely. The function first tries to fetch the ID token from the Compute Metadata Server. If that fails, it tries loading credentials from the environment variable, and it raises a final error only if it can't do either. In my case, I'm trying to fetch the ID token locally, so the metadata server URL cannot be resolved. This causes an uncaught TypeError, and the function exits before it tries the environment variable.

Environment details

  • OS: MacOS Big Sur v11.1
  • Python version: 3.7.7
  • pip version: 20.2.4
  • google-auth version: 1.21.1

Steps to reproduce

  1. Run locally with $GOOGLE_APPLICATION_CREDENTIALS set to a valid service account key file path.
  2. My specific use case is invoking a Cloud Run app, but I can reproduce the error with the example provided in the source docs.

Reproduced by running:

import google.oauth2.id_token
import google.auth.transport.requests

request = google.auth.transport.requests.Request()
target_audience = "https://pubsub.googleapis.com"

id_token = google.oauth2.id_token.fetch_id_token(request, target_audience)  # TypeError is thrown here

Sample Traceback

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/path/to/lib/python3.7/site-packages/google/oauth2/id_token.py", line 225, in fetch_id_token
    request, audience, use_metadata_identity_endpoint=True
  File "/path/to/lib/python3.7/site-packages/google/auth/compute_engine/credentials.py", line 207, in __init__
    self._service_account_email = sa_info["email"]
TypeError: string indices must be integers

Explanation

The workflow breaks prematurely while checking the Compute Metadata Server. Since I am running locally, I would expect the call to the metadata server to fail and the function to then load credentials from the environment variable instead.

What's happening is that the response from the unreachable metadata server is a string instead of the dictionary this function expects. The code assumes a dictionary is returned and accesses the "email" key directly (sa_info.get("email") would be safer against a missing key, though it would still raise AttributeError on a string). See reference:

self._service_account_email = sa_info["email"]

This variable, sa_info, is meant to be returned by compute_engine._metadata.get_service_account_info.
See reference:

def get_service_account_info(request, service_account="default"):
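
A minimal sketch of the kind of defensive check discussed above. The helper name extract_service_account_email is hypothetical and is not part of google-auth; it only illustrates validating the type of sa_info before indexing it, so that an HTML string produces a clear error instead of "string indices must be integers".

```python
from collections.abc import Mapping


def extract_service_account_email(sa_info):
    """Return the "email" field of sa_info, or raise ValueError.

    Hypothetical helper for illustration only; not a google-auth API.
    """
    # A str (for example an HTML page) is not a Mapping, so reject it early
    # instead of letting sa_info["email"] raise a confusing TypeError.
    if not isinstance(sa_info, Mapping):
        raise ValueError(
            "Expected a JSON object from the metadata server, got {}: {!r:.40}".format(
                type(sa_info).__name__, sa_info
            )
        )
    email = sa_info.get("email")
    if not email:
        raise ValueError("Service account info has no 'email' field")
    return email
```

With a check like this, the caller can catch one well-defined exception type and move on to the next credential source.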

Solution

This is more of a workaround than a solution: I added TypeError to the first try-except block in id_token.fetch_id_token. See reference:

except (ImportError, exceptions.TransportError, exceptions.RefreshError):

This works well enough, but I'm wondering what the best approach is. Is there a different, Pythonic way to get this OIDC token locally? I've used the subprocess module before to call gcloud directly, but that's not ideal. This function is appealing because the same code can work either in GCP or locally.
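
The workaround can be sketched roughly as follows. This is an illustration of the control flow only: the two fetcher callables, the MetadataUnavailable exception, and the function name are hypothetical stand-ins, not google-auth APIs.

```python
import os


class MetadataUnavailable(Exception):
    """Hypothetical stand-in for a transport-level failure."""


def fetch_id_token_with_fallback(fetch_from_metadata, fetch_from_key_file, environ=None):
    """Try the metadata server first, then fall back to the key file.

    Both fetchers are caller-supplied callables (assumptions for this sketch).
    """
    environ = os.environ if environ is None else environ
    try:
        return fetch_from_metadata()
    except (MetadataUnavailable, TypeError):
        # TypeError covers the case reported here: the "metadata" response
        # was a plain string, so indexing it by key failed.
        pass

    key_path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        raise RuntimeError(
            "Metadata server unavailable and GOOGLE_APPLICATION_CREDENTIALS is not set"
        )
    return fetch_from_key_file(key_path)
```

Catching TypeError this broadly can also hide unrelated bugs inside the first fetcher, which is one reason it is a workaround rather than a fix.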

Something else I don't fully understand is why making a request to the metadata server externally still returns a 200 status code. Is there a simple check of whether the server can be accessed or not?

@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Jan 11, 2021
@busunkim96
Contributor

Hi @yaseenlotfi,

Thanks for the thorough explanation. It seems like the existing behavior is incorrect. On a first pass I think get() might be missing some additional error handling. The metadata server is only available inside GCP runtime environments so a call to it should fail with a TransportError locally. 🤔 It's odd that the call seems to succeed.

if response.status == http_client.OK:
    content = _helpers.from_bytes(response.data)
    if response.headers["content-type"] == "application/json":
        try:
            return json.loads(content)
        except ValueError as caught_exc:
            new_exc = exceptions.TransportError(
                "Received invalid JSON from the Google Compute Engine"
                "metadata service: {:.20}".format(content)
            )
            six.raise_from(new_exc, caught_exc)
    else:
        return content
else:
    raise exceptions.TransportError(
        "Failed to retrieve {} from the Google Compute Engine"
        "metadata service. Status: {} Response:\n{}".format(
            url, response.status, response.data
        ),
        response,
    )

Could you check if GCE_METADATA_HOST or GCE_METADATA_ROOT is set in your environment?

GCE_METADATA_HOST = "GCE_METADATA_HOST"
GCE_METADATA_ROOT = "GCE_METADATA_ROOT"
"""Environment variable providing an alternate hostname or host:port to be
used for GCE metadata requests.
This environment variable is originally named GCE_METADATA_ROOT. System will
check the new variable first; should there be no value present,
the system falls back to the old variable.
"""
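
To see which host that lookup order would resolve to in a given environment, a minimal sketch follows. It mirrors the order described in the docstring (new variable first, then the old one). The function name and the default host value are assumptions for illustration, not google-auth code.

```python
import os


def resolve_metadata_host(environ=None, default="metadata.google.internal"):
    """Return the metadata host implied by the environment.

    Checks GCE_METADATA_HOST first, then GCE_METADATA_ROOT, then the default.
    The default value is an assumption made for this sketch.
    """
    environ = os.environ if environ is None else environ
    return (
        environ.get("GCE_METADATA_HOST")
        or environ.get("GCE_METADATA_ROOT")
        or default
    )


if __name__ == "__main__":
    # Prints the host that would be used in the current shell environment.
    print(resolve_metadata_host())
```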

@busunkim96 busunkim96 added priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. and removed triage me I really want to be triaged. labels Jan 11, 2021
@yaseenlotfi
Author

yaseenlotfi commented Jan 12, 2021

Confirming that neither of those variables is set in my environment.

I just called the Metadata Server directly and requests.get is returning HTML from my ISP:
[Screenshot: HTML response from the ISP, captured 2021-01-11 at 11:11:05 PM]

So that is technically expected but then the question is how to handle it.

Code to reproduce:

import requests

# The base URL has no trailing slash and the endpoint starts with one,
# so joining them yields a single well-formed URL.
metadata_server_url = 'http://metadata.google.internal'
endpoint = '/computeMetadata/v1/project/project-id'
headers = {'Metadata-Flavor': 'Google'}
response = requests.get(
    url=f'{metadata_server_url}{endpoint}', headers=headers)

print(response.status_code)
print(response.text)

EDIT:
I also called _metadata.get directly, and it returned the same ISP page.

import google.auth.transport.requests

from google.auth.compute_engine import _metadata

req = google.auth.transport.requests.Request()
res = _metadata.get(req, 'project/project-id')  # not sure if that's the right path
print(res)

@mgmanzella

I've encountered the same issue (under the same environment conditions the author originally posted) and I was wondering if anyone had found a workaround? Is there a version of the library that doesn't have this issue? Thanks in advance!

@yaseenlotfi
Author

My workaround was basically to implement my own check for whether the app is running locally or in a Cloud environment. It makes a request to the compute metadata server and checks the response headers. I assume the response is from Google if there is a header called "Metadata-Flavor". Unfortunately, the status code is 200 either way.

Knowing which environment you're in allows you to replicate the auth workflow on your own, i.e. try to get the token from Compute, then try default credentials, etc.
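
A minimal sketch of such a check, using only the standard library. The function names are hypothetical, and requiring the header value to be exactly "Google" is an assumption that goes slightly beyond the check described above, which only tests that the header is present.

```python
import urllib.error
import urllib.request

METADATA_URL = "http://metadata.google.internal/computeMetadata/v1/project/project-id"
FLAVOR_HEADER = "Metadata-Flavor"
FLAVOR_VALUE = "Google"  # assumed value of the response header


def looks_like_metadata_server(status, headers):
    """Return True if a response appears to come from the real metadata server.

    A 200 status alone is not enough, since an ISP page can also return 200.
    """
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {str(name).lower(): value for name, value in dict(headers).items()}
    return status == 200 and normalized.get(FLAVOR_HEADER.lower()) == FLAVOR_VALUE


def running_on_gcp(url=METADATA_URL, timeout=3.0):
    """Probe the metadata URL; treat any network error as 'not on GCP'."""
    request = urllib.request.Request(url, headers={FLAVOR_HEADER: FLAVOR_VALUE})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return looks_like_metadata_server(response.status, response.headers.items())
    except (urllib.error.URLError, OSError):
        return False
```

Keeping the header check in a separate pure function makes it testable without network access; running_on_gcp() is the only part that touches the network.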

@parthea
Contributor

parthea commented Jul 12, 2021

@arithmetic1728 Please can you take a look?

@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Jul 13, 2021
@arithmetic1728
Contributor

@parthea yep, there is a pending PR to fix it #748

6 participants