Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AuthMetadataPlugin: gRPC requires it to not block; causes timeouts to not be respected #351

Closed
evanj opened this issue Jul 15, 2019 · 1 comment · Fixed by #390
Closed
Assignees
Labels
priority: p2 Moderately-important priority. Fix may not be included in next release. 🚨 This issue needs some love. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@evanj
Copy link

evanj commented Jul 15, 2019

The gRPC documentation for AuthMetadataPlugin.__call__ states: "Implementations of this method must not block." Unfortunately, google.auth.transport.grpc.AuthMetadataPlugin violates this requirement. I believe this causes all gRPC calls using one instance of AuthMetadataPlugin to be blocked as long as token refreshing is happening. This may be causing occasional errors where the timeout we pass to Google Cloud API calls is not being respected, since the calls are all blocked waiting on the OAuth token to be refreshed.

Possible fix

It seems likely this plugin needs to do OAuth token refreshing using a thread pool, like the grpc._auth.GoogleCallCredentials implementation does. This will have the side benefit of not having multiple threads "race" to refresh the credentials quite as much.

A quick-and-dirty "fix" that would reduce the impact of this error would be to add a timeout on the token refreshing requests. This is probably a good idea anyway. However, this would at least ensure the "impact" of slow/blocked token refreshes is minimized. On our case, we have seen processes that may have been "stuck" for ~15 minutes due to this issue.

Blocking call path

When using google.api_core.grpc_helpers.create_channel, which seems to be what is used by Google's generated GAPIC clients:

  1. gRPC calls google.auth.transport.grpc.AuthMetadataPlugin.__call__(...)
  2. Calls google.auth.transport.grpc.AuthMetadataPlugin._get_authorization_headers(...)
  3. Calls self._credentials.before_request(...) which is google.auth.credentials.Credentials.before_request
  4. If the credentials have expired, it calls self.refresh(...). In the case of service accounts, this is a google.oauth2.service_account.Credentials object.
  5. Calls _client.jwt_grant(...) which calls _token_endpoint_request(...)
  6. Calls request(...) which is an instance of google.auth.transport.requests.Request
  7. This calls in to the requests API, which eventually makes a blocking HTTP request.
@evanj evanj changed the title AuthMetadataPlugin: gRPC requires it to not blocks; causes timeouts to not be respected AuthMetadataPlugin: gRPC requires it to not block; causes timeouts to not be respected Jul 15, 2019
@busunkim96 busunkim96 self-assigned this Jul 15, 2019
@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Jul 16, 2019
@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Jul 29, 2019
@JustinBeckwith JustinBeckwith added priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Aug 6, 2019
@yoshi-automation yoshi-automation removed triage me I really want to be triaged. 🚨 This issue needs some love. labels Aug 6, 2019
@michaelawyu michaelawyu self-assigned this Nov 20, 2019
@michaelawyu
Copy link
Contributor

Thanks, @evanj!

A PR has been sent (#389).

michaelawyu pushed a commit that referenced this issue Nov 25, 2019
…or requests transport (#390)

This commit includes the following changes:
- `transport.grpc.AuthMetadataPlugin` is now non-blocking as gRPC requires
- `transport.requests.Request` now has a default timeout value of 120 seconds so that token refreshing will not be stuck

Resolves: #351
@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Apr 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: p2 Moderately-important priority. Fix may not be included in next release. 🚨 This issue needs some love. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants