Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Bidi uses metadata from start_rpc #205

Closed
wants to merge 1 commit into from
Closed

Conversation

busunkim96
Copy link
Contributor

@busunkim96 busunkim96 commented Jun 10, 2021

Allow Bidi to use metadata from start_rpc when no other metadata
is provided.

This allows client_info on wrapped client methods will be passed through to BidiRpc.

Fixes #202 馃

Not expected to impact Firestore, as metadata is explicitly passed: https://github.com/googleapis/python-firestore/blob/e57258c51e4b4aa664cc927454056412756fc7ac/google/cloud/firestore_v1/watch.py#L220-L226

Pub/Sub does not seem to pass metadata and may be impacted: https://github.com/googleapis/python-pubsub/blob/e907f6e05f59f64a3b08df3304e92ec960997be6/google/cloud/pubsub_v1/subscriber/_protocol/streaming_pull_manager.py#L523-L528

Allow Bidi to use metadata from start_rpc when no other metadata
is provided.
@snippet-bot
Copy link

snippet-bot bot commented Jun 10, 2021

No region tags are edited in this PR.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

  • Refresh this comment

@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Jun 10, 2021
@busunkim96 busunkim96 added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jun 10, 2021
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Jun 10, 2021
call = self._start_rpc(iter(request_generator), metadata=self._rpc_metadata)
# use metadata from self._start_rpc if no other metadata is specified
else:
call = self._start_rpc(iter(request_generator))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I set up BidiRPC with LoggingV2Transport().tail_log_entries, that's the unwrapped method, not the wrapped method. Will this still get the metadata from the wrapped method?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, I tried passing in LoggingServiceV2Client().tail_log_entries and it didn't seem to work.

I'm actually not sure how pubsub passes in the wrapped method successfully.
https://github.com/googleapis/python-pubsub/blob/e907f6e05f59f64a3b08df3304e92ec960997be6/google/cloud/pubsub_v1/subscriber/_protocol/streaming_pull_manager.py#L524

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I set up BidiRPC with LoggingV2Transport().tail_log_entries, that's the unwrapped method, not the wrapped method. Will this still get the metadata from the wrapped method?

I believe the wrapped one has to be passed in (the wrapper adds the metadata to the call)

also, I tried passing in LoggingServiceV2Client().tail_log_entries and it didn't seem to work.

I'm actually not sure how pubsub passes in the wrapped method successfully.

Hmm I'll go back and poke at what is actually expected to be passed in to Bidi.

Copy link
Contributor

@tseaver tseaver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, LGTM. I'll let @plamut comment on possible Pub/sub interactions.

@busunkim96
Copy link
Contributor Author

Looking at Firestore vs. Pub/Sub, I see Firestore is passing the method in the gRPC transport and Pub/Sub is passing the method in the GAPIC client.

I wonder if that contributed to the behavior difference between the two that requires this workaround? (Also see: googleapis/python-pubsub#93 (comment) and #30)

@jameslynnwu For it "not working" when LoggingServiceV2Client().tail_log_entries was passed did it hang, error, or something else?

@jameslynnwu
Copy link

Great find!
When I pass in LoggingServiceV2Client().tail_log_entries, it doesn't respond and the request does not seem to be initiated.

However, if I use the same workaround, it works.

# Does not respond. Request doesn't seem to be initiated
client = LoggingServiceV2Client()
self.tail_stub = bidi.BidiRpc(client.logging.tail_log_entries)

# Works
client = LoggingServiceV2Client()
self.tail_stub = bidi.BidiRpc(client.logging.transport.tail_log_entries)

# Works
client = LoggingServiceV2Client()
client.transport._prefetch_first_result_ = False
self.tail_stub = bidi.BidiRpc(client.logging.tail_log_entries)

@plamut
Copy link
Contributor

plamut commented Jun 16, 2021

The Pub/Sub client passes the streaming_pull() method when creating a Bidi instance. This method comes from the auto-generated client and invokes the wrapped transport method.

The subscriber client does not pass any metadata on its own to ResumableBidiRpc, it just leaves it up to the (generated) transport to do the wrapping, which sets client_info on requests, among other things.


As for the changes in this PR - since no metadata is passed to BidiRpc, the default value None is used for it.

Before the PR, BidiRpc.open() called the following (sending None for metadata:

call = self._start_rpc(iter(request_generator), metadata=self._rpc_metadata)

With the changes in this PR, however, that changes:

if self._rpc_metadata:
    call = self._start_rpc(iter(request_generator), metadata=self._rpc_metadata)
# use metadata from self._start_rpc if no other metadata is specified
else:
    call = self._start_rpc(iter(request_generator))

Since metadata is None, this code now effectively becomes the following:

 call = self._start_rpc(iter(request_generator))

No explicit None metadata is passed in, thus the generated streaming_pull() method now uses its own default, which is an empty tuple.

When the (wrapped) method is eventually called, the following logic in _GapicCallable determines the metadata:

if self._metadata is not None:
    metadata = kwargs.get("metadata", [])
    # Due to the nature of invocation, None should be treated the same
    # as not specified.
    if metadata is None:
        metadata = []
    metadata = list(metadata)
    ...

If I traced this correctly, there should be no difference between passing None and an empty tuple as a kwarg. Both get transformed to an empty list, and the original metadata from the method wrapping time is not modified.

Does this answer the question?


As for passing in a wrapped vs. un-wrapepd method - the regression caused by a Firestore fix back then seems to only affect the wrapped methods, yes, as the key change lives in in the _StreamingResponseIterator (the latter wraps the stream errors, and only comes into play with the wrapped methods).

We had to introduce a flag that changes how _StreamingResponseIterator behaves, as Firestore's fix can break Pub/Sub. :)
(and Logging, if I understood @jameslynnwu correctly).

@jameslynnwu
Copy link

One more thing to note, though we probably not want to address all of this at once.

Passing in the client.logging.tail_log_entries vs client.logging.transport.tail_log_entries changes what is passed into the should_recover method of ResumableBidiRPC from either a grpc.Call or a google.api_core.exceptions.GoogleAPICallError to a grpc.Call, respectively.

If we can ensure that only client.logging.tail_log_entries is passed it that would make should_recover more predictable. Alternatively, we should update the code to always pass the same result into should_recover regardless of if client.logging.tail_log_entries or client.logging.transport.tail_log_entries is passed in.

Also, it's interesting that should_recover is called twice from callbacks and from exception.

@busunkim96
Copy link
Contributor Author

@plamut Thank you for the detailed analysis. 馃檹

I'm OOO the rest of the week so I'll come back and look at this next week.

@busunkim96
Copy link
Contributor Author

As @plamut explained above, this PR is a no-op, if you pass the wrapped method the metadata will get added (which is why Pub/Sub doesn't pass metadata explicitly to start_rpc).

I had a bit of trouble tracing this through but was able to confirm the expected headers in the gRPC logs:

export GRPC_TRACE=all
export GRPC_VERBOSITY=debug
I0625 21:17:38.217348925 2619277 chttp2_transport.cc:1363]   HTTP:0:HDR:CLI: :scheme: https
I0625 21:17:38.217356118 2619277 chttp2_transport.cc:1363]   HTTP:0:HDR:CLI: :method: POST
I0625 21:17:38.217363271 2619277 chttp2_transport.cc:1363]   HTTP:0:HDR:CLI: :authority: pubsub.googleapis.com:443
I0625 21:17:38.217370411 2619277 chttp2_transport.cc:1363]   HTTP:0:HDR:CLI: :path: /google.pubsub.v1.Subscriber/StreamingPull
I0625 21:17:38.217377702 2619277 chttp2_transport.cc:1363]   HTTP:0:HDR:CLI: x-goog-api-client: gl-python/3.8.3 grpc/1.38.0 gax/1.30.0 gccl/2.5.0

@busunkim96 busunkim96 closed this Jun 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

client_info from start_rpc arg should be used when creating BidiRpc
5 participants