Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: pass timeout to 'PartialRowsData.response_iterator' #16

Merged

Conversation

mf2199
Copy link
Contributor

@mf2199 mf2199 commented Mar 31, 2020

This offers a minimally invasive solution to the client hanging issue. As reported by the author and reproduced using the emulator, the client hangs when there is no response from the CBT server. One of the causes is the specified deadline not propagating to the corresponding gRPC calls. The proposed change links this input with the _grpc._UnaryStreamMultiCallable.__call__.timeout parameter by adding it as a keyword argument to the PartialRowsData constructor.

An alternative way would invoke gapic wrap_method to wrap the bigtable_data_client.transport.read_rows, along with the user-supplied timeout value. But that would bind Table.read_rows to gapic retry engine instead of the 'smart' retries offered by 'raw' gRPC. The latter supposedly avoids repeating the whole read sequence in a case of a transient failure, the lack of which may become an issue with large reads, when chances of error become significant while the time needed to complete the sequence gets increasingly large. Another possible approach is to call gapic's bigtable_data_client.read_rows instead of bigtable_data_client.transport.read_rows. But that would also invoke gapic retries and may cause the same problem as described above.

The timeout parameter must be somewhat greater than the value contained in self.retry, in order to avoid race-like condition and allow registering the first deadline error before invoking the retry. Otherwise there is a risk of getting stuck in an infinite loop that resets the timeout counter just before it being triggered. The increment by 1 second here is customary but should not be much less than that.

*** NOTE: The kokoro tests are bound to fail as the deadline property of the Retry class has not been exposed yet. This is being addressed in a separate PR. ***

The coverage test does not register these changes, hence adding a unit test for now has been left as an option.

Fixes #6 馃

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Mar 31, 2020
@mf2199 mf2199 added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Mar 31, 2020
@mf2199 mf2199 added the status: blocked Resolving the issue is dependent on other work. label Apr 1, 2020
@mf2199 mf2199 changed the title fix: cbt-6 [added timeout value to PartialRowsData.response_iterator] fix: [cbt-6] Added timeout value to PartialRowsData.response_iterator. Apr 1, 2020
@mf2199 mf2199 requested a review from kolea2 April 1, 2020 18:27
@kolea2 kolea2 requested a review from crwilcox April 13, 2020 15:05
@mf2199 mf2199 marked this pull request as ready for review April 23, 2020 22:34
@mf2199
Copy link
Contributor Author

mf2199 commented May 27, 2020

@kolea2 Friendly ping.

@mf2199 mf2199 changed the title fix: [cbt-6] Added timeout value to PartialRowsData.response_iterator. fix: [cbt-6] Added timeout value to PartialRowsData.response_iterator May 28, 2020
@mf2199 mf2199 added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 28, 2020
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 28, 2020
@mf2199 mf2199 removed do not merge Indicates a pull request not ready for merge, due to either quality or timing. status: blocked Resolving the issue is dependent on other work. labels Jul 23, 2020
tests/unit/test_row_data.py Outdated Show resolved Hide resolved
@@ -400,7 +400,7 @@ def __init__(self, read_method, request, retry=DEFAULT_RETRY_READ_ROWS):
self.read_method = read_method
self.request = request
self.retry = retry
self.response_iterator = read_method(request)
self.response_iterator = read_method(request, timeout=self.retry._deadline + 1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this + 1? Can you add a comment explaining?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kolea2 Sure, copied from the PR description.

@tseaver tseaver changed the title fix: [cbt-6] Added timeout value to PartialRowsData.response_iterator fix: pass timeout to 'PartialRowsData.response_iterator' Sep 2, 2020
@tseaver tseaver merged commit 8f76434 into googleapis:master Sep 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bigtable: read_rows: no deadline causing stuck clients; no way to change it
5 participants