Skip to content
This repository has been archived by the owner on Sep 26, 2023. It is now read-only.

feat: support retry settings and retryable codes in call context #1238

Merged
merged 37 commits into from Mar 18, 2021

Conversation

olavloite
Copy link
Contributor

@olavloite olavloite commented Nov 9, 2020

Adds support for retry settings and retryable codes for individual RPCs through ApiCallContext.

Example usage:

ApiCallContext context = GrpcCallContext.createDefault()
  .withRetrySettings(RetrySettings.newBuilder()
    .setInitialRetryDelay(Duration.ofMillis(10L))
    .setInitialRpcTimeout(Duration.ofMillis(100L))
    .setMaxAttempts(10)
    .setMaxRetryDelay(Duration.ofSeconds(10L))
    .setMaxRpcTimeout(Duration.ofSeconds(30L))
    .setRetryDelayMultiplier(1.4)
    .setRpcTimeoutMultiplier(1.5)
    .setTotalTimeout(Duration.ofMinutes(10L))
    .build())
  .withRetryableCodes(Sets.newSet(
    StatusCode.Code.UNAVAILABLE,
    StatusCode.Code.DEADLINE_EXCEEDED));

Points for consideration:

  1. This feature can only be used for RPC's that are already retryable in the default settings, as the current code does the following optimization:
    if (areRetriesDisabled(callSettings.getRetryableCodes(), callSettings.getRetrySettings())) {
    Calling withRetrySettings(...) or withRetryableCodes(...) will not change the retry behavior of those RPC's.
  2. This PR introduces a number of ContextAware... classes that accept a RetryingContext as an argument for the methods that determine whether to retry an RPC, and if so, with what settings. This ensures that this PR is not a breaking change. It would also be possible to add this behavior to existing classes, but that would mean:
    2.1. It would be a breaking change
    2.2. OR it would include multiple if (someAlgorithm instanceof ContextAwareAlgorithm) return ((ContextAwareAlgorithm) algorithm).shouldRetry(context, ...); type statements in the existing classes to check whether it has been passed an algorithm that is able to determine retry settings using a RetryingContext or not.

Fixes #1197

Allows applications to supply retry settings and retryabe codes for individual RPCs
through ApiCallContext.

Fixes googleapis#1197
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Nov 9, 2020
@codecov
Copy link

codecov bot commented Nov 9, 2020

Codecov Report

Merging #1238 (41fdc4e) into master (86d5c72) will decrease coverage by 0.15%.
The diff coverage is 84.76%.

❗ Current head 41fdc4e differs from pull request most recent head 8eab974. Consider uploading reports for the commit 8eab974 to get more accurate results
Impacted file tree graph

@@             Coverage Diff              @@
##             master    #1238      +/-   ##
============================================
- Coverage     79.65%   79.49%   -0.16%     
- Complexity     1249     1291      +42     
============================================
  Files           209      215       +6     
  Lines          5461     5516      +55     
  Branches        464      472       +8     
============================================
+ Hits           4350     4385      +35     
- Misses          928      945      +17     
- Partials        183      186       +3     
Impacted Files Coverage Δ Complexity Δ
...m/google/api/gax/retrying/NoopRetryingContext.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...va/com/google/api/gax/retrying/RetryAlgorithm.java 87.50% <ø> (+6.25%) 9.00 <0.00> (+1.00)
...rc/main/java/com/google/api/gax/rpc/Callables.java 68.62% <50.00%> (ø) 12.00 <0.00> (ø)
...m/google/api/gax/httpjson/HttpJsonCallContext.java 70.00% <68.42%> (+0.58%) 33.00 <6.00> (+5.00)
.../java/com/google/api/gax/grpc/GrpcCallContext.java 84.17% <72.22%> (-1.43%) 53.00 <6.00> (+5.00) ⬇️
.../gax/retrying/ContextAwareBasicRetryingFuture.java 87.50% <87.50%> (ø) 5.00 <5.00> (?)
...rying/ContextAwareCallbackChainRetryingFuture.java 87.50% <87.50%> (ø) 5.00 <5.00> (?)
...m/google/api/gax/retrying/BasicRetryingFuture.java 92.59% <88.88%> (-0.55%) 29.00 <4.00> (+4.00) ⬇️
...i/gax/rpc/ContextAwareStreamingRetryAlgorithm.java 89.47% <89.47%> (ø) 5.00 <5.00> (?)
...e/api/gax/retrying/ContextAwareRetryAlgorithm.java 93.33% <93.33%> (ø) 10.00 <10.00> (?)
... and 26 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 86d5c72...8eab974. Read the comment docs.

@olavloite olavloite changed the title [WIP] feat: support retry settings and retryable codes in call context feat: support retry settings and retryable codes in call context Nov 11, 2020
@olavloite olavloite marked this pull request as ready for review November 11, 2020 19:31
@olavloite olavloite requested review from a team as code owners November 11, 2020 19:31
@olavloite
Copy link
Contributor Author

@igorbernstein2 Friendly ping on this PR. Would you mind taking a look? If you're not the best person to weigh in on this, then please let me know.

cc @skuruppu

@noahdietz
Copy link
Contributor

This looks great (to my untrained Java eye). I'd like @igorbernstein2 and/or @vam-google / @miraleung to review it though, they are the more Java savvy and experienced in this codebase.

Of course, @chingor13 should be approve of the added functionality (I already polled these folks prior to starting this work, but just for posterity...).

this.channel = channel;
this.callOptions = Preconditions.checkNotNull(callOptions);
this.timeout = timeout;
this.streamWaitTimeout = streamWaitTimeout;
this.streamIdleTimeout = streamIdleTimeout;
this.channelAffinity = channelAffinity;
this.extraHeaders = Preconditions.checkNotNull(extraHeaders);
this.retrySettings = retrySettings;
this.retryableCodes = retryableCodes == null ? null : ImmutableSet.copyOf(retryableCodes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this nullable? per Effective Java, empty collections are preferred.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a difference in this case:

  • null means 'do not override the default settings`.
  • An empty set means 'override the default settings by disabling retrying'.

I've added that to the documentation of ApiCallContext#getRetryableCodes().

gax/src/test/java/com/google/api/gax/rpc/RetryingTest.java Outdated Show resolved Hide resolved
gax/src/test/java/com/google/api/gax/rpc/RetryingTest.java Outdated Show resolved Hide resolved
Copy link
Contributor

@elharo elharo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do the docs for retries live? Does something need to be added to package-info?

Copy link
Contributor

@elharo elharo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Google policy is that existing violations of the style guide do not justify further violations. All new code must follow the style guide, irrespective of existing code.

@olavloite olavloite requested a review from elharo January 7, 2021 13:26
@olavloite
Copy link
Contributor Author

@elharo Thanks for the review. I (think I) have addressed all your comments. PTAL.

@olavloite
Copy link
Contributor Author

@elharo Friendly ping on this PR. I addressed your comments, would you mind taking a second look?

@miraleung miraleung self-requested a review February 4, 2021 18:39
Copy link
Contributor

@vam-google vam-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good with few (hopefully last ones) comments. Basically I still hope we can get rid of the Ignore* classes.

Also, optionally, please consider having some clear strategy for all of the context-aware and context-unaware methods:

  1. Make context-unaware methods deprecated,
  2. make sure that one implementation always calls another one to avoid code duplication (if context-unaware implementaitons got deprecated then it should be context-unaware call context-aware, with context probably as null)
  3. make sure that methods documentation reference each other and explains the situation (that both methods are the same thing, and context-awareness is the only difference)

* RetryingContext}. This is used to wrap {@link ResultRetryAlgorithm} instances to create a {@link
* ResultRetryAlgorithmWithContext} when one is required.
*/
class IgnoreRetryingContextResultRetryAlgorithm<ResponseT>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way the new constructor is implemented in RetryAlgorithm (the one which implementation you listed in the comment) seems being kind of against OOP principles, since the constructor does accept a generic interface, but then instantiates the specific instances of the interface instead. I understand that sometimes we may need to do something like this, but I'm not sure it is that case.

Looks like the Ignore* classes exist only because RetryAlgorithm implementation changed the aggregated fields type to the context-aware interfaces:

  private final ResultRetryAlgorithmWithContext<ResponseT> resultAlgorithm;
  private final TimedRetryAlgorithmWithContext timedAlgorithm;

A the same time this PR adds the full copy of all the existing methods in RetryAlgorithm but with the context extra parameter now. I.e. this means that this class can be used eitehr as context-aware one or as context-unaware, but not in both different context. WDYT about instead of adding the new Ignore* classes, just make RetryAlgorithm just have both types of algorithms aggregated:

  private final ResultRetryAlgorithm<ResponseT> resultAlgorithm;
  private final TimedRetryAlgorithm timedAlgorithm;
  private final ResultRetryAlgorithmWithContext<ResponseT> resultAlgorithmWithContext;
  private final TimedRetryAlgorithmWithContext timedAlgorithmWithContext;

And use each of the algorithms accordingly depending in which method of REtryAlgorithm class it is used (context-aware or not).

In general, I'm just trying to avoid having classes like IgnoreRetryingContextReulstRetryAlgorithm because they are obviously a hack (even check the name, it is too long an unreadable). So here I'm trying to trade it to simply having context-aware fields as null for older constructor, and it is ok to throw an exception if context-aware methods are used for RetryAlgorithm initialized in legacy way.

Also, as you suggest in your comments, please go ahead and deprecate the old constructor for RetryAlgorithm and relevant stuff (maybe even the old context-unaware methods as well).

* @param previousResponse response returned by the previous attempt
* @param previousSettings previous attempt settings
*/
public TimedAttemptSettings createNextAttempt(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use access modifiers (public) on interface methods, as they are redundant.

Throwable previousThrowable,
ResponseT previousResponse,
TimedAttemptSettings previousSettings) {
return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you call createNextAttempt() (no context one) here for consistency with the shouldRetry implementation?

extends ResultRetryAlgorithm<ResponseT> {

/**
* Creates a next attempt {@link TimedAttemptSettings}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional (but please consider doing it). It seems like a lot of method duplication is happening here (context-aware vs context-unaware methods). Please consider replacing documentation for every context-unaware method in all of the classes with a link to the context-aware one with the explanation, that this one (context-unaware one) is doing the same but without the context. This will not only can make it shorter in terms of docs, but will also provide a clear picture to the users of these classes why there are both types of methods (which may be really confusing).

Also, please consider deprecating the context-unaware methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I've updated the documentation of the context-unaware interfaces and their methods to reference the context-aware versions.
  2. I've deprecated all the context-unaware methods

* <p>The attempt settings will be reset if the stream attempt produced any messages.
*/
@Override
public TimedAttemptSettings createNextAttempt(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make these methods (context-aware and context-unaware methods in streaming algorithm) call each other, like it is done in the non-streaming versions? This is to reduce code dupliation and make the implementation more robust in general.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (this is now possible as the context-aware methods accept null values for the context)

* StatusCode.Code.DEADLINE_EXCEEDED));
* }</pre>
*/
ApiCallContext withRetrySettings(RetrySettings retrySettings);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we mark these new methods as BetaApi. This an addition to a fundamental interface, whatever we add here we will have to stick with for a really long time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@vam-google vam-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but please cleanup the deprecation stuff (more details in the comment). Sorry for confusing with it.

* thrown instead
* @param nextAttemptSettings attempt settings, which will be used for the next attempt, if
* accepted
* @throws CancellationException if the retrying process should be canceled
* @return {@code true} if another attempt should be made, or {@code false} otherwise
*/
public boolean shouldRetry(
Throwable prevThrowable, ResponseT prevResponse, TimedAttemptSettings nextAttemptSettings)
Throwable previousThrowable,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for confusing with my @Deprecate request. I think we should just mark the methods and constructors in RetryAlgorithm class as deprecated (the old ones) and make sure that the non-deprecated methos (new ones) are called by the deprecated methods (which is what is happening now, so it is all good), otherwise we will be using our own deprecated methods to implement non-deprecated methods, which is wrong.

For the rest of it, please undeprecate the methods back, as is is clear that RetryAlgorithm cannot be used in both contexts (as context-aware and as unaware), but the actual interfaces (timed and result) kind of still potentially can have meaningful implementation for all the methods, but most importantly it is too much deprecation happening on interface level, I'm afraid we are going to get too many deprecaton warnings.

So to summarize: deprecate the old constructor and methods in RetryAlgorithm class, undeprecate the other stuff (the TimedRetryAlgorithm and ResultRetryAlgorithm methods - in both places, the @Deprecated annotation and @deprecated java doc element, maybe adding in the general java doc section of each element a notion that the other method should be preferred, but it does not mean it is deprecated in a strict sense of it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@noahdietz
Copy link
Contributor

@olavloite heads up #1328 we are wrapping all RPC callables in the retry machinery which should enable the use of this feature on non-retryable RPCs once merged.

Note: #1328 has the same numbers as this PR 1238, how about that! :)

@noahdietz
Copy link
Contributor

@miraleung can you review/approve this PR again? You requested changes at one point.

@olavloite
Copy link
Contributor Author

@olavloite heads up #1328 we are wrapping all RPC callables in the retry machinery which should enable the use of this feature on non-retryable RPCs once merged.

Note: #1328 has the same numbers as this PR 1238, how about that! :)

Cool (both the feature and the numbers) :-)

@noahdietz
Copy link
Contributor

noahdietz commented Mar 17, 2021

@olavloite heads up #1328 we are wrapping all RPC callables in the retry machinery which should enable the use of this feature on non-retryable RPCs once merged.
Note: #1328 has the same numbers as this PR 1238, how about that! :)

Cool (both the feature and the numbers) :-)

BTW @olavloite if this is good to go, feel free to merge this - no need to wait on the callable wrapping change I mentioned.

Copy link
Contributor

@miraleung miraleung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, please wait for others to LGTM as well.

Copy link
Contributor

@noahdietz noahdietz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM this is really awesome, Knut

@noahdietz noahdietz removed the request for review from igorbernstein2 March 17, 2021 20:52
@noahdietz
Copy link
Contributor

Removed @igorbernstein2 because we had thorough reviews from others.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FR: provide the ability to set per-request RetrySettings
5 participants