[Internal] Ignore: Some thoughts on hedging #4471
…github.com/Azure/azure-cosmos-dotnet-v3 into users/nalutripician/parallelHedgingPreview
Co-authored-by: Matias Quaranta <ealsur@users.noreply.github.com>
Please follow the required format: "[Internal] Category: (Adds|Fixes|Refactors|Removes) Description"
Internal should be used for PRs that have no customer impact. This flag is used to help generate the changelog to know which PRs should be included. Examples:
Diagnostics: Adds GetElapsedClientLatency to CosmosDiagnostics
PartitionKey: Fixes null reference when using default(PartitionKey)
[v4] Client Encryption: Refactors code to external project
[Internal] Query: Adds code generator for CosmosNumbers for easy additions in the future.
using (CancellationTokenSource cancellationTokenSource = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken))
{
    // Get effective order of regions to route to (static once populated)
    IReadOnlyCollection<Uri> availableRegions = client.DocumentClient.GlobalEndpointManager.GetApplicableEndpoints(request.RequestOptions.ExcludeRegions, isReadRequest: true);
I think this is better than getting the currently available regions, for the scenario where an offline region becomes available again. The caveat is that if a region is not available, a request will still be sent to it, but since it will also hedge to other regions this is not much of a problem.
There are two side effects:
- A region that becomes available in the future might be excluded
- Possibly higher latency if a region that later becomes unavailable was included
Both issues were present even with the earlier model (maybe the first full request covers it?)
// Send out hedged requests
for (int requestNumber = 0; requestNumber < availableRegions.Count; requestNumber++)
{
    TimeSpan awaitTime = this.Threshold + TimeSpan.FromMilliseconds(requestNumber * this.ThresholdStep.Milliseconds);
I don't think this gives the right time spans, except for the first await.
Request 0: await threshold before sending the next request -- correct
Request 1: await threshold + step before sending the next request -- this waits too long. By this point the threshold has already elapsed, so it should wait only the threshold step.
Does this make sense? The time should really be: TimeSpan awaitTime = requestNumber == 0 ? this.Threshold : this.ThresholdStep;
This is because the WhenAny call is awaited, and it completes when the Task.Delay is done (or when a request completes).
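To make the difference concrete, here is a rough sketch that computes when each hedged request would go out under the two formulas. The 100 ms threshold, the 50 ms step, and the SendOffsets helper are assumptions for illustration, not values or code from the PR, and the sketch assumes no request completes early so every delay runs to completion.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical settings, chosen only for illustration.
TimeSpan threshold = TimeSpan.FromMilliseconds(100);
TimeSpan thresholdStep = TimeSpan.FromMilliseconds(50);

// Returns the offsets (ms from start) at which each request is sent, given the
// delay awaited after each send. Assumes every delay runs to completion.
static List<double> SendOffsets(int requestCount, Func<int, TimeSpan> awaitTime)
{
    List<double> offsets = new List<double>();
    TimeSpan elapsed = TimeSpan.Zero;
    for (int requestNumber = 0; requestNumber < requestCount; requestNumber++)
    {
        offsets.Add(elapsed.TotalMilliseconds);
        elapsed += awaitTime(requestNumber); // waited before the next send
    }
    return offsets;
}

// Formula as written in the diff. TotalMilliseconds is used here; the diff reads
// .Milliseconds, which returns only the 0-999 ms component of the TimeSpan.
List<double> asWritten = SendOffsets(
    4, n => threshold + TimeSpan.FromMilliseconds(n * thresholdStep.TotalMilliseconds));

// Formula proposed in the review comment.
List<double> proposed = SendOffsets(4, n => n == 0 ? threshold : thresholdStep);

Console.WriteLine(string.Join(", ", asWritten)); // 0, 100, 250, 450
Console.WriteLine(string.Join(", ", proposed));  // 0, 100, 150, 200
```

With the formula as written, the gap between sends grows each round (100, 150, 200 ms); with the proposed one, every send after the first is one step apart.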
I just reused the existing logic from your PR. I have the same question, so feel free to update as needed.
{
    clonedRequest.RequestOptions ??= new RequestOptions();

    clonedRequest.RequestOptions.ExcludeRegions = null;
Do we want to ignore the original exclude regions list here if it is provided? Also, by routing with the location endpoint rather than exclude regions, we would be allowing cross-regional retries on the hedged requests. Is this something we want to do? This behavior is different from Java.
Exclude regions are already considered when the initial list is populated.
Here we are targeting a single region using LocationEndpointToRoute.
requestTasks.Remove(completedTask);

(bool isNonTransient, responseMessage) = await (Task<(bool, ResponseMessage)>)completedTask;
if (isNonTransient)
Here I think we can do if(isNonTransient || requestTasks.Count == 1)
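As a rough illustration of the intent (return the first non-transient response, but fall back to the last response if every region returned a transient one), a simplified loop might look like the sketch below. FirstUsableAsync and the string responses are hypothetical stand-ins. The sketch assumes the list holds only request tasks, so the fallback check after Remove is Count == 0; whether the PR needs 0 or 1 depends on what else the list holds, which the excerpt does not show.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: each task yields (isNonTransient, response).
// Returns the first non-transient response, or the last response to
// complete if all of them were transient.
static async Task<string> FirstUsableAsync(List<Task<(bool, string)>> requestTasks)
{
    string response = string.Empty;
    while (requestTasks.Count > 0)
    {
        Task<(bool, string)> completedTask = await Task.WhenAny(requestTasks);
        requestTasks.Remove(completedTask);

        bool isNonTransient;
        (isNonTransient, response) = await completedTask;

        // Stop on a usable response, or when nothing else is in flight.
        if (isNonTransient || requestTasks.Count == 0)
        {
            return response;
        }
    }
    return response;
}

// Every region returns a transient error: the caller still gets a response.
string lastResort = await FirstUsableAsync(new List<Task<(bool, string)>>
{
    Task.FromResult((false, "503 from region A")),
    Task.FromResult((false, "503 from region B")),
});

// One region returns a non-transient response: that one wins.
string usable = await FirstUsableAsync(new List<Task<(bool, string)>>
{
    Task.FromResult((false, "503 from region A")),
    Task.FromResult((true, "200 from region B")),
});

Console.WriteLine(lastResort);
Console.WriteLine(usable);
```

Without the extra condition, the all-transient case would leave the loop with no response returned from inside it, which is the gap the comment points at.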
TimeSpan awaitTime = this.Threshold + TimeSpan.FromMilliseconds(requestNumber * this.ThresholdStep.Milliseconds);
Task thresholdDelayTask = Task.Delay(awaitTime, cancellationToken);

using (RequestMessage clonedRequest = (requestNumber == 0) ? request : request.Clone(request.Trace.Parent))
I think the cloned message has to be created in a separate helper method; otherwise the message will be disposed once execution moves outside the for loop.
Great point. This needs attention.
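The disposal concern can be reproduced in isolation. In the sketch below, MemoryStream stands in for RequestMessage (its CanRead property turns false once disposed), and SendAsync and SendOwnedAsync are hypothetical helpers, not SDK methods. It compares a using block scoped to the loop body with a helper method that owns the clone for the full duration of the send.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Simulated send: reports whether the request was still usable
// at the moment the (slow) response arrived.
static async Task<bool> SendAsync(MemoryStream request, Task responseArrived)
{
    await responseArrived;
    return request.CanRead; // false once the stream has been disposed
}

// Alternative shape: the helper creates and owns the request, so it is
// disposed only after the send has finished.
static async Task<bool> SendOwnedAsync(Task responseArrived)
{
    using (MemoryStream request = new MemoryStream())
    {
        return await SendAsync(request, responseArrived);
    }
}

TaskCompletionSource<bool> gate = new TaskCompletionSource<bool>();

List<Task<bool>> inLoopTasks = new List<Task<bool>>();
List<Task<bool>> ownedTasks = new List<Task<bool>>();
for (int requestNumber = 0; requestNumber < 2; requestNumber++)
{
    using (MemoryStream request = new MemoryStream())
    {
        inLoopTasks.Add(SendAsync(request, gate.Task));
    } // request is disposed here while its send is still in flight

    ownedTasks.Add(SendOwnedAsync(gate.Task));
}

gate.SetResult(true); // responses arrive only after the loop has finished

bool[] inLoop = await Task.WhenAll(inLoopTasks);
bool[] owned = await Task.WhenAll(ownedTasks);

Console.WriteLine(string.Join(", ", inLoop)); // False, False
Console.WriteLine(string.Join(", ", owned));  // True, True
```

Moving ownership into the async helper is one option; another would be to collect the clones and dispose them after the winning response is chosen.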