Routing: Adds Parallel Request Hedging #4198
...re.Cosmos/src/Routing/AvailabilityStrategy/CrossRegionParallelHedgingAvailabilityStrategy.cs
    CancellationTokenSource cancellationTokenSource)
{
    RequestMessage clonedRequest;
    using (clonedRequest = request.Clone(request.Trace.Parent))
Thought: a simpler model might be to also support an include-region concept.
That way the caller here can set exactly one region.
The list creations are expensive.
We discussed this in Java as well. The reason we only allowed excludedRegions is that when you allow the caller to specify an arbitrary region, you need to come up with a new error for the case where that region is no longer valid. At least for the public surface area, I still think that was a good idea. If some internal API really helps with perf, that is an option. But I would get a CPU profile before making that change.
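To make the trade-off in this thread concrete, here is a rough sketch of how a caller could pin a request to a single region when only an exclude list is available. `RegionTargetingSketch` and `ExcludeAllBut` are hypothetical names used for illustration and are not part of the SDK. The sketch allocates one list per hedged request, which is the cost raised above, and its `ArgumentException` branch is the "region no longer valid" error case that an include-region API would have to define.

```csharp
using System;
using System.Collections.Generic;

public static class RegionTargetingSketch
{
    // Hypothetical helper: returns every available region except the target,
    // so that an exclude-only option ends up routing to exactly one region.
    public static List<string> ExcludeAllBut(
        IReadOnlyList<string> availableRegions,
        string targetRegion)
    {
        // One allocation per call; this is the per-request list cost.
        List<string> excluded = new List<string>(availableRegions.Count);
        bool found = false;

        foreach (string region in availableRegions)
        {
            if (string.Equals(region, targetRegion, StringComparison.OrdinalIgnoreCase))
            {
                found = true;
                continue;
            }

            excluded.Add(region);
        }

        if (!found)
        {
            // The extra failure mode an explicit "include region" would introduce.
            throw new ArgumentException(
                "Target region is not in the available regions.", nameof(targetRegion));
        }

        return excluded;
    }
}
```

For example, with available regions `East US`, `West US`, and `North Europe`, targeting `West US` yields an exclude list of `East US` and `North Europe`.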
Pull Request Template
Description
See issue #3782.
The goal of this PR is to introduce parallel request hedging and availability strategies to the .NET SDK. Users will be able to create an availability strategy with a threshold, which specifies when request hedging triggers, and a step time, which specifies how often parallel requests are sent out after the availability strategy triggers. If the step is 0, only one parallel request will be sent out. If set to `TimeSpan.Zero`, then all parallel requests will be sent out simultaneously.
Design
Sending out parallel requests will be done in the `RequestInvokerHandler`. Before a request is first sent out, we check whether it can be sent with parallel request hedging. Currently, only document read requests can use this feature. Next, the request is cloned, and parallel requests are routed to all available read regions by setting the exclude regions property in the request options. Finally, all requests are sent out with the appropriate delay, and once the SDK receives a response, all in-flight parallel requests are canceled.
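The flow described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `HedgingSketch`, `SendWithHedgingAsync`, and the `sendToRegion` delegate are hypothetical names, and the delegate stands in for "clone the request and route it to one region". The sketch assumes the first request goes to the first region, the second is sent after `threshold`, and each later one after a further `step`. It also treats the first task to complete as the winner even if it failed, which a real implementation would likely refine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class HedgingSketch
{
    // Hypothetical helper, not an SDK API.
    public static async Task<T> SendWithHedgingAsync<T>(
        IReadOnlyList<string> regions,
        Func<string, CancellationToken, Task<T>> sendToRegion,
        TimeSpan threshold,
        TimeSpan step,
        CancellationToken cancellationToken = default)
    {
        if (regions == null || regions.Count == 0)
        {
            throw new ArgumentException("At least one region is required.", nameof(regions));
        }

        // Linked source so that one Cancel() stops every in-flight request.
        using CancellationTokenSource cts =
            CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);

        List<Task<T>> inFlight = new List<Task<T>>();

        for (int i = 0; i < regions.Count - 1; i++)
        {
            inFlight.Add(sendToRegion(regions[i], cts.Token));

            // Wait for a response or for the next hedging deadline:
            // the threshold before the first hedge, the step afterwards.
            Task delay = Task.Delay(i == 0 ? threshold : step, cts.Token);
            Task first = await Task.WhenAny(inFlight.Cast<Task>().Append(delay));

            if (!ReferenceEquals(first, delay))
            {
                cts.Cancel(); // cancel the remaining in-flight requests
                return await (Task<T>)first;
            }
        }

        // Last region: nothing left to hedge to, so wait for any response.
        inFlight.Add(sendToRegion(regions[regions.Count - 1], cts.Token));
        Task<T> winner = await Task.WhenAny(inFlight);
        cts.Cancel();
        return await winner;
    }
}
```

With a 500ms threshold and a 100ms step over three regions, this sketch would send the second request at roughly 500ms and the third at roughly 600ms, unless a response arrives first.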
Parallel Hedging
When building a new `CosmosClient`, there will be an option to include parallel hedging in that client.
or
The example above will create a `CosmosClient` instance with an `AvailabilityStrategy` enabled with a 500ms threshold. This means that if a request takes longer than 500ms, the SDK will send a new request to the backend, following the order of the Preferred Regions List. If still no response comes back after the step time, another parallel request will be made to the next region. The SDK will then return the first response that comes back from the backend. The threshold parameter is required and can be set to any value greater than 0. There will also be options to specify all options for the `AvailabilityStrategyOptions` object at request level, and to enable or disable the strategy at request level. If no client-level `AvailabilityStrategy` is set, adding `AvailabilityStrategyOptions` to the request options will allow the request to use an `AvailabilityStrategy`.
Override `AvailabilityStrategy`:
Disabling availability strategy: