Routing: Adds Parallel Request Hedging #4198

NaluTripician · 2023-11-27T20:50:14Z

Pull Request Template

Description

See issue #3782.

The goal of this PR is to introduce parallel request hedging and availability strategies to the .NET SDK. Users will be able to create an availability strategy with a threshold, which specifies when request hedging triggers, and a step time, which specifies how often further parallel requests are sent out after the availability strategy triggers. If the step is 0, only one parallel request will be sent out. If set to TimeSpan.Zero, all parallel requests will be sent out simultaneously.

Design

Sending out parallel requests will be done in the RequestInvokerHandler. Before a request is sent out, we first check whether it can be sent with parallel request hedging. Currently, only document read requests can use this feature.

Next, the request is cloned, and parallel requests are routed to all available read regions by setting the exclude regions property in the request options. Finally, all requests are sent out with the appropriate delay, and once the SDK receives a response, all in-flight parallel requests are canceled.
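The send-with-delay and cancel-on-first-response steps described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `HedgingSketch`, `SendWithHedgingAsync`, and the `sendToRegion` delegate are hypothetical names, and the sketch assumes the first request goes out immediately, the first hedge after `threshold`, and each further hedge after `step`. It also treats the first task to complete as the winner, even if that task faulted.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of the SDK.
public static class HedgingSketch
{
    public static async Task<T> SendWithHedgingAsync<T>(
        IReadOnlyList<string> regions,
        Func<string, CancellationToken, Task<T>> sendToRegion,
        TimeSpan threshold,
        TimeSpan step,
        CancellationToken cancellationToken = default)
    {
        if (regions == null || regions.Count == 0)
        {
            throw new ArgumentException("At least one region is required.", nameof(regions));
        }

        // Linked source so that either the caller or this method can cancel.
        using CancellationTokenSource cts =
            CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        List<Task<T>> inFlight = new List<Task<T>>();

        try
        {
            for (int i = 0; i < regions.Count; i++)
            {
                inFlight.Add(sendToRegion(regions[i], cts.Token));

                if (i == regions.Count - 1)
                {
                    break; // nothing left to hedge to
                }

                // Wait for the threshold (first hedge) or the step (later hedges),
                // unless one of the in-flight requests finishes first.
                TimeSpan wait = i == 0 ? threshold : step;
                Task delay = Task.Delay(wait, cts.Token);
                List<Task> candidates = new List<Task>(inFlight) { delay };
                Task finished = await Task.WhenAny(candidates);
                if (finished != delay)
                {
                    return await (Task<T>)finished;
                }
            }

            // Every region has been tried; return whichever request finishes first.
            Task<T> first = await Task.WhenAny(inFlight);
            return await first;
        }
        finally
        {
            // Cancel the pending delay and any requests still in flight.
            cts.Cancel();
        }
    }
}
```

With a `step` of `TimeSpan.Zero`, the delay completes immediately, so the remaining requests go out back to back unless one has already finished.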

Parallel Hedging

When building a new CosmosClient, there will be an option to include parallel hedging in that client.

CosmosClient client = new CosmosClientBuilder("connection string")
    .WithAvailabilityStrategy(
        new ParallelHedging(
            threshold: TimeSpan.FromMilliseconds(500)))
    .Build();

or

CosmosClientOptions options = new CosmosClientOptions()
{
    AvailabilityStrategyOptions = new AvailabilityStrategyOptions(
        new ParallelHedging(
            threshold: TimeSpan.FromMilliseconds(500)))
};

CosmosClient client = new CosmosClient(
    accountEndpoint: "account endpoint",
    authKeyOrResourceToken: "auth key or resource token",
    clientOptions: options);

The example above will create a CosmosClient instance with AvailabilityStrategy enabled with a 500ms threshold. This means that if a request takes longer than 500ms, the SDK will send a new request to the backend, following the order of the Preferred Regions List. If still no response comes back after the step time, another parallel request will be made to the next region. The SDK will then return the first response that comes back from the backend. The threshold parameter is required and can be set to any value greater than 0. There will also be options to specify all options for the AvailabilityStrategyOptions object at request level and to enable or disable the strategy at request level. If no client-level AvailabilityStrategy is set, adding AvailabilityStrategyOptions to the request options will allow the request to use an AvailabilityStrategy.

Override AvailabilityStrategy:

RequestOptions requestOptions = new RequestOptions()
{
    AvailabilityStrategyOptions = new AvailabilityStrategyOptions(
        new ParallelHedging(TimeSpan.FromMilliseconds(400)))
};

Disabling availability strategy:

RequestOptions requestOptions = new RequestOptions()
{
    AvailabilityStrategyOptions = new AvailabilityStrategyOptions(new DisableStrategy(), enabled: false)
};
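The precedence between client-level and request-level settings described above can be sketched as a small resolver. This is a simplified model under stated assumptions, not the SDK's types: `StrategySetting` and `StrategyResolver` are hypothetical stand-ins, and the sketch assumes that a request-level setting, when present, fully replaces the client-level one.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for a strategy with its enabled flag.
public sealed record StrategySetting(TimeSpan Threshold, bool Enabled = true);

public static class StrategyResolver
{
    // Returns the strategy that applies to a request, or null when hedging is off.
    public static StrategySetting? Resolve(
        StrategySetting? clientLevel,
        StrategySetting? requestLevel)
    {
        // A request-level setting wins; otherwise fall back to the client-level one.
        StrategySetting? effective = requestLevel ?? clientLevel;

        // A disabled setting (at either level) means no hedging for this request.
        return effective is { Enabled: true } ? effective : null;
    }
}
```

Under this model, a request with no setting inherits the client's 500ms strategy, a request-level 400ms strategy overrides it, and a request-level setting with `Enabled: false` turns hedging off for that request only.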

Type of change

Please delete options that are not relevant.

[] New feature (non-breaking change which adds functionality)
[] This change requires a documentation update

Closing issues

To automatically close an issue: closes #IssueNumber


kirankumarkolli · 2024-05-03T21:54:52Z

...re.Cosmos/src/Routing/AvailabilityStrategy/CrossRegionParallelHedgingAvailabilityStrategy.cs

+            CancellationTokenSource cancellationTokenSource)
+        {
+            RequestMessage clonedRequest;
+            using (clonedRequest = request.Clone(request.Trace.Parent))


Thought: Simpler model might be to also support include region as concept.
That way the caller here can set the exact one region.

The list creations are expensive.

FabianMeiswinkel · 2024-05-03

We discussed this in Java as well. The reason we only allowed excludedRegions is that when you allow the caller to specify an arbitrary region, you need to come up with a new error in case that region is no longer valid. At least for the public surface area, I still think that was a good idea. If some internal API really helps with perf, that is an option. But I would get a CPU profile before making that change.
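To make the trade-off in this thread concrete, here is a minimal sketch of how a single target region can be expressed purely through an exclude list. The helper name `ExcludeAllBut` and the class `ExcludeRegionsSketch` are hypothetical and are not taken from the PR. The sketch allocates one list per hedged request, which is the cost raised in the first comment, and it never has to validate the target region, which is the benefit raised in the reply.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the SDK.
public static class ExcludeRegionsSketch
{
    // Returns every available region except the target, so that a request
    // carrying this exclude list can only be routed to the target region.
    public static List<string> ExcludeAllBut(
        IReadOnlyList<string> availableRegions,
        string targetRegion)
    {
        // One allocation per call, sized up front to avoid list growth.
        List<string> excluded =
            new List<string>(Math.Max(0, availableRegions.Count - 1));

        foreach (string region in availableRegions)
        {
            if (!string.Equals(region, targetRegion, StringComparison.OrdinalIgnoreCase))
            {
                excluded.Add(region);
            }
        }

        return excluded;
    }
}
```

For example, with available regions "East US", "West US", and "North Europe" and a target of "West US", the helper returns a list containing "East US" and "North Europe".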


NaluTripician and others added 14 commits October 26, 2023 15:05

initial commit

ddaa261

fix

e5d809a

Merge branch 'master' into users/nalutripician/parallelHedgingPreview

675a65d

Merge branch 'master' into users/nalutripician/parallelHedgingPreview

560dbfb

document client restore

af63676

document client changes

72abdcb

clientContextCore fix

efd95dd

global endpoint manager fix

4ab0003

pre test changes

289e947

start of tests

833c18e

added dispose for cancellation token source

2a5f904

test changes

564704d

working test

36dd15b

more testing

f022cb3

NaluTripician marked this pull request as ready for review December 28, 2023 18:14

NaluTripician requested a review from khdang as a code owner December 28, 2023 18:14

Merge branch 'master' into users/nalutripician/parallelHedgingPreview

9b4fcce

NaluTripician requested review from sboshra, adityasa, neildsh, kirankumarkolli, ealsur, FabianMeiswinkel and kirillg as code owners December 28, 2023 18:14

NaluTripician changed the title ~~Routing: Adds Parallel Request Hedging in Preview Mode~~ Routing: Adds Parallel Request Hedging Dec 28, 2023

NaluTripician added 3 commits December 28, 2023 13:19

removed unneeded changes

7b87763

Merge branch 'users/nalutripician/parallelHedgingPreview' of https://github.com/Azure/azure-cosmos-dotnet-v3 into users/nalutripician/parallelHedgingPreview

7bdbd7f

revert changes to global endpoint manager (unneeded)

b69bf71

kundadebdatta reviewed Dec 31, 2023

Microsoft.Azure.Cosmos/src/CosmosClientOptions.cs (outdated)

kundadebdatta reviewed Dec 31, 2023

Microsoft.Azure.Cosmos/src/Handler/RequestInvokerHandler.cs (outdated)

NaluTripician and others added 3 commits May 1, 2024 15:06

ALTERNATE METHOD

f0ebcbb

Merge branch 'master' into users/nalutripician/parallelHedgingPreview

0f4b5c6

added readfeed FI operation type to tests

4e72248

kirankumarkolli reviewed May 3, 2024

...re.Cosmos/src/Routing/AvailabilityStrategy/CrossRegionParallelHedgingAvailabilityStrategy.cs (outdated)

kirankumarkolli reviewed May 3, 2024

...re.Cosmos/src/Routing/AvailabilityStrategy/CrossRegionParallelHedgingAvailabilityStrategy.cs (outdated)

kirankumarkolli reviewed May 3, 2024

...re.Cosmos/src/Routing/AvailabilityStrategy/CrossRegionParallelHedgingAvailabilityStrategy.cs (outdated)

kirankumarkolli reviewed May 3, 2024

...re.Cosmos/src/Routing/AvailabilityStrategy/CrossRegionParallelHedgingAvailabilityStrategy.cs (outdated)

NaluTripician added 7 commits May 6, 2024 09:44

requested changes and improvemtns

ff9918d

list optimization

c95a546

Merge branch 'master' into users/nalutripician/parallelHedgingPreview

eecc2c2

fixed edge case diagnostics

4c0750b

Merge branch 'users/nalutripician/parallelHedgingPreview' of https://github.com/Azure/azure-cosmos-dotnet-v3 into users/nalutripician/parallelHedgingPreview

4a7282b

small fix

f1738e0

small fixes

f2512ea

kirankumarkolli reviewed May 8, 2024

...re.Cosmos/src/Routing/AvailabilityStrategy/CrossRegionParallelHedgingAvailabilityStrategy.cs (outdated)

NaluTripician added 8 commits May 8, 2024 14:42

refactor code

8cce6e2

fixed null issues

abc212a

null refrence

07034e2

bug fixes

f49ad9e

Merge branch 'master' into users/nalutripician/parallelHedgingPreview

25d4e25

Merge branch 'master' into users/nalutripician/parallelHedgingPreview

5094543

Merge branch 'master' into users/nalutripician/parallelHedgingPreview

98b8681

Merge branch 'master' into users/nalutripician/parallelHedgingPreview

b7b7349

kirankumarkolli reviewed May 11, 2024

Microsoft.Azure.Cosmos/src/Headers/Headers.cs (outdated)

NaluTripician and others added 5 commits May 13, 2024 13:07

Merge branch 'master' into users/nalutripician/parallelHedgingPreview

182f8b1

changed header clone to internal

14d970f

Merge branch 'users/nalutripician/parallelHedgingPreview' of https://github.com/Azure/azure-cosmos-dotnet-v3 into users/nalutripician/parallelHedgingPreview

98109e0

fixed API doc + test change

dc4f5cb

removed unused method

fa8e88a
