Speculative Processing
Background
[ASK] Always on cross-region hedging: ~500ms possible hedging threshold (teams will tune)
[ASK] Ability to opt-in/out of request hedging feature
[ASK] Parallel hedging: Don't cancel the in-flight request; use whichever request finishes first
[ASK] Region choice at request level (ability to share connections)
[ASK] Reflection-based approach (fully supported)
[ASK] Available on both Preview and GA packages
Design
Initially, this will be delivered with a reflection-based approach for Teams only, with the goal of moving to Preview and GA packages later this year. This will be an opt-in feature. Initially it will cover read requests only. Once retryable writes are implemented, the feature will be revisited to see whether adding writes is viable.
Threshold Mode (Parallel Request Hedging)
Here, once a threshold is reached, a parallel request is sent to a different region. The request that completes first is used, and all other parallel requests are canceled. The threshold time should be settable by the user. It is set as part of the client options, and hedging is always on if it is set.
flowchart TD
A[Request] -->B(Base Request with timeout policy)
B --> Threshold{timespent < speculativeThreshold}
Threshold -->| Yes | C{success <= EndToEndTimeout}
Threshold --> | No | RemoteProcessing[Start Parallel remote requests to other regions]
C -->|Yes| D[Return result]
C -->|No response| E[Cancel and timeout]
C --> |error response| F[Retry Flow]
F --> G{timespent < EndToEndTimeout}
G --> |No| E
G --> |Yes| F
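The threshold flow can be sketched in a few lines of Python with asyncio. This is a minimal illustration under stated assumptions, not SDK code: `hedged_read`, the `send` callback, and the region names are hypothetical, only one extra region is tried, and an error from the first completed request is raised to the caller rather than entering the retry flow.

```python
import asyncio
from typing import Awaitable, Callable, Sequence

# Hypothetical callback: reads from one region and returns the response body.
ReadFn = Callable[[str], Awaitable[str]]


async def hedged_read(send: ReadFn, regions: Sequence[str],
                      threshold_s: float, e2e_timeout_s: float) -> str:
    """Send to regions[0]; if it has not answered within threshold_s,
    also send to regions[1]. Return whichever finishes first."""

    async def run() -> str:
        tasks = [asyncio.create_task(send(regions[0]))]
        try:
            # Give the base request a head start of threshold_s.
            done, _ = await asyncio.wait(tasks, timeout=threshold_s)
            if not done and len(regions) > 1:
                # Threshold reached: start the parallel request.
                tasks.append(asyncio.create_task(send(regions[1])))
            done, _ = await asyncio.wait(
                tasks, return_when=asyncio.FIRST_COMPLETED)
            return next(iter(done)).result()
        finally:
            # Cancel whatever is still in flight (no-op for finished tasks).
            for task in tasks:
                task.cancel()

    # The end-to-end timeout bounds the whole operation.
    return await asyncio.wait_for(run(), timeout=e2e_timeout_s)
```

With a slow first region and a threshold shorter than its latency, the call returns the second region's response; with a threshold longer than the first region's latency, no parallel request is sent at all.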
Dynamic Preferred Regions
The Preferred Regions list should be mutable; otherwise the client must be re-created during outages. The Preferred Regions list should be recreated when a parallel request to a different region completes before the request to the default region.
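One possible reading of "recreated" is that the winning region moves to the front of the list while the remaining order is preserved. The sketch below assumes that interpretation; `promote_region` is a hypothetical helper, not an SDK API.

```python
from typing import List, Sequence


def promote_region(preferred: Sequence[str], winner: str) -> List[str]:
    """Return a new preferred-regions list with `winner` first.

    The input list is left untouched so that readers holding the old
    list are not affected while the client swaps in the new one.
    """
    return [winner] + [region for region in preferred if region != winner]
```

For example, `promote_region(["East US", "West US", "North Europe"], "West US")` yields `["West US", "East US", "North Europe"]`.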
Region choice at request level
As part of the RequestOptions, a region can be set by both the customer (Cx) and the Request Hedging feature.
Implementation
New Files
End To End latency policy
Set threshold and opt in/out
Test Files
Updated Files
RequestOptions: Specify region
ClientOptions: mutable regions
Document client options
ClientInitialization
ReplicatedResourceClient
Global Endpoint Manager
LocationCache
ClientRetryPolicy
In ReplicatedResourceClient:
flowchart TD
A[Request] -->B{Check to see if we should be speculating}
B -->|Yes| C(Get Store response)
B --> |No | D[Return result]
C --> E[For every available read endpoint clone the request and route request to new endpoint]
E --> F[Delay each cloned request by threshold]
F --> G[Use first response and cancel all other requests]
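The fan-out step can be sketched as follows. It assumes that "delay each cloned request by threshold" means the clone for the i-th endpoint starts after i times the threshold, so the first endpoint starts immediately; that interpretation, the `speculate` name, and the `send` callback are all assumptions made for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def speculate(send: Callable[[str], Awaitable[str]],
                    endpoints: Sequence[str], threshold_s: float) -> str:
    """Route one clone of the request to every read endpoint, staggered
    by threshold_s, and return the first response."""
    if not endpoints:
        raise ValueError("at least one read endpoint is required")

    async def delayed(index: int, endpoint: str) -> str:
        # Clone i waits i * threshold_s before it is actually sent.
        await asyncio.sleep(index * threshold_s)
        return await send(endpoint)

    tasks = [asyncio.create_task(delayed(i, ep))
             for i, ep in enumerate(endpoints)]
    try:
        done, _ = await asyncio.wait(
            tasks, return_when=asyncio.FIRST_COMPLETED)
        return next(iter(done)).result()
    finally:
        # Clones still waiting out their delay are cancelled before sending.
        for task in tasks:
            task.cancel()
```

Because the delay happens before the send, a fast response from the first endpoint means the other clones are never sent, which keeps the extra RU cost limited to requests that actually cross the threshold.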
Do the same in CosmosHttpClientCore for gateway mode.
Testing
Testing will be done using the FaultInjection library. To test, create a FaultInjectionRule; this rule should be a FaultInjectionServerErrorRule with a RESPONSE_DELAY type. This will allow all code paths to be tested.
Test Scenarios
Default behavior with Speculative Processing turned off.
Speculative Processing turned on with default mode.
Speculative Processing turned on with Threshold mode.
Additional Work
After the initial work is complete, there are two areas where additional work can be done to improve the feature.
Thompson Sampling
Adding a Thompson Sampling mode would be a logical next step for the feature. Thompson Sampling is a probabilistic algorithm that builds a probability model from the observed latency of each region. This should give a more accurate estimate of the best region than a mean-based model, and it also provides a level of confidence in which region is the best to route to. The model improves over time as more latencies are observed. By using a Thompson Sampling based model we would hope to have even better latency with threshold mode. This would likely come at the cost of RUs.
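To make the idea concrete, here is a small sketch of Thompson Sampling over region latency. It assumes a Gaussian model of each region's mean latency with a known noise level; the class name, the prior, and the noise value are illustrative choices, not part of the design above.

```python
import math
import random


class RegionLatencyModel:
    """Gaussian Thompson Sampling over per-region mean latency (ms)."""

    def __init__(self, regions, prior_mean_ms=100.0, prior_std_ms=1000.0,
                 noise_std_ms=50.0, rng=None):
        # Assumed prior and noise level; a real model would tune these.
        self._prior_mean = prior_mean_ms
        self._prior_precision = 1.0 / prior_std_ms ** 2
        self._noise_precision = 1.0 / noise_std_ms ** 2
        self._count = {region: 0 for region in regions}
        self._total = {region: 0.0 for region in regions}
        self._rng = rng or random.Random()

    def observe(self, region, latency_ms):
        """Record one observed latency for a region."""
        self._count[region] += 1
        self._total[region] += latency_ms

    def posterior(self, region):
        """Posterior (mean, std) of the region's mean latency."""
        precision = (self._prior_precision
                     + self._count[region] * self._noise_precision)
        mean = (self._prior_mean * self._prior_precision
                + self._total[region] * self._noise_precision) / precision
        return mean, math.sqrt(1.0 / precision)

    def choose(self):
        """Draw one sample per region and pick the lowest draw."""
        draws = {region: self._rng.gauss(*self.posterior(region))
                 for region in self._count}
        return min(draws, key=draws.get)
```

The posterior standard deviation shrinks as observations accumulate, which is the "level of confidence" in the estimate. Regions with few observations keep a wide posterior and are still sampled occasionally; that exploration is one reason the approach may cost extra RUs.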
Samples and Metrics
Adding a sample library that shows how to use this feature, along with metrics showing the performance benefits of each mode, would be a great addition to the feature. It would show potential customers how this feature could benefit their applications and could help with onboarding additional customers. Some metrics/figures that could be provided would be:
Latency vs Time (and showing where the latency is injected to a region)
RU cost vs Time
Latency to each region vs Time + What region the SDK is sending requests to
P99/95/75 Latency for each mode with constant injection of delay on local region
This sample library would also take advantage of the FaultInjectionLibrary.
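For the P99/95/75 figures, a nearest-rank percentile over the recorded latencies is one simple option. The helper below is a sketch; the name `percentile` and the choice of the nearest-rank definition are assumptions, and other percentile definitions would give slightly different values on small samples.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]
```

Computing `percentile(latencies, 99)`, `percentile(latencies, 95)`, and `percentile(latencies, 75)` for each mode would give the three figures listed above.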