The SDK can't auto detect the consistency level change during runtime #4442

Open
blankor1 opened this issue Apr 19, 2024 · 6 comments

@blankor1

blankor1 commented Apr 19, 2024

Describe the bug
If the database account's consistency level changes while the SDK client is running, the SDK does not detect the change or apply it at runtime.

To Reproduce
For example, the database account's consistency level is Session and the client is running.
Change the account's consistency level to Strong.

Expected behavior
Without stopping and restarting the client, the RU consumption for read requests should double and the write latency should increase, which would indicate that Strong consistency is being applied.

Actual behavior
The RU consumption of read requests and the latency of write requests stay the same as before, which indicates that the client is still working with Session consistency.

The client applies the new consistency level only after I stop it and restart the application.

Environment summary
SDK Version: present in all versions, including the latest (3.39)
OS Version: Windows


@blankor1
Author

I just noticed that this behavior is documented...

In that case, I would like to turn this into a feature request. I see an existing feature request related to this case: #2117

Thanks!

@ealsur
Member

ealsur commented Apr 19, 2024

Thanks for the feedback @blankor1.

Can you please explain in which scenario the consistency would be changed without the application being restarted?

If you are changing from Strong to Session, you need changes in the application; the same applies when going from Session to Strong or Bounded Staleness.

Going from Strong to Bounded Staleness also might require code changes.

In all cases, for the changes to be deployed, you need to restart the application.

@blankor1
Author

For us, in many cases, this feature would make the SDK more convenient to use. As I understand it, only changes that involve Session consistency and honoring the session token require code changes. The other cases do not seem to need code changes (please correct me if I'm wrong):

  1. Change Bounded Staleness to Strong, to guarantee Read Your Writes in all regions
  2. Change Session to Strong, to guarantee Read Your Writes in all regions
  3. Lower the consistency level again after temporarily raising it.

I understand that changing from another consistency level to Session may need code changes, since the application then has to honor the session token to guarantee Read Your Writes. But could you explain what kind of code change is needed when going from Session to Strong or Bounded Staleness, and from Strong to Bounded Staleness? Or point me to any doc for reference?

@ealsur
Member

ealsur commented Apr 22, 2024

> Change Bounded Staleness to Strong, to guarantee Read Your Writes in all regions

Strong and Bounded Staleness don't differ on the write path, but they do on the read path. With Strong, a read returns data that is already committed in all regions. With Bounded Staleness it may not, because there is a staleness window, so your application needs to address that scenario somehow: you could attempt to read a document and find that it does not exist yet, or receive a stale version. Presumably you had code around these expectations. With Strong, this should not happen, and if you get a 404, then that document doesn't exist. So can the code handling these responses really stay the same?

> Change Session to Strong, to guarantee Read Your Writes in all regions

For Session, you need sticky routing, or you need to pass the session token along or maintain it for the user in some way. If there is no Session Token (maybe the user has not done a Write yet), you could get 404s or old data, because the consistency is then effectively Eventual. None of that code is needed for Strong, so shouldn't it be removed?
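The role of the session token described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `Replica`, its `apply` and `read` methods, and the use of a plain integer LSN as the token are hypothetical and are not the SDK's API or its actual token format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Replica:
    """Toy replica (hypothetical): documents plus the last LSN it has applied."""
    lsn: int = 0
    docs: dict = field(default_factory=dict)

    def apply(self, lsn: int, key: str, value: str) -> None:
        self.docs[key] = value
        self.lsn = lsn

    def read(self, key: str, session_lsn: Optional[int] = None) -> Optional[str]:
        # With a session token, a replica that is behind the token must not answer.
        if session_lsn is not None and self.lsn < session_lsn:
            raise LookupError("replica is behind the session token")
        return self.docs.get(key)  # None stands in for a 404


primary, lagging = Replica(), Replica()
primary.apply(1, "order-1", "created")
session_token = 1  # assumption: the write hands back its LSN as the token

# Without a token the lagging replica answers "not found" (eventual behavior).
stale = lagging.read("order-1")

# With the token the lagging replica is rejected and the caller reads elsewhere.
try:
    lagging.read("order-1", session_token)
    rejected = False
except LookupError:
    rejected = True
fresh = primary.read("order-1", session_token)
```

In this model the token bookkeeping is what gives read-your-writes. Under Strong, the service provides that guarantee itself, which is why the bookkeeping becomes redundant.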

> Lower the consistency level again after temporarily raising it.

Not sure I understand this scenario. Do you mean flipping the consistency temporarily? During that time, however long it is, your app might be broken unless you change the code (as in any of the previous examples). If you want to downgrade consistency as an opt-in, you can do it at the request level through the RequestOptions.
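The request-level opt-in can be pictured with a short sketch. It assumes the documented Cosmos DB rule that a request may only relax the account's default consistency, never strengthen it. The function `effective_consistency` and the `LEVELS` list are hypothetical illustrations, not SDK calls.

```python
# Strongest to weakest, in the order used by the Cosmos DB documentation.
LEVELS = ["Strong", "BoundedStaleness", "Session", "ConsistentPrefix", "Eventual"]


def effective_consistency(account_level, request_level=None):
    """Return the level a request would run at in this toy model (hypothetical helper)."""
    if request_level is None:
        return account_level  # no override: use the account default
    if LEVELS.index(request_level) < LEVELS.index(account_level):
        # Assumption: asking for something stronger than the account default is rejected.
        raise ValueError(
            f"cannot raise consistency from {account_level} to {request_level} per request"
        )
    return request_level
```

For example, on a Strong account a single read could opt into Session, while a request on a Session account could not opt into Strong. The second case is the one this issue is about.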

@blankor1
Author

blankor1 commented Apr 23, 2024

Thank you for the explanation! Then I guess part of the point of requiring an app restart is to remind users that a code change may be needed.

I still have some questions about raising the consistency level:
When changing from Session to Strong or Bounded Staleness, is there any drawback if we don't remove the session token handling? In my tests, the session token is always returned for every consistency level. Does it only have an effect under Session consistency? (For example, if I keep the code that passes the session token and raise the consistency level to Bounded Staleness or Strong, will that make any difference compared to removing that code?)

If I'm using App Service with Session Affinity enabled, which guarantees sticky routing (so I don't need to pass the session token), will changing from Session to Strong make no difference at the code level?
In that case, if a read of my previous write goes to a read region other than the write region, is it still guaranteed to return the latest written version without using the session token?

@ealsur
Member

ealsur commented Apr 23, 2024

I think this thread has great questions, I'll see if I can push some of them to the actual docs.

The service can return the SessionToken because it's part of the default response, and I agree that in some scenarios a restart is less necessary than in others. For example, compare someone running a single instance of the app on a single machine, with no other supporting architecture, on a single-region account, with someone using a Redis cache to store user Session Tokens, load balancing requests across a farm of machines, and deploying across multiple regions with local reads. The second setup will require more adjustments than the first. In general, most apps will require some form of adjustment, so documentation tends to generalize, I guess. In the case of the SDK, the ROI of wiring this change through all the required places only to cover a small subset of scenarios might not be worth it (which was the discussion on the other linked issue).

> if a read of my previous write goes to a read region other than the write region, is it still guaranteed to return the latest written version without using the session token?

On Strong, the Write and Read do what is called a Quorum check (see https://github.com/Azure/azure-cosmos-dotnet-v3/blob/master/docs/SdkDesign.md#consistency-direct-mode). When a write is done, it verifies that the globally committed LSN (logical sequence number, a transaction identifier) matches, meaning all regions have it, and this can affect latency. Depending on your app's expectations (for example, a CancellationToken or some timeout mechanism expecting X latency), some code might need to change. On reads, 2 reads are issued in parallel and the check verifies that the Global committed LSN matches. So you should be able to read the write regardless of which region you are reading from, without the need for sticky routing (which also allows better load balancing).
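The quorum idea described above can be sketched as a toy model. It is a deliberate simplification and not the SDK's implementation: `ReplicaResponse`, `write_is_globally_committed`, and `strong_read` are hypothetical names, and retries and barrier requests are not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplicaResponse:
    """Hypothetical shape of what one replica reports back."""
    value: str
    lsn: int                   # latest LSN this replica has applied
    global_committed_lsn: int  # highest LSN it knows to be committed in all regions


def write_is_globally_committed(write_lsn: int, region_lsns: dict) -> bool:
    # A write counts as globally committed once every region has reached its LSN.
    return min(region_lsns.values()) >= write_lsn


def strong_read(a: ReplicaResponse, b: ReplicaResponse) -> Optional[str]:
    # Two reads are compared; the value is accepted only if both agree on the LSN
    # and that LSN is already globally committed. Otherwise the caller must retry.
    if a.lsn == b.lsn and a.global_committed_lsn >= a.lsn:
        return a.value
    return None


# The write at LSN 5 is not acknowledged while one region is still at LSN 4.
pending = write_is_globally_committed(5, {"west": 5, "east": 4})
done = write_is_globally_committed(5, {"west": 5, "east": 5})

agreed = strong_read(ReplicaResponse("v2", 5, 5), ReplicaResponse("v2", 5, 5))
disagreed = strong_read(ReplicaResponse("v2", 5, 5), ReplicaResponse("v1", 4, 4))
```

Waiting for the slowest region is where the extra write latency in this model comes from, which is the latency effect mentioned in the comment above.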

As a side note, on a Session account, if you are using sticky routing then you don't need to pass the SessionToken, but if you are not using sticky routing, you should.
