Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose API to controll lifecycle for db specfic operation like set #141

Open
afaruga-atlassian opened this issue Feb 2, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@afaruga-atlassian
Copy link
Contributor

Currently, db-replica implements special logic for Postgres specific commands set. It contains a bug- the connections released back to the pool don't have cleared set operations. E.g set session timeout isn't rolled back, and as a result, it leaks between MAIN and REPLICA connections, which results in different than expected timeouts on connections.

The special logic should be reverted.
The new API should allow customers to define connection lifecycle logic on db-specific operations.
It has to sustain current db-time effectiveness (% of traffic moved from MAIN to REPLICA)
Specific operations should pass them between connections according to whatever db-replica is doing(connection state) to achieve an effective split.
Such operations should be set per dual-connection lifecycle for all connections taken from provider, which have been actually used during this lifecycle.

Workaround:
Currently if you use Postgress set commands, you have to clean set operations while releasing it to the pool.

@afaruga-atlassian afaruga-atlassian added the bug Something isn't working label Feb 2, 2022
@wyrzyk
Copy link
Contributor

wyrzyk commented Feb 2, 2022

The special logic should be reverted.

I'd be careful with that. In our product, the change would cause a drastic efficiency drop and could lead to an incident. In other words, customers may rely on this feature even if it's buggy and never should be shipped.

Exposing a new API is one thing, and deprecating + unshipping buggy feature is another.

I see the whole change as:

  • Shipping the new API.
  • Deprecating the old one.
  • log warns that directs to the new API when the old feature is still used instead of the new API.
  • after some time (the next major release), remove the feature. Some users may ignore the deprecation and the logs. They will see the efficiency drop after the major release. We could keep the warns to make it easier to find the root cause.

@afaruga-atlassian
Copy link
Contributor Author

I totally agree that change should be properly tested, rolled out carefully, and be backward compatible.

It could be reproduced and initially, even an extra API with observability could be added to gather more data about leaked connections and set operations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants