Support for automated failovers #877
Replies: 2 comments 1 reply
-
I do think PgBouncer should have a failover solution, but I'm not sure the one proposed here is the direction we should go. I think @dpirotte has a follow-up PR to #736 in mind, which would implement `target_session_attrs` just like libpq does. This would have a few benefits over the solution suggested here IMHO:
-
Hi folks,
I would like to propose a feature[0] for PgBouncer that allows connecting to a new node if a database in the `databases` section is detected as down. This is especially useful in a single-writer / multiple-reader cluster when the writer is down during a manual or unexpected failover.

In my implementation, PgBouncer first attempts to connect to the configured writer. If the writer is detected as down, a connection is opened to the configured reader (`reader_hostname` in the `pgbouncer` config section). A newly introduced query, `topology_query`, is used to get the hostname of the promoted writer from the `reader_hostname`. If the node we connect to has been promoted to a writer, then the client connection resumes. Otherwise, we disconnect from the reader and connect to the promoted writer we received in response to the `topology_query`. Once that connection completes, the client connection resumes.

With the solution proposed in my PR, I was seeing the following failover downtime, assuming promotion of readers was instant and there was no replication lag:
Another possible solution I was considering is to poll a list of configured nodes (`select pg_is_in_recovery();`) to detect which one is the new writer. The downside of this solution is that you could end up connecting to N nodes before you find the promoted writer.

I am also toying with the idea of pre-connecting to the reader nodes from the configuration and reusing those open connections once the writer node is detected as down. This should reduce failover time to hundreds of milliseconds, since we will not need to set up SCRAM/TLS before getting the topology.
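
To make the trade-off between the two lookup strategies concrete, here is a minimal sketch that counts connection attempts for each. It runs against an in-memory fake cluster rather than real PostgreSQL connections, and it is not taken from the linked commit: `Node`, `Cluster`, `writer_via_topology`, and `writer_via_polling` are hypothetical names, and the `topology_query` method only stands in for whatever SQL the real setting would run.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for one PostgreSQL server."""
    up: bool = True
    in_recovery: bool = False  # what `select pg_is_in_recovery();` would report


class Cluster:
    """In-memory fake: counts connection attempts instead of opening sockets."""

    def __init__(self, nodes: Dict[str, Node]) -> None:
        self.nodes = nodes
        self.attempts = 0

    def connect(self, host: str) -> Optional[Node]:
        """Return the node if it is reachable, otherwise None."""
        self.attempts += 1
        node = self.nodes.get(host)
        return node if node is not None and node.up else None

    def topology_query(self) -> Optional[str]:
        """Assumed behaviour: return the hostname of the current writer."""
        for name, node in self.nodes.items():
            if node.up and not node.in_recovery:
                return name
        return None


def writer_via_topology(cluster: Cluster, writer: str, reader: str) -> Optional[str]:
    """Proposed flow: try the writer, otherwise ask the reader who the writer is."""
    if cluster.connect(writer) is not None:
        return writer
    if cluster.connect(reader) is None:
        return None  # neither configured node is reachable
    promoted = cluster.topology_query()
    if promoted is None:
        return None  # no writer has been promoted yet
    if promoted == reader:
        return reader  # the reader itself was promoted; keep this connection
    # Otherwise drop the reader connection and connect to the promoted writer.
    return promoted if cluster.connect(promoted) is not None else None


def writer_via_polling(cluster: Cluster, candidates: List[str]) -> Optional[str]:
    """Alternative flow: check each node's recovery state until a writer is found."""
    for host in candidates:
        node = cluster.connect(host)
        if node is not None and not node.in_recovery:
            return host
    return None
```

In a four-node fake cluster where the old writer is down and the last node in the list has been promoted, the topology flow makes three connection attempts (writer, reader, promoted writer), while polling makes four. The polling cost grows with the position of the promoted node in the list, which is the "N nodes" downside mentioned above.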
[0] gitstashpop@e8fd31d
Configuration changes for testing