Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Database HA with multiple <connection-url> values + <check-valid-connection-sql> always connecting to first database even if it is a bad instance #748

Open
bergner opened this issue Mar 15, 2022 · 3 comments

Comments

@bergner
Copy link

bergner commented Mar 15, 2022

If you stumble on this issue facing a similar problem, see the "Workaround" section at the bottom.

Setup

  • 3 node Postgresql cluster with repmgr consisting of 1 primary and 2 read replicas using streaming replication. Repmgr will automatically manage promotion of a new primary database if the primary database goes down.
  • Keycloak (hosted in Wildfly) is the Java application in question
  • <connection-url> contains jdbc:postgresql:... URLs to all 3 database servers
  • <validate-on-match> is set to true
  • <background-validation> is set to false
  • <check-valid-connection-sql> has been set to a couple of different statements in order to try to get the desired result that Wildfly/Ironjacamar only hands out connections to the primary database, regardless of which of the 3 nodes is the primary database.

Problem

The <check-valid-connection-sql> does not seem to run at all on initial connection. Instead the connection returned back to the application is the first entry in the <connection-url> list, regardless if that would be considered valid by the <check-valid-connection-sql> or not. I see no signs in JDBC Spy log output that this SQL statement actually runs. The application can end up getting connected to a read replica and repeatedly fails with this error, without ever performing the SQL connection check:

11:32:40,725 WARN  [org.keycloak.connections.jpa.updater.liquibase.lock.CustomLockService] (ServerService Thread Pool -- 60) Lock didn't yet acquired. Will possibly retry to acquire lock. Details: ERROR: cannot execute SELECT FOR UPDATE in a read-only transaction [Failed SQL: SELECT ID FROM public.databasechangeloglock WHERE ID=1000 FOR UPDATE]

Attempt 1: SELECT 1 WHERE pg_is_in_recovery() = FALSE

This seemed like the first logical attempt to distinguish the primary database from a read replica, and also the reason why I felt <validate-on-match> true would be a must. The statement just doesn't seem to get executed on initial startup. If I reconfigure <connection-url> so that the first entry is the current primary it does connect ok and I can see the <check-valid-connection-sql> being executed, but that seems to happen MUCH later than the initial connection attempt.

Attempt 2: Raise Postgresql exception on replicas

After skimming through 1000s of lines of Java code in Ironjacamar and Wildfly it seemed like a lot of the connection validation is centered around use of SQLException. For that reason I tried to change the <check-valid-connection-sql> to a statement that would raise an error on read replicas which I then hoped would propagate as an SQLException in the Java code. This was a long shot but I tried this after some testing in the psql cli:

<check-valid-connection-sql>DO $$ BEGIN IF EXISTS (SELECT 1 WHERE pg_is_in_recovery() = TRUE) THEN RAISE EXCEPTION 'Database is a read-only replica'; END IF; END$$;</check-valid-connection-sql>

Again this does not work because the valid connection check seems to happen much later and if we accidentally chose a read replica to begin with we never get to the point of checking the connection with this SQL statement.

Expected behavior

Any connection returned to Wildfly or the Java application in question should have been validated according to the declared rules.

Questions

  • Is it the expected behavior that <validate-on-match> true should mean that anyone requesting a connection would have synchronous connection validation happening before the connection is handed out to the application? I believe so but documentation is slightly vague here.
  • Does the above answer change in any way if CachedConnectionManager is used or not?
  • Could there be other ways Wildfly or a Java application gets hold of a connection than ones that perform such validation? If so do you have any suggestions on how to identify them/track them down?
  • Are there any situations/circumstances you are aware of where <check-valid-connection-sql> wouldn't be the first statement executed on a database connection?
  • Could someone roughly explain the code path for how a connection is requested, initialized and validated? I'm getting a bit lost here but I feel it is a combination of SemaphoreConcurrentLinkedDequeManagedConnectionPool, BaseWrapperManagedConnectionFactory's isValidConnection() and CheckValidConnectionSQL classes but the ValidateOnMatch logic seems surprisingly far away from the actual SQL execution and that seems strange to me so I fear I've misunderstood something.

Log snippets

In this log example we have dbserv3, dbserv2, dbserv as databases in that order in <connection-url> with dbserv being the current primary. Putting dbserv as the first entry makes the service start ok.

11:32:40,410 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [DataSource] getConnection()
11:32:40,411 TRACE [org.jboss.jca.core.connectionmanager.TxConnectionManager] (ServerService Thread Pool -- 60) Subject: null
11:32:40,412 TRACE [org.jboss.jca.core.connectionmanager.TxConnectionManager] (ServerService Thread Pool -- 60) getManagedConnection interleaving=false , tx=null
11:32:40,423 DEBUG [org.jboss.jca.core.connectionmanager.pool.idle.IdleRemover] (ServerService Thread Pool -- 60) Register pool: SemaphoreConcurrentLinkedQueueManagedConnectionPool@46c53daf[pool=KeycloakDS] (interval=1800000)
11:32:40,424 DEBUG [org.jboss.jca.core.connectionmanager.pool.idle.IdleRemover] (ServerService Thread Pool -- 60) About to notify thread: old next: 1647258460424, new next: 1647258460424
11:32:40,425 TRACE [org.jboss.jca.core.tracer.Tracer] (ServerService Thread Pool -- 60) IJTRACER-KeycloakDS-46c53daf-101-80-24674781650666718-NONE--
11:32:40,425 TRACE [org.jboss.jca.core.connectionmanager.pool.idle.IdleRemover] (IdleRemover) Result of await: true
11:32:40,426 DEBUG [org.jboss.jca.core.connectionmanager.pool.idle.IdleRemover] (IdleRemover) Notifying pools, interval: 900000
11:32:40,426 TRACE [org.jboss.jca.core.connectionmanager.pool.strategy.OnePool] (ServerService Thread Pool -- 60) KeycloakDS: mcpPools={false=SemaphoreConcurrentLinkedQueueManagedConnectionPool@46c53daf[pool=KeycloakDS]}
11:32:40,429 TRACE [org.jboss.jca.core.connectionmanager.pool.strategy.OnePool] (ServerService Thread Pool -- 60) getConnection(null, WrappedConnectionRequestInfo@b5b70e3[userName=keycloak])
Method: getConnection(null, WrappedConnectionRequestInfo@b5b70e3[userName=keycloak])
  Subject: null
  CRI: b5b70e3
ManagedConnectionPool:
  Class: org.jboss.jca.core.connectionmanager.pool.mcp.SemaphoreConcurrentLinkedDequeManagedConnectionPool
  Object: 46c53daf
ManagedConnectionFactory:
  Class: org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory
  Object: 65a50b7c
ConnectionManager:
  Class: org.jboss.jca.core.connectionmanager.tx.TxConnectionManagerImpl
  Object: 21d386cd
Pool:
  Name: KeycloakDS
  Class: org.jboss.jca.core.connectionmanager.pool.strategy.OnePool
  Object: 1b9fbf44
  FIFO: false
PoolConfiguration:
  MinSize: 0
  InitialSize: 0
  MaxSize: 20
  BlockingTimeout: 30000
  IdleTimeoutMinutes: 30
  ValidateOnMatch: true
  BackgroundValidation: false
  BackgroundValidationMillis: 60000
  StrictMin: false
  UseFastFail: false
  Decrementer: org.jboss.jca.core.connectionmanager.pool.capacity.TimedOutDecrementer
...
11:32:40,430 TRACE [org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory] (ServerService Thread Pool -- 60) Using properties: {password=--hidden--, user=keycloak}
11:32:40,431 TRACE [org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory] (ServerService Thread Pool -- 60) added HA connection url: jdbc:postgresql://dbserv3:5432/keycloak
11:32:40,431 TRACE [org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory] (ServerService Thread Pool -- 60) added HA connection url: jdbc:postgresql://dbserv2:5432/keycloak
11:32:40,431 TRACE [org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory] (ServerService Thread Pool -- 60) added HA connection url: jdbc:postgresql://dbserv:5432/keycloak
11:32:40,432 DEBUG [org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory] (ServerService Thread Pool -- 60) Default URLSelectorStrategy is being used : URLSelector@2896cbff[urls=[jdbc:postgresql://dbserv3:5432/keycloak, jdbc:postgresql://dbserv2:5432/keycloak, jdbc:postgresql://dbserv:5432/keycloak] currentUrl=null currentIndex=0]
11:32:40,432 TRACE [org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory] (ServerService Thread Pool -- 60) Trying to create a connection to jdbc:postgresql://dbserv3:5432/keycloak
11:32:40,433 TRACE [org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory] (ServerService Thread Pool -- 60) Checking driver for URL: jdbc:postgresql://dbserv3:5432/keycloak
11:32:40,433 DEBUG [org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory] (ServerService Thread Pool -- 60) Driver loaded and instance created:org.postgresql.Driver@33b2be3d
11:32:40,436 FINE  [org.postgresql.Driver] (ServerService Thread Pool -- 60) Loading driver configuration via classloader ModuleClassLoader for Module "org.postgresql.jdbc" version 42.2.5 from local module loader @31dadd46 (finder: local module finder @4ed5eb72 (roots: /opt/jboss/keycloak/modules,/opt/jboss/keycloak/modules/system/layers/keycloak,/opt/jboss/keycloak/modules/system/layers/base))
11:32:40,438 FINE  [org.postgresql.Driver] (ServerService Thread Pool -- 60) Connecting with URL: jdbc:postgresql://dbserv3:5432/keycloak
11:32:40,452 FINE  [org.postgresql.jdbc.PgConnection] (ServerService Thread Pool -- 60) PostgreSQL JDBC Driver 42.2.5
11:32:40,453 FINE  [org.postgresql.jdbc.PgConnection] (ServerService Thread Pool -- 60)   setDefaultFetchSize = 0
11:32:40,456 FINE  [org.postgresql.jdbc.PgConnection] (ServerService Thread Pool -- 60)   setPrepareThreshold = 5
11:32:40,467 FINE  [org.postgresql.core.v3.ConnectionFactoryImpl] (ServerService Thread Pool -- 60) Trying to establish a protocol version 3 connection to dbserv3:5432
11:32:40,484 FINE  [org.postgresql.core.v3.ConnectionFactoryImpl] (ServerService Thread Pool -- 60) Receive Buffer Size is 183,680
11:32:40,484 FINE  [org.postgresql.core.v3.ConnectionFactoryImpl] (ServerService Thread Pool -- 60) Send Buffer Size is 43,520
11:32:40,579 TRACE [org.jboss.jca.core.tracer.Tracer] (ServerService Thread Pool -- 60) IJTRACER-KeycloakDS-46c53daf-101-60-24674781804923063-654b7ebd-5a00e145-
11:32:40,579 TRACE [org.jboss.jca.core.connectionmanager.pool.strategy.OnePool] (ServerService Thread Pool -- 60) supplying new ManagedConnection: org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@654b7ebd[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@5a00e145 connection handles=0 lastReturned=1647257560579 lastValidated=1647257560579 lastCheckedOut=1647257560579 trackByTx=false pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@1b9fbf44 mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool@46c53daf[pool=KeycloakDS] xaResource=LocalXAResourceImpl@4db65b7e[connectionListener=654b7ebd connectionManager=21d386cd warned=false currentXid=null productName=PostgreSQL productVersion=13.5 jndiName=java:jboss/datasources/KeycloakDS] txSync=null]
11:32:40,580 TRACE [org.jboss.jca.core.tracer.Tracer] (ServerService Thread Pool -- 60) IJTRACER-KeycloakDS-46c53daf-101-1-24674781805751823-654b7ebd--
11:32:40,580 TRACE [org.jboss.jca.core.connectionmanager.pool.strategy.OnePool] (ServerService Thread Pool -- 60) Got connection from pool: org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@654b7ebd[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@5a00e145 connection handles=0 lastReturned=1647257560579 lastValidated=1647257560579 lastCheckedOut=1647257560579 trackByTx=false pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@1b9fbf44 mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool@46c53daf[pool=KeycloakDS] xaResource=LocalXAResourceImpl@4db65b7e[connectionListener=654b7ebd connectionManager=21d386cd warned=false currentXid=null productName=PostgreSQL productVersion=13.5 jndiName=java:jboss/datasources/KeycloakDS] txSync=null]
11:32:40,580 TRACE [org.jboss.jca.core.connectionmanager.listener.TxConnectionListener] (ServerService Thread Pool -- 60) No transaction, no need to enlist: org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@654b7ebd[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@5a00e145 connection handles=0 lastReturned=1647257560579 lastValidated=1647257560579 lastCheckedOut=1647257560579 trackByTx=false pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@1b9fbf44 mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool@46c53daf[pool=KeycloakDS] xaResource=LocalXAResourceImpl@4db65b7e[connectionListener=654b7ebd connectionManager=21d386cd warned=false currentXid=null productName=PostgreSQL productVersion=13.5 jndiName=java:jboss/datasources/KeycloakDS] txSync=null]
11:32:40,588 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java.sql.Connection#beginRequest has been invoked
11:32:40,588 TRACE [org.jboss.jca.core.tracer.Tracer] (ServerService Thread Pool -- 60) IJTRACER-KeycloakDS-46c53daf-101-40-24674781813967928-654b7ebd-de08f7b-
11:32:40,588 TRACE [org.jboss.jca.core.connectionmanager.listener.TxConnectionListener] (ServerService Thread Pool -- 60) [TxConnectionListener@654b7ebd] registerConnection: org.jboss.jca.adapters.jdbc.jdk8.WrappedConnectionJDK8@de08f7b [size=1] ([org.jboss.jca.adapters.jdbc.jdk8.WrappedConnectionJDK8@de08f7b])
11:32:40,588 TRACE [org.jboss.jca.core.api.connectionmanager.ccm.CachedConnectionManager] (ServerService Thread Pool -- 60) registering connection from connection manager: org.jboss.jca.core.connectionmanager.tx.TxConnectionManagerImpl@21d386cd, connection : org.jboss.jca.adapters.jdbc.jdk8.WrappedConnectionJDK8@de08f7b, key: org.jboss.as.connector.deployers.ra.processors.CachedConnectionManagerSetupProcessor$CachedConnectionManagerSetupAction@3e0ca535
11:32:40,588 TRACE [org.jboss.jca.core.tracer.Tracer] (ServerService Thread Pool -- 60) IJTRACER-KeycloakDS-46c53daf-101-92-24674781814642034-654b7ebd-de08f7b-org.jboss.as.connector.deployers.ra.processors.CachedConnectionManagerSetupProcessor$CachedConnectionManagerSetupAction@3e0ca535
# Lots of getMetadata() + Unlock/Owner/Lock cut out here for brevity
11:32:40,590 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getMetaData()
11:32:40,602 TRACE [org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory] (ServerService Thread Pool -- 60) Unlock: HeldByCurrentThread: Yes, Locked: Yes, HoldCount: 1, QueueLength: 0
11:32:40,602 TRACE [org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory] (ServerService Thread Pool -- 60) Owner: Thread[ServerService Thread Pool -- 60,5,ServerService ThreadGroup]
11:32:40,602 TRACE [org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory] (ServerService Thread Pool -- 60) Lock: HeldByCurrentThread: No, Locked: No, HoldCount: 0, QueueLength: 0
11:32:40,602 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getAutoCommit()
11:32:40,603 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] setAutoCommit(false)
11:32:40,604 FINE  [org.postgresql.jdbc.PgConnection] (ServerService Thread Pool -- 60)   setAutoCommit = false
11:32:40,605 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getMetaData()
11:32:40,621 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] createStatement()
11:32:40,625 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Statement] executeQuery(select setting from pg_settings where name = 'edb_redwood_date')
11:32:40,634 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] next()
11:32:40,635 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] close()
11:32:40,635 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Statement] isClosed()
11:32:40,636 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Statement] close()
11:32:40,636 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,642 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getMetaData()
11:32:40,643 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getMetaData()
11:32:40,643 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getMetaData()
11:32:40,644 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getMetaData()
11:32:40,645 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getCatalog()
11:32:40,646 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getMetaData()
11:32:40,653 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,656 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,657 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,657 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getAutoCommit()
11:32:40,657 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,658 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] rollback()
11:32:40,660 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,660 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,665 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,665 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,667 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] createStatement()
11:32:40,667 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getMetaData()
11:32:40,684 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Statement] executeQuery(select count(*) from public.databasechangeloglock)
11:32:40,690 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] next()
11:32:40,690 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] getMetaData()
11:32:40,693 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] getInt(1)
11:32:40,693 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] wasNull()
11:32:40,694 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] next()
11:32:40,694 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] close()
11:32:40,694 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Statement] isClosed()
11:32:40,694 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Statement] close()
11:32:40,694 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,694 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,695 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,696 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] createStatement()
11:32:40,696 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getMetaData()
11:32:40,697 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Statement] executeQuery(SELECT ID FROM public.databasechangeloglock)
11:32:40,699 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] next()
11:32:40,699 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] getMetaData()
11:32:40,700 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] getObject(1)
11:32:40,701 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] next()
11:32:40,701 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] getMetaData()
11:32:40,702 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] getObject(1)
11:32:40,702 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] next()
11:32:40,703 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] getMetaData()
11:32:40,703 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] getObject(1)
11:32:40,703 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] next()
11:32:40,704 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [ResultSet] close()
11:32:40,704 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Statement] isClosed()
11:32:40,706 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Statement] close()
11:32:40,706 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,708 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getAutoCommit()
11:32:40,709 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] commit()
11:32:40,710 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] createStatement()
11:32:40,711 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getMetaData()
11:32:40,712 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Statement] close()
11:32:40,712 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,713 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getAutoCommit()
11:32:40,713 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] commit()
11:32:40,714 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] createStatement()
11:32:40,714 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] getMetaData()
11:32:40,716 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Statement] execute(SELECT ID FROM public.databasechangeloglock WHERE ID=1000 FOR UPDATE)
11:32:40,724 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Statement] close()
11:32:40,724 DEBUG [jboss.jdbc.spy] (ServerService Thread Pool -- 60) java:jboss/datasources/KeycloakDS [Connection] isClosed()
11:32:40,725 WARN  [org.keycloak.connections.jpa.updater.liquibase.lock.CustomLockService] (ServerService Thread Pool -- 60) Lock didn't yet acquired. Will possibly retry to acquire lock. Details: ERROR: cannot execute SELECT FOR UPDATE in a read-only transaction [Failed SQL: SELECT ID FROM public.databasechangeloglock WHERE ID=1000 FOR UPDATE]
...then similar error and trace/debug keeps repeating over and over.

Further debugging and pinpointing

Afaict the gory details of the call chain to get a connection to a data source is shown below but in essence it boils down to a long chain of calls starting frmo WrapperDataSource's getConnection() without arguments ending up calling the LocalManagedConnection constructor with the JDBC connection to the first database listed in <connection-url>. LocalManagedConnection constructor does not lead to immediate validation of the connection and hence no exception is thrown and the second URL in <connection-url> is not checked (LocalManagedConnectionFactory tries the other URLs if the first one fails, but if doesn't fail) because the JDBC connection to a read replica is still considered an ok connection at that stage.

WrapperDataSource {Referenceable, DataSource, Serializable} < JBossWrapper
    getConnection() -> Connection
        try {
            WrappedConnection wc = (WrappedConnection) cm.allocateConnection(mcf, defaultCRI);
            wc.setDataSource(this);
            return wc;
        } catch (ResourceException re) { throw new SQLException(re); }
    cm : ConnectionManager (TxConnectionManagerImpl)

interface ConnectionManager {org.jboss.jca.core.api.connectionmanager.ConnectionManager, ConnectionCacheListener, GracefulShutdown},
TxConnectionManagerImpl {TxConnectionManager {ConnectionManager, TransactionTimeoutConfiguration} < AbstractConnectionManager {ConnectionManager}
    AbstractConnectionManager.allocateConnection(ManagedConnectionFactory mcf, ConnectionRequestInfo cri) -> Object
        // mcf is a LocalManagedConnectionFactory
        Subject subject = getSubject();
        ConnectionListener cl = getManagedConnection(subject, cri);
        reconnectManagedConnection(cl);
        Object connection = null;
        try { connection = cl.getManagedConnection().getConnection(subject, cri); }
        catch (Throwable t) { error handling + throw new ResourceException }
        registerAssociation(cl, connection);
        if (cachedConnectionManager != null) { cachedConnectionManager.registerConnection(this, cl, connection); }
        return connection;
    getManagedConnection(Subject subject, ConnectionRequestInfo cri) -> ConnectionListener (TxConnectionListener)
        return super.getManagedConnection(trackByTransaction, subject, cri);
    AbstractConnectionManager.getManagedConnection(Transaction transaction, Subject subject, ConnectionRequestInfo cri) -> ConnectionListener
        // pool is a OnePool
        try { return pool.getConnection(transaction, subject, cri); }
        catch (ResourceException e) { error handling with retry, set failure = last exception }
        catch (Exception e) { failure = e }
        throw new ResourceException(..., failure)

OnePool < AbstractPrefillPool {PrefillPool} < AbstractPool {Pool}
    AbstractPool.getConnection(Transaction trackByTransaction, Subject subject, ConnectionRequestInfo cri) -> ConnectionListener
        ManagedConnectionPool mcp = getManagedConnectionPool(key, subject, cri);
        // mcp is a SemaphoreConcurrentLinkedDequeManagedConnectionPool
        if (trackByTransaction == null || transactionKey == null) { return getSimpleConnection(subject, cri, mcp); }
    AbstractPool.getSimpleConnection(final Subject subject, final ConnectionRequestInfo cri, final ManagedConnectionPool mcp) -> ConnectionListener
        ConnectionListener cl = mcp.getConnection(subject, cri);
        log.tracef("Got connection from pool: %s", cl);
        return cl;

SemaphoreConcurrentLinkedDequeManagedConnectionPool < ManagedConnectionPool < IdleConnectionRemovalSupport
    getConnection(Subject subject, ConnectionRequestInfo cri) -> ConnectionListener (TxConnectonListener)
        // logs lots of details of the objects and settings used, starting with "getConnection"
        do {
            // get first ConnectionListenerWrapper (clw) from queue clq
            // check pool for already valid connection
            // this shouldn't result in a valid connection because we have none
        } while (clq.size() > 0);
        try {
            // No, the pool was empty, so we have to make a new one.
            clw = new ConnectionListenerWrapper(createConnectionEventListener(subject, cri), true, true);
            clw.setCheckedOut(true);
            cls.put(clw.getConnectionListener(), clw);
            log.tracef("supplying new ManagedConnection: %s", clw.getConnectionListener());
            prefill();
            return clw.getConnectionListener();
        } catch (Throwable t) {
            // destroy clw and throw ResourceException
        }
    createConnectionEventListener(Subject subject, ConnectionRequestInfo cri) -> ConnectionListener
        // mcf is a LocalManagedConnectionFactory and it creates LocalManagedConnection
        ManagedConnection mc = mcf.createManagedConnection(subject, cri);
        try {
            ConnectionListener cl = cm.createConnectionListener(mc, this);
            poolSize.incrementAndGet();
            return cl;
        } catch (ResourceException re) {
            // cleanup and rethrow exception
        }

LocalManagedConnectionFactory < BaseWrapperManagedConnectionFactory {ManagedConnectionFactory, ValidatingManagedConnectionFactory, ResourceAdapterAssociation
    createManagedConnection(Subject subject, ConnectionRequestInfo cri) -> ManagedConnection (LocalManagedConnection)
        ...
        return getHALocalManagedConnection(props, copy)
    getHALocalManagedConnection(final Properties props, final Properties copy)
        while (urlSelector.hasMore()) {
            String url = urlSelector.active();
            log.tracef("Trying to create a connection to %s", url);
            try { return createLocalManagedConnection(url, props, copy); }
            catch (Exception e) { log.errorCreatingConnection(url, e); urlSelector.fail(url); }
        }
        urlSelector.reset();
        throw new ResourceException(...);
    createLocalManagedConnection(final String url, final Properties props, final Properties copy) -> LocalManagedConnection
        // This is very confusing. We started with a DataSource and as such dataSourceClass will be set
        // here but this also has code that would cause 5 level self recursion back to DataSource's getConnection(user, password)
        // so I have to assume we go with the other code path.
        if (dataSourceClass != null && !copy.isEmpty()) {
            // This code path could be self-recursion madness as far as I can tell.
            DataSource d = getDataSource();
            con = d.getConnection(copy.getProperty("user"), copy.getProperty("password"));
            if (con == null) throw new ResourceException(bundle.unableToCreateConnectionFromDataSource());
        } else if (driverClass != null) {
            Driver d = getDriver(url);
            con = d.connect(url, copy);
            if (con == null) throw new ResourceException(bundle.wrongDriverClass(d.getClass().getName(), url));
        }
        return new LocalManagedConnection(this, con, props, transactionIsolation, preparedStatementCacheSize);

LocalManagedConnection {LocalTransaction} < BaseWrapperManagedConnection
    LocalManagedConnection(final LocalManagedConnectionFactory mcf, final Connection con, final Properties props, final int transactionIsolation, final int p
        super(mcf, con, props, transactionIsolation, psCacheSize);
    BaseWrapperManagedConnection(final BaseWrapperManagedConnectionFactory mcf, final Connection con, Properties props, final int transactionIsolation, final
        // This constructor does not consider the <validation> configuration
        // It does however do this:
        if (mcf.getNewConnectionSQL() != null) {
            Statement s = con.createStatement();
            try { s.execute(mcf.getNewConnectionSQL()); }
            finally { s.close(); }
        }

What I can seemingly try here is to set <new-connection-sql> to the earlier <check-valid-connection-sql> that raised a Postgresql exception and then maybe get past this initial connection problem. The ValidateOnMatch logic in SemaphoreConcurrentLinkedDequeManagedConnectionPool only seems to come into play for already established connections in the connection pool and not for the initial connection.

Workaround

The below data source configuration is the only one I've found that seems to actually work. Both <new-connection-sql> and <check-valid-connection-sql> needs to be set to SQL statements that cause an exception to be thrown on the replica databases. This is because the <validate-on-match> logic is too far away from the logic that tries other URLs. I still consider this to be an issue in the Ironjacamar library but at least this workaround is a viable alternative for Postgresql. You will have some ugly stack traces in your log file but at least those stack traces are clear in the sense that they contain the "Database is a read-only replica" message.

<datasource jndi-name="java:jboss/datasources/KeycloakDS" pool-name="KeycloakDS" enabled="true" use-java-context="true" use-ccm="true">
    <connection-url>jdbc:postgresql://dbserv3:5432/keycloak|jdbc:postgresql://dbserv2:5432/keycloak|jdbc:postgresql://dbserv:5432/keycloak</connection-url>
    <driver>postgresql</driver>
    <new-connection-sql>DO $$ BEGIN IF EXISTS (SELECT 1 WHERE pg_is_in_recovery() = TRUE) THEN RAISE EXCEPTION 'Database is a read-only replica'; END IF; END$$;</new-connection-sql>
    <url-delimiter>|</url-delimiter>
    ...
    <validation>
        <check-valid-connection-sql>DO $$ BEGIN IF EXISTS (SELECT 1 WHERE pg_is_in_recovery() = TRUE) THEN RAISE EXCEPTION 'Database is a read-only replica'; END IF; END$$;</check-valid-connection-sql>
        <validate-on-match>true</validate-on-match>
        <background-validation>false</background-validation>
        <background-validation-millis>60000</background-validation-millis>
        <exception-sorter class-name="org.jboss.jca.adapters.jdbc.extensions.postgres.PostgreSQLExceptionSorter"/>
    </validation>
</datasource>
@aamirraza
Copy link

Did you try setting targetServerType to primary?

@bergner
Copy link
Author

bergner commented Mar 28, 2022

I did not try that, I was not even aware of its existence. It would mean using a single <connection-url> instead with jdbc:postgresql://node1,node2,node3/accounting?targetServerType=master as documented here https://jdbc.postgresql.org/documentation/head/connect.html

And yes that might be another possible workaround but the JDBC driver for Postgresql would afaict need to enforce the following:

  • On initial connection discover the master
  • Either the JDBC driver needs to on its own validate the connection on every use and transparently switch to a different node, or we still need the <check-valid-connection-sql> to fail so a new connection is initialized

I don't really see this issue as a Postgresql issue though, I see it as a general problem that non-validated connections are handed out to the Java application. Partly this could also be seen as a documentation problem where <check-valid-connection-sql> and <new-connection-sql> documentation and examples should be much more clear about the need for exceptions to be thrown in order for a connection to be considered "bad". It is imho very unintuitive that a new connection does not go through the <check-valid-connection-sql> + <validate-on-match> check before being considered a valid connection.

@aamirraza
Copy link

aamirraza commented Mar 28, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants