Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MBS-13337: Use standby Postgres instances for read-only queries #3244

Merged
merged 17 commits into from May 24, 2024

Conversation

mwiencek
Copy link
Member

@mwiencek mwiencek commented Apr 21, 2024

Problem

We don't have any consistent way to make use of our standby for read-only queries; initially this was intended for background tasks (MBS-13337), but it could be used for all types of read-only server queries.

Solution

This adds an ro_connector to the server context which is intended to be used to distribute read-only queries to our standby in production.

It requires a READONLY database configured. In production, READONLY will likely point to the pgbouncer-any (haproxy) service, which connects to either the primary or standby. SET TRANSACTION READ ONLY is always set on the ro_connector (which is desirable in case it connects to the primary).

It must be enabled via USE_RO_DATABASE_CONNECTOR; more information can be found in lib/DBDefs/Default.pm.

The prefer_ro_* methods were added to handle cases where use of the ro_connector depends on whether we're in an existing writable transaction. In that case we should use the existing writable connector for atomicity/consistency.

As a test, I've modified query_to_list and query_to_list_limited to use $c->prefer_ro_sql.

I've also made limited use of prefer_ro_sql in some background tasks (but not reports). More information about this can be found in MBS-13337.

Testing

I have not tested this with a separate READONLY connector yet.

@reosarevok
Copy link
Member

The idea seems sensible to me in theory at least - let's see what @yvanzo thinks. Perl tests did not run here for some reason, and we should probably have some actual written tests that use two separate connectors?

@mwiencek mwiencek added this to the Schema Change 2024 Q2 milestone May 3, 2024
@mwiencek mwiencek changed the base branch from master to schema-change-2024-q2 May 6, 2024 13:49
Copy link
Member

@reosarevok reosarevok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems fine to me and tests pass, so if need be we can probably merge for schema change as-is, but I would really like @yvanzo to also review since he is more experienced with this than I am.

@mwiencek
Copy link
Member Author

mwiencek commented May 7, 2024

I'm working on splitting the commits and improving the commit messages, plus will hopefully have a bit of time to add a test.

@mwiencek mwiencek force-pushed the ro-connector branch 3 times, most recently from ffa2ea0 to e8238ba Compare May 7, 2024 19:51
Copy link
Contributor

@yvanzo yvanzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the impressive amount of technical debt being paid off here!

I made minor comments only.

I couldn’t test it locally due to the xgettext-js error but it wouldn’t be a great coverage anyway. So adding a test would be great indeed.

As you said too, there is no rush with this pull request if you think that it requires much more time.

admin/InitDb.pl Show resolved Hide resolved
lib/MusicBrainz/Server/Context.pm Show resolved Hide resolved
lib/MusicBrainz/Server/Context.pm Outdated Show resolved Hide resolved
lib/MusicBrainz/Server/Data/Statistics.pm Show resolved Hide resolved
lib/MusicBrainz/Server/DatabaseConnectionFactory.pm Outdated Show resolved Hide resolved
lib/Sql.pm Show resolved Hide resolved
@mwiencek mwiencek changed the base branch from schema-change-2024-q2 to master May 17, 2024 17:46
Copy link
Contributor

@yvanzo yvanzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoiding any replication lag for logged-in editors is a wise change, thanks! ⛵

Should there be a separate ticket for querying the read-only database for website with logged-out users and for web service API? Or just generalizing the current ticket to something like “Use standby Postgres instances for asynchronous read-only queries”?

Edit: The ticket hasn’t been updated automatically because its key is missing from the pull request’s title.

The current `_disconnect` implementation will build `conn` even if it wasn't
created yet...just to immediately disconnect from it.
This may be used to check that a particular database name actually exists
without going through `get` (which has fallback logic for certain names).
This error should apply even if we are setting up a fresh connector.
This can be used to indicate that all transactions on the Sql handle should be
read-only.

`SET TRANSACTION READ ONLY` has been moved to `Sql::begin`, because it may be
called to initialize a transaction manually. (It's currently in
`_auto_transaction`; but note that `begin` is called right above the previous
location of the code, so it's ultimately being called in the same order as
before.)
This can be used to indicate that all transactions on the connector should be
read-only. It basically just forwards the flag to the `Sql` handle created for
the connector.
If the connection being requested is for the `READONLY` or `PROD_STANDBY`
database, pass the `read_only` flag to the connector.
This can be used to indicate that all transactions on all connectors created for
a database should be read-only.
A request may not initialize a connector (which is built lazily) if it doesn't
access the database -- or, when a read-only connector is added later, if it
only accesses one of the two connectors. A statement like
`$ctx->connector->disconnect` actually has the effect of initializing a
connector if it doesn't exist, only to immediately disconnect it. This can be
prevented by using a predicate to check if the attribute was set.
The `read_only` flag is now always set on the connector if the `READONLY`
database is accessed, so it's fine to use the `READWRITE` connection info if no
`READONLY` database is otherwise defined.

We've already added an `exists` method to `DatabaseConnectionFactory` (in a
previous commit) so that it can still be determined whether a `READONLY` key
was actually defined.
…nnector`

This is basically used to force (and verify) that the requested connector is
read-only.

Prior to this commit, the connector is only set to read-only if:

 * The `READONLY` or `PROD_STANDBY` database is requested.
 * The database has the `read_only` flag set (e.g., in `DBDefs`).

In a future commit, we'll add a read-only connector which'll default to calling
`get_connector` with `READONLY`. However, a different database than `READONLY`
could be used instead (and in fact this will happen if the `mb-set-database`
header is used); we want a way to force read-onlyness on the `get_connector`
call regardless of which key is requested.

If there's a previously-cached connector for the same key that's not read-only,
we throw an error. This shouldn't happen (e.g., we shouldn't request the
`READWRITE` key with the `read_only` flag set) unless a `fresh` connector is
also requested, in which case it's okay.
Like e918125 did for `SYSTEM_*`, this adds a
way to connect to an existing database key with the read-only flag set. For
example, accessing `READONLY_READWRITE` will create a new connector using all
of the `READWRITE` connection details, while also setting `read_only` on the
connector.

This will actually be useful in a future commit for overriding the read-only
connector's database when the `mb-set-database` header is used, without
clobbering the writable connector, which would otherwise be cached under the
same key name.
This logic will be reused in a later commit.
This adds an `ro_connector` to the server context which is intended to be used
to distribute read-only queries to our standby in production.

It must be enabled via `USE_RO_DATABASE_CONNECTOR`; more information can be
found in lib/DBDefs/Default.pm.

The `prefer_ro_*` methods were added to handle cases where use of the
`ro_connector` depends on whether we're in an existing writable transaction.
The main implication of using the standby instead of the primary here is
replication lag, which seems to hover between 50 and 400 ms at the time of
writing. However, note that `ro_connector` is not used if the user is
logged-in; so there should be no case where just-applied edits will not be
visible to them.
@mwiencek mwiencek changed the title Add a read-only database connector MBS-13337: Use standby Postgres instances for read-only queries May 21, 2024
@mwiencek
Copy link
Member Author

Should there be a separate ticket for querying the read-only database for website with logged-out users and for web service API? Or just generalizing the current ticket to something like “Use standby Postgres instances for asynchronous read-only queries”?

I updated the existing ticket, but mentioned that it will only be used for Data::Role::QueryToList and some background tasks to start with, if that sounds okay. (We may open additional tickets later to use the connector in other parts of the code.)

@mwiencek mwiencek merged commit 67217e7 into metabrainz:master May 24, 2024
2 checks passed
@mwiencek mwiencek deleted the ro-connector branch May 24, 2024 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants