Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Running CREATE INDEX CONCURRENTLY on subscriber can break subscription (subscriber: postgres 15.6 + pglogical 2.4.4) - error message in logs: could not read block 0 in file ""base/[DB_OID]/[IDX_RELATION_FILENODE]"": read only 0 of 8192 bytes #469

Open
aristocrates opened this issue Apr 9, 2024 · 1 comment

Comments

@aristocrates
Copy link

I don't have a minimal repro for this right now, but the symptom I have observed on a subscriber that I am syncing for an upgrade is:

  • (context) I have modified the subscriber schema to just have primary key indexes and no constraints for the initial data sync - plan is to create the other indexes concurrently on the subscriber once all tables have synced, and then after index creation to create all foreign key constraints (with NOT VALID followed by VALIDATE CONSTRAINT)
  • But, on some indexes in the desired final schema, running CREATE UNIQUE INDEX CONCURRENTLY [options]; on the subscriber results in the subscription going status: down with a log line similar to the following in the CSV log:
2024-04-02 15:16:17.297 PDT,,"$DBNAME",4712,,6608afb3.1268,3738,"",2024-03-30 17:34:59 PDT,7/184779565,0,ERROR,XX001,"could not read block 0 in file ""base/16400/105657687"": read only 0 of 8192 bytes",,,,,"apply INSERT from remote relation $TABLE_NAME in commit before 45415/83927478, xid 197646281 committed at 1999-12-31 16:12:45.411377-08 (action #2) from node replorigin 1",,,,"pglogical apply 16400:988484226","pglogical apply 16400:988484226",,0

In this particular example, 105657687 is indeed the result of calling pg_relation_filenode($INDEX_NAME) on the currently-INVALID $INDEX_NAME that is being created concurrently at the time the subscription goes down:

$DBNAME=# select pg_relation_filenode($INDEX_NAME); -- select oid from pg_database where datname = $DBNAME; yields 16400
 pg_relation_filenode
----------------------
            105657687
(1 row)

postgres@[host]:/path/to/data/base/16400$ ls -alh 105657687
-rw------- 1 postgres postgres 0 Apr  2 15:16 105657687

The subscriber is on postgres 15.6 and pglogical 2.4.4

A workaround that seems to work fine is to just create the index non-concurrently, in which case pglogical apply on the subscriber blocks on obtaining a RowExclusiveLock until the create index finishes, and the replication slot stays put on the provider.

Based on #344 (comment) (and other guides around using pglogical on DBs with large tables) the expected behavior is for CREATE INDEX CONCURRENTLY to work fine on the subscriber:

You can always create indexes after the initial load. If you are concerned about index build that locks out writes, you can always use CONCURRENTLY.

@aristocrates
Copy link
Author

I have not noticed this on all indexes, just on a handful. I would be surprised if this happened every time an update, insert, or delete replicated over while the concurrent index creation was happening, since I do not recall running into this on other DB upgrades between the same postgres versions, but the behavior replicated when I retried index creation on these particular indexes, even after re-initializing the subscriber from scratch on a new machine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant