Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is it possible to tell pglogical to start at a specific lsn in the wal? #460

Open
ApproximateIdentity opened this issue Feb 11, 2024 · 1 comment

Comments

@ApproximateIdentity
Copy link

ApproximateIdentity commented Feb 11, 2024

I'll describe a simplified version of the process to be concrete.

  1. Create a primary server as initdb -D ./primary, set it's wal_level to logical and start it up.
  2. Create a bunch of tables, insert data, etc.
  3. Reserve a replication slot in some way (the main purpose being to make sure the wal is not garbage collected).
  4. Stop the server and copy the directory like cp -r ./primary ./secondary. Start up both servers (on different ports).
  5. Assume that data has kept writing to the primary after the replication slot was reserved so that the primary has data that the secondary does not have, but also assume the wal has this data preserved. Assume that nothing else like schemas have changes.

Given that setup, I would like to get the secondary caught up with the primary. The secondary is a straight copy so by looking at how far it has gotten in the wal, I should be able to see where in the primary's wal to start replicating. Since the first replication slot has kept the wal from being garbage collected, the data really should be there. So then my question becomes, is it possible to start the replication from that location?

I feel like information theoretically this shouldn't be a problem, but obviously postgres doesn't need to fit my idealized mental model. Is something like this possible with pglogical in some way?

Thanks for any help!

@psi
Copy link

psi commented Apr 18, 2024

My team was attempting to make use of pglogical to replicate between Aurora RDS clusters on AWS to facilitate an upgrade from pg11 to pg12 with minimal downtime and we've hit this same wall.

We are taking an RDS snapshot (which is a storage snapshot) of the pg11 cluster, restoring it to a new cluster, upgrading the new cluster to pg12, then establishing replication. Replication is working for us from the time the subscription is created forward, but we haven't found any method to specify starting replication at the point in time LSN from when the snapshot was taken, and for our case, allowing pglogical to handle the data synchronization doesn't make sense from a time perspective.

Would also appreciate any insight here, as NOT being able to specify this really caught me off guard. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants