Bootstrapping multi-terabyte cluster too slow with pg_dump #4221
Replies: 6 comments
-
Hi @rochecompaan, just one clarification first: I am not very familiar with WAL-G, so I apologize if I give you wrong information. Per my understanding, WAL-G is a physical backup and recovery technology, similar to Barman (which I started many years ago) or pgBackRest.

Supporting what you are asking (restoring from any physical backup) is not an easy task, and unfortunately it is not guaranteed to work under several circumstances, due to potential (and unfixable on our end) differences in assumptions about the way Postgres is used. For example, think about the name of the predefined user for streaming replication, or some fixed settings that the operator requires and that might not be compatible with the origin system (be it bare metal, a VM, or a different operator).

But you are not alone in this use case. Last week I was at Google Next and talked to several people who wanted to move their database from outside Kubernetes (including solutions like RDS) into Kubernetes on Postgres managed by CloudNativePG, without any cutover time. I personally think that the road of physical backup and recovery is long and winding. As time passes, I believe more and more that the approach to follow is the one I recently summed up in a blog article: https://www.enterprisedb.com/blog/current-state-major-postgresql-upgrades-cloudnativepg-kubernetes

Please let me know if that resonates with you. Cheers,
-
@gbartolini, thanks for your feedback! Yes, WAL-G is a physical backup tool, just like Barman, but let me clarify how pg_dump comes into play. I only used WAL-G to set up a temporary Postgres instance running inside the Kubernetes cluster: I created a Kubernetes Job that uses WAL-G to restore to a persistent volume, and once the job completed, I mounted the persistent volume inside the Postgres instance. At this point I had a running instance that one can use to bootstrap a new CNPG cluster via the `import` bootstrap.
As per the docs, the operator runs pg_dump and pg_restore under the hood during the import bootstrap. So, I'm referring to how the operator uses pg_dump to bootstrap the cluster, not my direct use. I had a look at the options used, and the invocation doesn't support running multiple processes: https://github.com/cloudnative-pg/cloudnative-pg/blob/main/pkg/management/postgres/logicalimport/database.go#L102

I'm heading over to read your article next; I just wanted to clarify my use of pg_dump first.
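For context, here is a minimal sketch of the kind of `Cluster` manifest I mean, following the import bootstrap documented by CNPG; the host, secret, database, and cluster names are placeholders for my setup:

```yaml
# Sketch only: a CNPG Cluster bootstrapped via the documented `import`
# mechanism, pointing at the temporary Postgres instance restored by WAL-G.
# All names below are placeholders.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: imported-cluster
spec:
  instances: 3
  storage:
    size: 8Ti
  bootstrap:
    initdb:
      import:
        type: microservice
        databases:
          - app
        source:
          externalCluster: temp-postgres
  externalClusters:
    - name: temp-postgres
      connectionParameters:
        host: temp-postgres.default.svc  # temporary instance restored by WAL-G
        user: postgres
        dbname: app
      password:
        name: temp-postgres-superuser  # Secret holding the password
        key: password
```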
-
@gbartolini great article! Using Postgres publications and subscriptions seems very promising as a way to migrate an existing database into CNPG! I am a little concerned that sequences are not migrated, but I'm sure that can be managed by treading carefully. That said, I don't think this method will help bootstrap a big database in a reasonable time. I imagine restoring a full base backup from a primary and replaying WAL files afterwards will still be significantly faster than any alternative, purely because restoring the base backup can be sped up by running concurrent processes.
-
If you are not interested in the migration part, but just in starting a replica from an existing cluster, have you looked into the new volume snapshot feature that we will introduce in version 1.21? You can create a replica cluster from an existing volume snapshot that you make available in another region, and bootstrap it in a shorter time by fetching WAL files from an object store. See commit c2ca044. If you are interested, you can already test this feature by installing the latest available snapshot.
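A rough sketch of what that could look like follows; snapshot, secret, and bucket names are placeholders, and details may still change before the 1.21 release:

```yaml
# Sketch only: bootstrap a replica cluster from an existing VolumeSnapshot,
# then keep it in sync by fetching WAL files from an object store.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: replica-from-snapshot
spec:
  instances: 3
  storage:
    size: 8Ti
  bootstrap:
    recovery:
      source: origin
      volumeSnapshots:
        storage:
          name: pgdata-snapshot            # placeholder snapshot name
          kind: VolumeSnapshot
          apiGroup: snapshot.storage.k8s.io
  replica:
    enabled: true
    source: origin
  externalClusters:
    - name: origin
      barmanObjectStore:
        destinationPath: s3://my-backups/origin  # placeholder bucket
        s3Credentials:
          accessKeyId:
            name: aws-creds
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: aws-creds
            key: ACCESS_SECRET_KEY
```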
-
This is more of a discussion than an issue now; converting it into a discussion.
-
Good idea. Also, I now suggest this article as a reference for this use case: https://gabrielebartolini.it/articles/2024/03/cloudnativepg-recipe-5-how-to-migrate-your-postgresql-database-in-kubernetes-with-~0-downtime-from-anywhere/
-
I set up a temporary Postgres deployment to bootstrap a new cluster according to #2638. The total size of the database is 6.2 terabytes. After roughly one day, only 300 GB had been restored. For comparison, the WAL-G restore took about 6 hours, and that was only possible because I could increase the concurrency to 100. For CNPG, the spec doesn't currently allow increasing the number of pg_dump jobs, e.g., `pg_dump -j njobs`. I suspect that even increasing the number of pg_dump jobs won't restore a database of this size in a reasonable time (hours instead of days), but I would be happy to try if such an option becomes available.
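To make the ask concrete, here is a purely hypothetical sketch of such an option; the `jobs` field below does not exist in CNPG today and is only meant to illustrate the idea:

```yaml
# Hypothetical only: CNPG does not currently expose this knob.
bootstrap:
  initdb:
    import:
      type: microservice
      databases:
        - app
      source:
        externalCluster: temp-postgres
      jobs: 8  # hypothetical setting mapping to pg_dump -j / pg_restore -j
```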
I used an unorthodox workaround that might be useful to someone else trying to bootstrap a new cluster from a big database: the WAL-G restore Job and temporary instance described earlier in this thread (a rough sketch follows).
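This sketch only captures the shape of the Job; image, bucket, and PVC names are placeholders:

```yaml
# Sketch only: a Job that uses WAL-G to restore the latest base backup
# onto a PVC, which is then mounted by a temporary Postgres instance.
apiVersion: batch/v1
kind: Job
metadata:
  name: walg-restore
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: wal-g
          image: my-registry/wal-g:latest  # placeholder image with wal-g installed
          command: ["wal-g", "backup-fetch", "/restore/pgdata", "LATEST"]
          env:
            - name: WALG_S3_PREFIX
              value: s3://my-backups/primary  # placeholder bucket
            - name: WALG_DOWNLOAD_CONCURRENCY
              value: "100"  # the concurrency that made the ~6h restore possible
          volumeMounts:
            - name: restore
              mountPath: /restore
      volumes:
        - name: restore
          persistentVolumeClaim:
            claimName: temp-postgres-pgdata  # placeholder PVC, later mounted by the temp instance
```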
It would be great if the docs could address bootstrapping a new cluster from a physical backup in the future, but allowing one to increase the number of pg_dump jobs would already be a big help.