Update Live migration docs #2925

Closed
2 changes: 2 additions & 0 deletions _partials/_migrate_dual_write_backfill_getting_help.md
@@ -2,6 +2,8 @@ import OpenSupportRequest from "versionContent/_partials/_migrate_open_support_r

<Highlight type="tip">

If you intend on migrating more than 400 GB, open a support request to ensure that enough disk is pre-provisioned on your Timescale instance.

If you get stuck, you can get help by either opening a support request or taking
your issue to the `#migration` channel in the [community slack](https://slack.timescale.com/),
where the developers of this migration method are ready to help.
5 changes: 0 additions & 5 deletions _partials/_migrate_dual_write_step1.md
Contributor:
The import line (Line 1) of this file can now be removed.

@@ -4,9 +4,4 @@ import OpenSupportRequest from "versionContent/_partials/_migrate_open_support_r

[Create a database service in Timescale][create-service].

If you intend on migrating more than 400&nbsp;GB, open a support request to
ensure that enough disk is pre-provisioned on your Timescale instance.

<OpenSupportRequest />

[create-service]: /use-timescale/:currentVersion:/services/create-a-service/
2 changes: 1 addition & 1 deletion migrate/index.md
@@ -24,7 +24,7 @@ Below are the different migration options we offer. You can choose the one that
|-------------------------------------------|----------------------------------|----------------------------------------------------------------------|--------------------------------|
| [pg_dump and pg_restore][pg-dump-restore] | Postgres, TimescaleDB | Downtime is okay | Requires some downtime |
| [Dual-write and backfill][dual-write] | Postgres, TimescaleDB and others | Append-only data, heavy insert workload (~20,000 inserts per second) | Optimized for minimal downtime |
| [Live migration][live-migration] | Postgres | Simplified end-to-end migration with almost zero downtime | Optimized for minimal downtime |
| [Live migration][live-migration] | Postgres, TimescaleDB | Simplified end-to-end migration with almost zero downtime | Optimized for minimal downtime |

If you are using PostgreSQL or TimescaleDB and can afford to take your
application offline for a few hours, the simplest option is to migrate data
13 changes: 6 additions & 7 deletions migrate/live-migration/index.md
@@ -9,10 +9,9 @@ import SourceTargetNote from "versionContent/_partials/_migrate_source_target_no

# Live migration

Live migration is a migration strategy to move a large amount of data
(100&nbsp;GB-10&nbsp;TB+) with low downtime (on the order of few minutes). It requires
more steps to execute than a migration with downtime using [pg_dump/restore][pg-dump-and-restore],
but supports more use-cases and has less requirements than the [dual-write and backfill] method.
Live migration is a strategy used to move a large amount of data (100&nbsp;GB-10&nbsp;TB+) with minimal downtime (typically a few minutes). It involves copying existing data from the source to the target and supports change data capture to stream ongoing changes from the source during the migration process.
Contributor:

Suggested change
Live migration is a strategy used to move a large amount of data (100&nbsp;GB-10&nbsp;TB+) with minimal downtime (typically a few minutes). It involves copying existing data from the source to the target and supports change data capture to stream ongoing changes from the source during the migration process.
Live migration is a strategy to move a large amount of data
(100&nbsp;GB-10&nbsp;TB+) with minimal downtime (typically a few minutes). It
achieves low downtime by simultaneously 1) copying existing data from the
source database to the target database and 2) recording ongoing changes which
are made on the source. When the initial data copy completes, it continuously
applies the recorded transactions to the target database until the target
database is fully caught up with the source database. At this point the
application's database connection is switched to the target database (which may
result in a short downtime), and the migration is complete.

Member:

I think 1) & 2) is not normal in docs.

> When the initial data copy completes, it continuously
> applies the recorded transactions

When the initial data copy completes, live migration applies the recorded transactions ...


In contrast, [pg_dump/restore][pg-dump-and-restore] only supports copying the database from the source to the target without capturing ongoing changes, which results in downtime. On the other hand, the [dual-write and backfill] method requires setting up dual write in the application logic. This method is recommended only for append-only workloads as it does not support updates and deletes during migration.
Contributor:

Suggested change
In contrast, [pg_dump/restore][pg-dump-and-restore] only supports copying the database from the source to the target without capturing ongoing changes, which results in downtime. On the other hand, the [dual-write and backfill] method requires setting up dual write in the application logic. This method is recommended only for append-only workloads as it does not support updates and deletes during migration.
In contrast, [pg_dump/restore][pg-dump-and-restore] only supports copying data
from the source to the target without recording ongoing changes, so
applications which are writing must be stopped for the duration of the
migration. On the other hand, the [dual-write and backfill] method also
provides a way to migrate with low downtime, but requires modifying your
application to write to two databases simultaneously, and only works with
append-only workloads as it does not support updates and deletes during
migration.


<SourceTargetNote />

@@ -23,8 +22,8 @@ Roughly, it consists of four steps:

1. Prepare and create replication slot in source database.
2. Copy schema from source to target, optionally enabling hypertables.
3. Copy data from source to target while capturing changes.
4. Apply captured changes from source to target.
3. Copy data from source to target while capturing ongoing changes from source.
4. Apply captured ongoing changes to target.
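For orientation, a minimal sketch of the prerequisite behind step 1, assuming `psql` access and a `$SOURCE` connection string as used elsewhere in these docs:

```sh
# Logical replication (used for the replication slot in step 1) requires these source settings
psql -d "$SOURCE" -c "SHOW wal_level;"             # must be 'logical'
psql -d "$SOURCE" -c "SHOW max_replication_slots;" # must leave at least one slot free
```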

Live migration works well when:
- Large, busy tables have primary keys, or don't have many `UPDATE` or
@@ -38,7 +37,7 @@ For more information, refer to the step-by-step migration guide:
- [Live migration from PostgreSQL][from-postgres]
- [Live migration from TimescaleDB][from-timescaledb]

If you want to manually migrate data from PostgreSQL, refer to
If you want to have more control over migration and prefer to manually migrate data from PostgreSQL, refer to
Contributor:

Suggested change
If you want to have more control over migration and prefer to manually migrate data from PostgreSQL, refer to
If you want to have more control over the migration and prefer to manually migrate data from PostgreSQL, refer to

[Live migration from PostgreSQL manually][live-migration-manual].

If you are migrating from AWS RDS to Timescale, you can refer to [this][live-migration-playbook] playbook
43 changes: 11 additions & 32 deletions migrate/live-migration/live-migration-from-postgres.md
Member:

You should remove `import SourceTargetNote from "versionContent/_partials/_migrate_source_target_note.mdx";` from the import list.

@@ -14,9 +14,9 @@ import DumpPreDataSourceSchema from "versionContent/_partials/_migrate_pre_data_
import DumpPostDataSourceSchema from "versionContent/_partials/_migrate_post_data_dump_source_schema.mdx";
import LiveMigrationStep2 from "versionContent/_partials/_migrate_live_migration_step2.mdx";

# Live migration from PostgreSQL database with pgcopydb
# Live migration from PostgreSQL database

This document provides detailed instructions to migrate data from your
This document provides instructions to migrate data from your
PostgreSQL database to a Timescale instance with minimal downtime (on the order
of a few minutes) of your production applications, using the [live migration]
strategy. To simplify the migration, we provide you with a docker image
@@ -26,10 +26,8 @@ migration.
You should provision a dedicated instance to run the migration steps from.
Ideally an AWS EC2 instance that's in the same region as the Timescale target
service. For an ingestion load of 10,000 transactions/s, and assuming that the
historical data copy takes 2 days, we recommend 4 CPUs with 4 to 8 GiB of RAM
and 1.2 TiB of storage.

<SourceTargetNote />
historical data is 2 TB in size, we recommend 4 CPUs with 4 to 8 GiB of RAM
and 1.2 TiB of storage; with this setup, the migration takes approximately 24 hours to complete.

In detail, the migration process consists of the following steps:

@@ -46,7 +44,7 @@ In detail, the migration process consists of the following steps:

<LiveMigrationStep2 />

Next, you need to ensure that your source tables and hypertables have either a primary key
Next, you need to ensure that your source tables have either a primary key
or `REPLICA IDENTITY` set. This is important as it is a requirement for replicating `DELETE` and
`UPDATE` operations. Replica identity assists the replication process in identifying the rows
being modified. It defaults to using the table's primary key.
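As an illustration only (not part of the documented procedure), one way to find tables that have neither a primary key nor a non-default replica identity, and to fall back to full row identity, assuming `psql` access to the source; `my_table` is a placeholder:

```sh
# List ordinary tables with default replica identity and no primary key
psql -d "$SOURCE" -c "
  SELECT n.nspname, c.relname
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE c.relkind = 'r'
    AND n.nspname NOT IN ('pg_catalog', 'information_schema')
    AND c.relreplident = 'd'
    AND NOT EXISTS (
      SELECT 1 FROM pg_constraint p
      WHERE p.conrelid = c.oid AND p.contype = 'p'
    );"

# For such tables, fall back to using the full row as the replica identity
psql -d "$SOURCE" -c "ALTER TABLE my_table REPLICA IDENTITY FULL;"
```

Note that `REPLICA IDENTITY FULL` logs the entire old row for every `UPDATE` and `DELETE`, so prefer adding a primary key or a unique index where possible.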
@@ -120,19 +118,14 @@ it will start `ANALYZE` on the target database. This updates statistics in the
target database, which is necessary for optimal querying performance in the
target database. Wait for `ANALYZE` to complete.

<Highlight type="important">
Application downtime begins here.
</Highlight>
## 4. Validate the data in target database and use it as new primary

Once the lag between the databases is below 30 megabytes, and you're ready to
take your applications offline, stop all applications which are writing to the
source database. This is the downtime phase and will last until you have
completed the validation step (4). Be sure to go through the validation step
before you enter the downtime phase to keep the overall downtime minimal.
Once the lag between the databases is below 130 megabytes, we recommend performing data integrity checks. There are two ways to do this:
Member:

Suggested change
Once the lag between the databases is below 130 megabytes, we recommend performing data integrity checks. There are two ways to do this:
Once the lag between the databases is below 30 megabytes, we recommend performing data integrity checks. There are two ways to do this:

The code waits for lag to be 30MB.


Stopping writes to the source database allows the live migration process to
finish replicating data to the target database. This will be evident when the
replication lag reduces to 0 megabytes.
1. With downtime: Stop database operations from your application, which will result in downtime. This allows the live migration to catch up on the lag between the source and target databases, enabling the validation checks to be performed. The downtime will last until the lag is eliminated and the data integrity checks are completed.
2. Without downtime: Since the difference between the source and target databases is less than 130 MB, you can perform data integrity checks, excluding the latest data that is still being written. This approach does not require taking your application down.
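For reference, a minimal sketch of the kind of check this refers to, comparing row counts or an aggregate over an already-replicated window; the table and column names are placeholders:

```sh
# Compare row counts for a representative table on both sides
psql -d "$SOURCE" -t -c "SELECT count(*) FROM my_table;"
psql -d "$TARGET" -t -c "SELECT count(*) FROM my_table;"

# Or compare an aggregate over data old enough to have been fully replicated
psql -d "$SOURCE" -t -c "SELECT sum(value) FROM my_table WHERE time < now() - interval '1 hour';"
psql -d "$TARGET" -t -c "SELECT sum(value) FROM my_table WHERE time < now() - interval '1 hour';"
```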
Comment on lines -133 to +126
Member:

I don't think we should make this change. We should let the users have 100% confidence in data integrity and not promote "partial integrity" by comparing non-recent data; otherwise they may have doubts later on regarding integrity. Note, the users will always have to take momentary downtime to switch applications from writing to the source to writing to the target.


Now that the data integrity checks are complete, it's time to switch your target database to become the primary one. If you have selected option 2 for data integrity checks, stop writing to the source database and immediately start writing to the target database from the application. This will minimize application downtime to as low as application restart. It allows the live migration process to complete replicating data to the target database, as the source will no longer receive any new transactions. You will know the process is complete when the replication lag reduces to 0 megabytes. If you have chosen option 1 for data integrity checks, start your application to write data to the target database.
Member:

> If you have selected option 2 for data integrity checks, stop writing to the source database and immediately start writing to the target database from the application

This is a bit confusing. When you select option 2, you have to wait until the lag becomes 0 and then change the application to switch to the target.

Contributor Author:

Is that not mandatory? If the lag is, let's say, 5 minutes, the application can start writes to target right away and the 5-minute lag catches up in parallel as new writes end up in the target. The only limitation is that during the switchover, they cannot access the tail part of the data as it is in flight (catching up with the lag). By doing this, they wouldn't lose any write data or create gaps in their timeseries data. However, for a moment, until the lag catches up, the in-flight data cannot be accessed.

Member:

This would break transactional consistency and is not recommended. Additionally, you will miss updates/deletes targeting the non-migrated data.

Contributor Author:

Got it, let's mention both as choices: if the user wants 100% transactional consistency, they have to trade off taking the database down for some time; if they want zero downtime during cutover, they have to trade off transactional consistency if any updates/deletes are happening to the latest data.

Member:

I would rather refrain from suggesting anything that causes transactional inconsistencies. IMHO, the whole point of migration with logical decoding is "transactional consistency"; we shouldn't break that.

Member:

I think we should not make this change. It complicates the matter. Let's have just 1 way to confirm data integrity and be 100% confident that all the data in my source/production database is migrated to Timescale.


Once the replication lag is 0, wait for a few minutes and then provide the
signal to proceed by pressing key `c`.
@@ -150,19 +143,5 @@ message if all the mentioned steps were successful.
```sh
Migration successfully completed
```

## 4. Validate the data in target database and use it as new primary

Now that all data has been migrated, the contents of both databases should
be the same. How exactly this should best be validated is dependent on
your application. You could compare the number of rows or an aggregate of
columns to validate that the target database matches with the source.

<Highlight type="important">
Application downtime ends here.
</Highlight>

Once you are confident with the data validation, the final step is to configure
your applications to use the target database.

[Hypertable docs]: /use-timescale/:currentVersion:/hypertables/
[live migration]: https://docs.timescale.com/migrate/latest/live-migration/
44 changes: 13 additions & 31 deletions migrate/live-migration/live-migration-from-timescaledb.md
@@ -14,7 +14,7 @@ import DumpPreDataSourceSchema from "versionContent/_partials/_migrate_pre_data_
import DumpPostDataSourceSchema from "versionContent/_partials/_migrate_post_data_dump_source_schema.mdx";
import LiveMigrationStep2 from "versionContent/_partials/_migrate_live_migration_step2.mdx";

# Live migration from TimescaleDB database with pgcopydb
# Live migration from TimescaleDB database

This document provides detailed instructions to migrate data from your
TimescaleDB database (self-hosted or on [Managed Service for TimescaleDB]) to a
@@ -26,7 +26,8 @@ scripts that you need to perform the live migration.
You should provision a dedicated instance to run the migration steps from.
Ideally an AWS EC2 instance that's in the same region as the Timescale target service.
For an ingestion load of 10,000 transactions/s, and assuming that the historical
data copy takes 2 days, we recommend 4 CPUs with 4 to 8 GiB of RAM and 1.2 TiB of storage.
data copy takes 2 days, we recommend 4 CPUs with 4 to 8 GiB of RAM, and 1.5x the
source data size as storage on the EC2 machine.

<SourceTargetNote />

@@ -114,49 +115,30 @@ start `ANALYZE` on the target database. This updates statistics in the target
which is necessary for optimal querying performance in the target database. Wait for
`ANALYZE` to complete.

<Highlight type="important">
Application downtime begins here.
</Highlight>
## 4. Validate the data in target database and use it as new primary

Once the lag between the databases is below 30 megabytes, and you're ready to
take your applications offline, stop all applications which are writing to the
source database. This is the downtime phase and will last until you have
completed the validation step (4). Be sure to go through the validation step
(4) before you enter the downtime phase to keep the overall downtime minimal.
Once the lag between the databases is below 130 megabytes, we recommend performing data integrity checks. There are two ways to do this:

Stopping writes to the source database allows the live migration process to
finish replicating data to the target database. This will be evident when the
replication lag reduces to 0 megabytes.
1. With downtime: Stop database operations from your application, which will result in downtime. This allows the live migration to catch up on the lag between the source and target databases, enabling the validation checks to be performed. The downtime will last until the lag is eliminated and the data integrity checks are completed.
2. Without downtime: Since the difference between the source and target databases is less than 130 MB, you can perform data integrity checks, excluding the latest data that is still being written. This approach does not require taking your application down.

Now that the data integrity checks are complete, it's time to switch your target database to become the primary one. If you have selected option 2 for data integrity checks, stop writing to the source database and immediately start writing to the target database from the application. This will minimize application downtime to as low as application restart. It allows the live migration process to complete replicating data to the target database, as the source will no longer receive any new transactions. You will know the process is complete when the replication lag reduces to 0 megabytes. If you have chosen option 1 for data integrity checks, start your application to write data to the target database.
Comment on lines -121 to +125
Member:

I don't think we should make this change. It complicates the validation work and promotes partial integrity checks. We should discuss this on a team call if you think it's important to mention the without-downtime option.


Once the replication lag is 0, wait for a few minutes and then provide the
signal to proceed by pressing key `c`.

```sh
[WATCH] Source DB - Target DB => 0MB. Press "c" (and ENTER) to stop live-replay
[WATCH] Source DB - Target DB => 0MB. Press "c" (and ENTER) to proceed
Syncing last LSN in Source DB to Target DB ...
```

The live migration image will continue the remaining work under live replay,
copy TimescaleDB metadata, sequences, and run policies. You should see the
following message if all the mentioned steps were successful.
The live migration image will continue the remaining work that includes
migrating sequences and cleaning up resources. You should see the following
message if all the mentioned steps were successful.

```sh
Migration successfully completed
```
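Since sequences are migrated in this final phase, a quick sketch of a spot check you could run afterwards; the sequence name is a placeholder:

```sh
# Spot-check that a sequence's current value carried over to the target
psql -d "$SOURCE" -t -c "SELECT last_value FROM my_table_id_seq;"
psql -d "$TARGET" -t -c "SELECT last_value FROM my_table_id_seq;"
```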

## 4. Validate the data in target database and use it as new primary

Now that all data has been migrated, the contents of both databases should be the
same. How exactly this should best be validated is dependent on your application.
You could compare the number of rows or an aggregate of columns to validate that
the target database matches with the source.

<Highlight type="important">
Application downtime ends here.
</Highlight>

Once you are confident with the data validation, the final step is to configure
your applications to use the target database.

[Managed Service for TimescaleDB]: https://www.timescale.com/mst-signup/
[live migration]: https://docs.timescale.com/migrate/latest/live-migration/
6 changes: 3 additions & 3 deletions migrate/page-index/page-index.js
@@ -36,19 +36,19 @@ module.exports = [
excerpt: "Migrate a large database with low downtime",
children: [
{
title: "Live migration from PostgreSQL",
title: "From PostgreSQL",
Contributor:

Please also apply this change to the dual-write and backfill pages

href: "live-migration-from-postgres",
excerpt:
"Migrate from PostgreSQL using live migration",
},
{
title: "Live migration from TimescaleDB",
title: "From TimescaleDB",
href: "live-migration-from-timescaledb",
excerpt:
"Migrate from TimescaleDB using live migration",
},
{
title: "(Advanced) Live migration from PostgreSQL manually",
title: "(Advanced) From PostgreSQL manually",
href: "live-migration-from-postgres-manually",
excerpt:
"Migrate from TimescaleDB using live migration manually",
27 changes: 4 additions & 23 deletions migrate/playbooks/rds-timescale-live-migration.md
@@ -346,6 +346,7 @@ ALTER TABLE {table_name} REPLICA IDENTITY FULL;
```

## 3. Set up a replication slot and snapshot

Once you're sure that the tables which will be affected by `UPDATE` and `DELETE`
queries have `REPLICA IDENTITY` set, you will need to create a replication slot.

@@ -372,32 +373,12 @@ Additionally, `follow` command exports a snapshot ID to `/tmp/pgcopydb/snapsho
This ID can be utilized to migrate data that was in the database before the replication
slot was created.
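To confirm the slot exists, a minimal sketch using the standard PostgreSQL catalog view (the `$SOURCE` connection string is as used elsewhere in these docs):

```sh
# Verify the replication slot created by the follow process on the source
psql -d "$SOURCE" -c "SELECT slot_name, plugin, confirmed_flush_lsn, active FROM pg_replication_slots;"
```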

## 4. Migrate roles and schema from source to target
Before applying DML operations from the replication slot, the schema and data from
the source database need to be migrated.
The larger the size of the source database, the more time it takes to perform the
initial migration, and the longer the buffered files need to be stored.

### 4.a Migrate database roles from source database
<LiveMigrationRoles />

### 4.b Dump the database schema from the source database
<DumpPreDataSourceSchema />

### 4.c Load the roles and schema into the target database
```sh
psql -X -d "$TARGET" \
-v ON_ERROR_STOP=1 \
--echo-errors \
-f roles.sql \
-f pre-data-dump.sql
```
## 4. Perform Live Migration

## 5. Perform "live migration"
The remaining steps for migrating data from an RDS Postgres instance to Timescale
with low downtime are the same as the ones mentioned in the "Live migration"
documentation from [Step 5] onwards. You should follow the mentioned steps
documentation from [Step 3] onwards. You should follow the mentioned steps
to successfully complete the migration process.

[live migration]: /migrate/:currentVersion:/live-migration/live-migration-from-postgres/
[Step 5]: /migrate/:currentVersion:/live-migration/live-migration-from-postgres/#5-enable-hypertables
[Step 3]: /migrate/:currentVersion:/live-migration/live-migration-from-postgres/#3-run-the-live-migration-docker-image
Comment on lines -390 to +384
Member:

This is not right. We are asking users to continue from Step 3 in the live-migration-from-postgres page. Step 3 there is:

docker run --rm -dit --name live-migration \
  -e PGCOPYDB_SOURCE_PGURI=$SOURCE \
  -e PGCOPYDB_TARGET_PGURI=$TARGET \
  -v ~/live-migration:/opt/timescale/ts_cdc \
  timescale/live-migration:v0.0.1

Since the above command already runs `pgcopydb follow` internally, we should not ask users to perform step 3 on this page, which also includes `pgcopydb follow`.

The correct way is to ask users to continue from Step 3 in the live-migration-from-postgres page starting right at step 3 on this page (that is, by replacing step 3 on this page).