pgcp
is a simple CLI tool to copy specific column data from a source table with different schemas/columns.
The destination table won't be created before copying, you must create it beforehand using psql
or similar. Also, this tool requires a reasonable understanding of the structure of your source and destination databases to ensure accurate mapping.
Data will be converted to string and then inserted building a simple INSERT
sql query, so your table/column destination schema should (ideally) match the source one beforehand.
pgcp
uses environment variables for source PG_FROM_DATABASE
and destination PG_TO_DATABASE
database connections.
You can also use args --from
and --to
to accomplish the same.
export PG_FROM_DATABASE=postgres://user:password@localhost:5432/fromdatabase
export PG_TO_DATABASE=postgres://user:password@other_db:5432/todatabase
The mapping of source to destination tables and columns is done using a simple source:destination
syntax.
pgcp --table collections:collections \
--col id:id \
--col id/metadata_jsons.name:name \
--col project_id:project_id --col created_at:timestamp \
--update col_name=from_val:dest_col=replace_with_value
--static acolumn=some_data
In the above example, collections table from the source database is mapped to the collections table in the destination database. The id
column in the source will be copied to the id
column in the destination, same for project_id
. created_at
will be copied into timestamp
column in the destination DB.
A more 'complex' column mapping is demonstrated in the below example, with id/metadata_jsons.name:/name
which shows how to map a column from a joined table in the source database, where we get the name
of the collection based on the id
from the table we have as source, in this case the collections
table.
The --static acolumn=some_data
option allows you to specify a static column and value that will be added to every row in the destination database.
The --update
argument can be used to do a find/replace of a value before inserting it in the destination table.
This results in the following SQL queries:
Read from source:
SELECT collections.id, metadata_jsons.name AS metadata_jsons_id_name,
project_id, created_at, 'some_data' AS acolumn FROM collections
INNER JOIN metadata_jsons ON collections.id = metadata_jsons.id
Write to dest:
INSERT INTO collections (id, name, project_id, timestamp, acolumn)
VALUES ('2f7f1a90-f6a3-4f94-83b6-77bc9e289c2d', 'Demo',
'da6cf455-971c-46a4-9ed7-fe0e9b5f5548', '2023-03-07 16:57:27.762190 UTC', 'some_data')
Install Rust by following the instructions in the getting started page
Clone and compile/install.
## Clone the repository
git clone https://github.com/mpwsh/pgcp
## cd into the folder and install with cargo
cd pgcp
cargo install --path .
## or just build and run from ./target/debug
cargo build
./target/debug/pgcp
Use at your own risk. Error handling is practically non-existent, but postgres will complain if somethings wrong.
See LICENSE