A utility to help with exporting, importing and merging CSV data into a PostgreSQL database.

samuller/pgmerge

pgmerge - a PostgreSQL data import and merge utility

This utility's main purpose is to manage a set of CSV files that correspond to tables in a PostgreSQL database:

  • The data in each CSV file can be merged into its corresponding table (see "Import merging" below).
  • The database schema is analysed to determine dependencies among tables (due to foreign keys), and the CSV files are then imported in an order that ensures any tables they depend on have been imported first.
  • pgmerge can also export data in the same format that it expects for import.
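
The dependency ordering described above amounts to a topological sort of the tables by their foreign keys. The following is a minimal sketch of that idea using Python's standard library, not pgmerge's actual implementation; the table names and the `import_order` helper are hypothetical and exist only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical schema: each table maps to the set of tables
# that its foreign keys reference.
foreign_keys = {
    "countries": set(),
    "products": set(),
    "customers": {"countries"},
    "orders": {"customers"},
    "order_items": {"orders", "products"},
}


def import_order(dependencies):
    """Return table names so that each table follows the tables it references."""
    # static_order() raises graphlib.CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(dependencies).static_order())


order = import_order(foreign_keys)
print(order)
```

Several valid orderings can exist for the same schema. The only guarantee is that a referenced table appears before any table that references it.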

These features allow you to move data between databases that share the same schema, keeping them up to date and in sync. Note that pgmerge does not handle deleted data.

Import merging

Merging CSVs into a table means that the following process will occur (also called an upsert operation):

  • Rows whose primary key doesn't yet exist in the table will be imported.
  • When the primary key already exists, the row's values will be updated.
  • Rows that are missing from the CSV file, or whose values are unchanged, will be ignored.
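
The merge rules above can be illustrated with a small in-memory sketch. It models a table as a dictionary keyed by primary key and is not how pgmerge talks to the database; the `merge_rows` function, the `id` key column and the sample rows are all assumptions made for this example.

```python
def merge_rows(table, csv_rows, key="id"):
    """Upsert csv_rows into table (a dict keyed by primary key) and count outcomes."""
    counts = {"inserted": 0, "updated": 0, "unchanged": 0}
    for row in csv_rows:
        pk = row[key]
        if pk not in table:
            # Primary key not present yet: import the row.
            table[pk] = dict(row)
            counts["inserted"] += 1
        elif table[pk] != row:
            # Primary key exists with different values: update the row.
            table[pk] = dict(row)
            counts["updated"] += 1
        else:
            # Identical row: nothing to do.
            counts["unchanged"] += 1
    return counts


# Hypothetical table contents and CSV rows.
table = {
    1: {"id": 1, "name": "Alice"},
    2: {"id": 2, "name": "Bob"},
    3: {"id": 3, "name": "Carol"},
}
csv_rows = [
    {"id": 2, "name": "Bob"},       # same values as in the table
    {"id": 3, "name": "Caroline"},  # existing key, changed value
    {"id": 4, "name": "Dave"},      # new key
]

counts = merge_rows(table, csv_rows)
print(counts)
```

Row 1 is absent from the CSV rows and is therefore left in the table as it was, which reflects the point that deletions are not propagated.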

CLI arguments

$ pgmerge --help
Usage: pgmerge [OPTIONS] COMMAND [ARGS]...

Merges data in CSV files into a Postgresql database.

Options:
--version  Show the version and exit.
--help     Show this message and exit.

Commands:
export  Export each table to a CSV file.
import  Import/merge each CSV file into a table.
inspect  Inspect database schema in various ways.

Import

$ pgmerge import --help
Usage: pgmerge import [OPTIONS] DIRECTORY [TABLES]...

Import/merge each CSV file into a table.

All CSV files need the same name as their matching table and have to be located
in the given directory. If one or more tables are specified then only they will
be used, otherwise all tables found will be selected.

Options:
-d, --dbname TEXT               Database name to connect to.  [required]
-h, --host TEXT                 Database server host or socket directory.
                                [default: localhost]
-p, --port TEXT                 Database server port.  [default: 5432]
-U, --username TEXT             Database user name.  [default: postgres]
-s, --schema TEXT               Database schema to use.  [default: public]
-w, --no-password               Never prompt for password (e.g. peer
                                authentication).
-W, --password TEXT             Database password (default is to prompt for
                                password or read config).
-L, --uri TEXT                  Connection URI can be used instead of specifying
                                parameters separately (also sets --no-password).
-f, --ignore-cycles             Don't stop import when cycles are detected in
                                schema (will still fail if there are cycles in
                                data)
-F, --disable-foreign-keys      Disable foreign key constraint checking during
                                import (necessary if you have cycles, but requires
                                superuser rights).
-c, --config PATH               Config file for customizing how tables are
                                imported/exported.
-i, --include-dependent-tables  When selecting specific tables, also include all
                                tables on which they depend due to foreign key
                                constraints.
--help                          Show this message and exit.
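
As a rough illustration of how these options combine, the sketch below assembles an import command line from Python. It uses only flags listed in the help text above and follows the `pgmerge import [OPTIONS] DIRECTORY [TABLES]...` usage line; the `build_import_command` helper, the database name `mydb`, the directory `exported_csvs` and the table `orders` are hypothetical.

```python
import shlex


def build_import_command(dbname, directory, tables=(), host=None, port=None,
                         username=None, schema=None,
                         include_dependent_tables=False):
    """Build an argument list for 'pgmerge import' (a sketch, not part of pgmerge)."""
    cmd = ["pgmerge", "import", "--dbname", dbname]
    # Only pass options that were set, so pgmerge's own defaults apply otherwise.
    optional = (("--host", host), ("--port", port),
                ("--username", username), ("--schema", schema))
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    if include_dependent_tables:
        cmd.append("--include-dependent-tables")
    # Positional arguments: the CSV directory, then any specific tables.
    cmd.append(directory)
    cmd.extend(tables)
    return cmd


command = build_import_command(
    "mydb", "exported_csvs", tables=["orders"],
    username="postgres", include_dependent_tables=True,
)
print(shlex.join(command))
```

The resulting list could be passed to `subprocess.run(command, check=True)` on a system where pgmerge is installed. With this example, the directory would need to contain `orders.csv` along with CSV files for the tables that `orders` depends on.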

Installation

WARNING: the reliability of this utility is not guaranteed and loss or corruption of data is always a possibility.

Install from PyPI

With Python 3 installed on your system, you can run:

pip install pgmerge

To check that the installation worked, run:

pgmerge --help

and you can uninstall at any time with:

pip uninstall pgmerge

Install from GitHub

To install the newest code directly from GitHub:

pip install git+https://github.com/samuller/pgmerge

Issues

If you have trouble installing and you're running a Debian-based Linux that uses Python 2 as its system default, then you might need to run:

sudo apt install libpq-dev python3-pip python3-setuptools
sudo -H pip3 install pgmerge
