
SSLR - Simple Stupid Logical Replication

A simple stupid logical replication tool for PostgreSQL databases.

With so many existing solutions for replicating PostgreSQL databases, why did I create SSLR?

Well, maybe you are in a situation like mine:

  • you need to keep a read replica updated, but can't set up a real replica
  • you want to replicate a subset of tables from a database
  • you want to filter which rows to replicate
  • you want a minimal-config solution for getting a usable copy of your production data
  • you want to replicate without causing unnecessary load on the source database
  • you have somewhat loose referential consistency requirements

SSLR is a small tool that provides throttled, filtered replication using a regular read-only connection. No PostgreSQL server-side configuration needed.

Getting started

1. Download SSLR

Get pre-built binaries for Linux, Windows or OSX.

Or install using your local Go toolchain:

$ go get -u github.com/erkkah/sslr

2. Create a config

Create a JSON file called sslr.json containing your replication config.

A minimal config looks something like this:

{
    "source": "postgres://postgres:super-secret@localhost:2345/test?default_transaction_read_only=true",
    "target": "postgres://postgres:super-secret@localhost:2345/sslr",

    "tables": [
        "timestamps"
    ]
}

Use the -cfg flag to specify a different configuration file.
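
For example (the path is just an illustration):

$ sslr -cfg /path/to/my-replication.json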

3. Run the tool

Then just run the tool. It will create target tables as needed and do a full sync of all configured tables.

$ sslr
[INFO] 2020/10/08 14:31:43 Starting job with throttle at 75.00%
[INFO] 2020/10/08 14:31:43 Changes are synced in chunks of 10000
[INFO] 2020/10/08 14:31:43 Deletions are synced in chunks of 50000
[INFO] 2020/10/08 14:31:43 Connecting
[INFO] 2020/10/08 14:31:43 Validating tables
[INFO] 2020/10/08 14:31:43 Updating tables
[INFO] 2020/10/08 14:31:43 Fetching update range for table timestamps
[INFO] 2020/10/08 14:31:43 Syncing deletions for table timestamps
[INFO] 2020/10/08 14:31:43 Done
[INFO] 2020/10/08 14:31:43 0 row(s) updated in 160.154866ms

If run with -c, the tool runs continuously instead of making just a single pass.

Replication options

SSLR is meant to be run regularly to poll the source database and update the target database. All operations are performed in chunks, and can be throttled to limit the load on the source database.

In many cases, the basic configuration described above works just fine. Depending on the source data structure, table size and optional filtering, you might need to tweak a couple of options to increase performance.

Chunking

Updates are fetched in chunks of updateChunkSize rows. Each source transaction, identified by the source rows' xmin value, has to be synced as a whole before being committed. A transaction can contain more rows than the chunk size; it will just take more chunks to sync the whole transaction.

Deletions are scanned for starting at chunks of deleteChunkSize rows. If a chunk is found to contain changes, it is repeatedly split in half to track down the minDeleteChunkSize-sized chunks that contain the actual changes.
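
As a rough illustration (not SSLR's actual code), the split-in-half scan can be sketched in Go like this; hashRange stands in for a query that hashes all primary keys in a range on one side, and dense integer keys are assumed for simplicity:

// Sketch of the divide-and-conquer deletion scan; hashRange is a stand-in for
// a query that hashes all primary keys in [first, last] on the given side.
func findMismatchedRanges(firstKey, lastKey, minChunk int64,
    hashRange func(side string, first, last int64) string) [][2]int64 {

    var mismatched [][2]int64
    var scan func(first, last int64)
    scan = func(first, last int64) {
        // Identical hashes: no deletions (or other changes) in this range.
        if hashRange("source", first, last) == hashRange("target", first, last) {
            return
        }
        // Small enough: remember the range so it can be resynced.
        if last-first+1 <= minChunk {
            mismatched = append(mismatched, [2]int64{first, last})
            return
        }
        // Otherwise split in half and keep narrowing down.
        mid := first + (last-first)/2
        scan(first, mid)
        scan(mid+1, last)
    }
    scan(firstKey, lastKey)
    return mismatched
}

For each mismatched range, SSLR then deletes all in-range rows in the target and re-adds them from the source, as described in the algorithm section below.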

Throttling

The throttle level is set as the maximum allowed time spent in the source database as a percentage of total execution time.

It's pretty crude, but works well enough to make SSLR a nice database citizen.

If you can fully load the source database while running a sync job, set the throttle percentage value to 100 for unthrottled operation.
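
As a rough sketch of the calculation (an illustration, not SSLR's actual code): after spending time in the source database, pause long enough to keep the busy share at or below the configured percentage.

package main

import (
    "fmt"
    "time"
)

// throttlePause returns how long to pause so that busy time stays at or
// below pct percent of total elapsed time (busy + other + pause).
// pct is expected to be in (0, 100].
func throttlePause(busy, other time.Duration, pct float64) time.Duration {
    if pct >= 100 {
        return 0 // unthrottled
    }
    neededTotal := time.Duration(float64(busy) * 100 / pct)
    pause := neededTotal - busy - other
    if pause < 0 {
        return 0
    }
    return pause
}

func main() {
    // 3s spent in the source, nothing else yet, throttled at 75%:
    // total time must be at least 4s, so pause for 1s.
    fmt.Println(throttlePause(3*time.Second, 0, 75))
}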

Job splitting

A replication job runs through all tables one at a time. Replicating many large tables will lead to long complete sync cycles. In those cases, it might make sense to split the sync job into several different SSLR configurations.

As long as the same table is not synced by more than one job, jobs can run in parallel.

Since scanning for deleted rows can be slow for large tables, you can split your replication into update-only and delete-only jobs and run several update passes before running deletes, or even run delete jobs only once a week, depending on your needs.
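
For example, an update-only job could use a config along these lines, built from the syncUpdates and syncDeletes options shown in the full configuration example below; a matching delete-only config would flip the two flags and run on a slower schedule:

{
    "source": "postgres://postgres:super-secret@localhost:2345/test?default_transaction_read_only=true",
    "target": "postgres://postgres:super-secret@localhost:2345/sslr",

    "tables": [
        "timestamps"
    ],

    "/* Update-only job; a separate delete-only config would flip these two flags ":"*/",
    "syncUpdates": true,
    "syncDeletes": false
}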

Logging

To get more feedback while tweaking options, use the LOG_LEVEL environment variable to set the log level to debug.
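
For example, in a Unix-like shell:

$ LOG_LEVEL=debug sslr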

Documented full configuration example

{
    "/* Connection URLS ":"*/",
    "source": "postgres://postgres:super-secret@localhost:2345/test?default_transaction_read_only=true",
    "target": "postgres://postgres:super-secret@localhost:2345/sslr",

    "/* List of tables to sync ":"*/",
    "tables": [
        "timestamps"
    ],

    "/* Everything below is optional ":"*/",

    "/* Filtered tables, with 'where' clause ":"*/",
    "/* If the 'where' clause depends on other tables, they need to be specified in the 'uses' list ":"*/",
    "filteredTables": {
        "strings": {
            "where": "exists (select count(*) from timestamps)",
            "uses": [
                "timestamps"
            ]
        }
    },

    "/* New or updated rows will be applied using this chunk size ":"*/",
    "updateChunkSize": 10000,

    "/* Source table will be scanned for deleted rows starting at this chunk size ":"*/",
    "deleteChunkSize": 50000,

    "/* Chunk size for when to stop divide and conquer while scanning for deleted rows ":"*/",
    "/* This is also the chunk size used for applying changes due to deletions ":"*/",
    "minDeleteChunkSize": 250,

    "/* Max source database utilization as a percentage of total execution time ":"*/",
    "throttlePercentage": 75,

    "/* Full table copy will be performed when target table has less than (fullCopyThreshold * source_rows) rows ":"*/",
    "fullCopyThreshold": 0.5,

    "/* Sync added and updated rows ":"*/",
    "syncUpdates": true,

    "/* Sync deleted rows ":"*/",
    "syncDeletes": true,

    "/* Perform full table copy instead of exiting when schema changes are detected ":"*/",
    "resyncOnSchemaChange": false,

    "/* Name of SSLR state table in the target database":"*/",
    "stateTable": "__sslr_state"
}

How does it work?

SSLR is based on per-table xmin tracking and primary key hashing. Source table updates are discovered by tracking last seen xmin. Source table deletes are discovered by comparing hashes on ranges of primary keys in a divide and conquer fashion.

The current state (highest synced xmin per table) is stored in the target database.

Algorithm

For each table:

  • Create destination
    • Extract create statements from source table
    • Create target table if needed using extracted statements
    • If target exists, compare current structure
      • Resync full table or abort if the structure has changed
  • Update
    • Get highest seen target xmin from state table
    • Get current highest source xmin
    • Pull needed changes and update in chunks
  • Delete
    • Generate list of hashes for primary keys in chunks
    • Compare hashes between source and target
    • Where they differ, split the list and repeat until minimum list size is found
    • Update the resulting ranges of ids in transactions:
      • Delete all in-range rows in target
      • Re-add all in-range rows from source to target
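
As a rough sketch of the update step (illustrative Go, not SSLR's actual implementation), assuming xmin can be compared via a text-to-bigint cast and, like SSLR itself, ignoring xid wraparound:

package sketch

import (
    "database/sql"
    "fmt"
)

// pullUpdates fetches rows changed since the last synced xmin.
// In SSLR the result is pulled in chunks of updateChunkSize, applied to
// the target, and the new high-water xmin is stored in the state table.
func pullUpdates(source *sql.DB, table string, lastSyncedXmin int64) (*sql.Rows, int64, error) {
    // Current highest xmin in the source table.
    var maxXmin int64
    row := source.QueryRow(fmt.Sprintf(
        `SELECT coalesce(max(xmin::text::bigint), 0) FROM %s`, table))
    if err := row.Scan(&maxXmin); err != nil {
        return nil, 0, err
    }
    if maxXmin <= lastSyncedXmin {
        return nil, maxXmin, nil // nothing new to sync
    }
    // All rows touched by transactions newer than the last synced one.
    rows, err := source.Query(fmt.Sprintf(
        `SELECT * FROM %s WHERE xmin::text::bigint > $1 AND xmin::text::bigint <= $2`, table),
        lastSyncedXmin, maxXmin)
    if err != nil {
        return nil, 0, err
    }
    return rows, maxXmin, nil
}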

Known issues

  • Since replication is done table by table, there are moments of referential inconsistency in the target database
    • If you need consistent, valid data at all times, use real replication
  • As the target is meant for reading only, no triggers, constraints, etc. other than primary keys are copied to the target
  • xmin wraparound is not handled
  • Full table copying is not throttled
