Skip to content

jgdavey/clj-pgcopy

Repository files navigation

clj-pgcopy

Clojars

Import data into postgres quickly, implemented using postgresql’s COPY in binary format.

Because sometimes jdbc/insert! and friends aren’t fast enough.

This library uses type-based dispatch for determining the correct postgres binary format. See the mapping section for more info.

Usage

Add the library to your project. See description on Clojars for more information.

Given a table and some data:

(require '[clj-pgcopy.core :as pgcopy])

(jdbc/with-db-connection [conn conn-spec]
  (jdbc/db-do-commands conn
                       ["drop table if exists example"
                        "create table example(
  internal_id bigint primary key,
  external_id uuid,
  height float8,
  title varchar(64) not null,
  description text,
  values bytea,
  dob date,
  created_at timestamptz
)"]))

(def data
  [{:internal_id 201223123
    :external_id #uuid "1902c205-2bc6-40b8-943b-f5b199241316"
    :height nil
    :title "Mr. Sandman"
    :description nil
    :values (.getBytes "not very secret" "UTF-8")
    :dob (java.time.LocalDate/of 1954 8 20)
    :created_at (java.util.Date.)}
   {:internal_id 2012391238
    :external_id nil
    :height 160.2
    :title "Prince"
    :description "Tonight we're gonna party"
    :values (.getBytes "2000 Two Zero" "UTF-8")
    :dob (java.time.LocalDate/of 1999 12 31)
    :created_at (java.util.Date.)}])

With clojure.java.jdbc, open a connection, prepare data rows (as tuples, not maps), and call clj-pgcopy.core/copy-into!:

(let [columns [:internal_id :external_id :height
               :title :description :values :dob :created_at]]
  (jdbc/with-db-connection [conn conn-spec]
    (pgcopy/copy-into! (:connection conn)
                       :example
                       columns
                       (map (apply juxt columns) data))))
2

The table has been populated with the data:

(jdbc/with-db-connection [conn conn-spec]
  (jdbc/query conn "table example"))
({:internal_id 201223123,
  :external_id #uuid "1902c205-2bc6-40b8-943b-f5b199241316",
  :height nil,
  :title "Mr. Sandman",
  :description nil,
  :values
  [110, 111, 116, 32, 118, 101, 114, 121, 32, 115, 101, 99, 114, 101,
   116],
  :dob #inst "1954-08-20T04:00:00.000-00:00",
  :created_at #inst "2019-07-23T01:24:38.466000000-00:00"}
 {:internal_id 2012391238,
  :external_id nil,
  :height 160.2,
  :title "Prince",
  :description "Tonight we're gonna party",
  :values [50, 48, 48, 48, 32, 84, 119, 111, 32, 90, 101, 114, 111],
  :dob #inst "1999-12-31T05:00:00.000-00:00",
  :created_at #inst "2019-07-23T01:24:38.466000000-00:00"})

Note: depending on how you’ve set up clojure.java.jdbc and its IResultSetReadColumn protocol, the types that come back on query may differ from the above.

Input Type mapping

Basic type mapping

JVM typePostgres type
Shortint2 (aka smallint)
Integerint4 (aka integer)
Longint8 (aka bigint)
Floatfloat4 (aka real)
Doublefloat8 (aka double presicion)
BigDecimalnumeric/decimal
Booleanboolean
Stringtext/varchar/char
java.util.UUIDuuid

Date-related mappings

JVM typePostgres type
java.sql.Datedate
java.time.LocalDatedate
java.util.Datetimestamp[tz]
java.sql.Timestamptimestamp[tz]
java.time.Instanttimestamp[tz]
java.time.ZonedDateTimetimestamp[tz]
java.time.OffsetDatetimetimestamp[tz]
org.postgres.util.PGIntervalinterval

Other JVM native types

JVM typePostgres type
java.net.Inet4Addressinet
java.net.Inet6Addressinet

Geometric mappings

JVM typePostgres type
org.postgres.geometric.PGpointpoint
org.postgres.geometric.PGlineline
org.postgres.geometric.PGpathpath
org.postgres.geometric.PGboxbox
org.postgres.geometric.PGcirclecircle
org.postgres.geometric.PGpolygonpolygon

Things that are String-like, or serialized in string form, should work using the String -> text mapping. An exception is the jsonb type, because the binary format requires a version signifier. Wrapping a JSON string in a JsonB handles that, which is provided by the library.

Arrays

Impemented for the following JVM-typed arrays for:

JVM typePostgres type
int[]int4[] (aka integer[])
long[]int8[] (aka bigint[])
float[]float4[] (aka real[])
double[]float8[] (aka double precision[])
byte[]bytea
String[]text[] (or varchar[])
java.util.UUID[]uuid[]

Currently, only 1-dimensional Postgres arrays are supported.

Planned But Not Yet Implemented

  • more array types (date, timestamp, etc)
  • range types

Unplanned

  • hstore
  • cidr, macaddr, macaddr8
  • bit strings
  • composite types / records
  • multi-dimensional arrays