heckj/CRDT

CRDT

An implementation of ∂-state based Conflict-free Replicated Data Types (CRDT) in the Swift language.

Overview

This library implements well-known state-based CRDTs, sometimes described as convergent replicated data types (CvRDT), as Swift generics. The implementation includes delta-state replication functions, which allow for more compact representations when syncing between collaboration endpoints. The alternative is to replicate the entire state for every sync.
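To make the difference between full-state and delta-state sync concrete, here is a minimal, self-contained sketch of a grow-only counter. It is not this library's API: the type `SketchGCounter`, its `delta(since:)` method, and the use of `String` actor identifiers are all hypothetical names chosen for illustration.

```swift
// Hypothetical sketch of a delta-state G-Counter; names are illustrative only
// and do not reflect this package's actual types or method signatures.
struct SketchGCounter<ActorID: Hashable> {
    // One monotonically increasing count per collaborating actor.
    private(set) var counts: [ActorID: UInt] = [:]
    let selfId: ActorID

    init(actor: ActorID) {
        self.selfId = actor
    }

    // The counter's value is the sum of all per-actor counts.
    var value: UInt { counts.values.reduce(0, +) }

    mutating func increment() {
        counts[selfId, default: 0] += 1
    }

    // Merge is a per-actor maximum, so it is commutative, associative, and idempotent.
    // The same function accepts either a full state or a delta.
    mutating func merge(_ incoming: [ActorID: UInt]) {
        for (actor, count) in incoming {
            counts[actor] = max(counts[actor] ?? 0, count)
        }
    }

    // Delta: only the entries where this replica is ahead of the remote state.
    func delta(since remote: [ActorID: UInt]) -> [ActorID: UInt] {
        counts.filter { $0.value > (remote[$0.key] ?? 0) }
    }
}

var a = SketchGCounter(actor: "a")
var b = SketchGCounter(actor: "b")
a.increment()
a.increment()
b.increment()

// Replica a sends only what b is missing, rather than its whole state.
let update = a.delta(since: b.counts)
b.merge(update)
```

In this sketch the delta carries a single entry for actor "a", while a full-state sync would send every entry. With only two actors the saving is small, but it grows with the number of collaborators and the size of the data type.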

The CRDT API documentation is hosted at the Swift Package Index.

  • G-Counter (grow-only counter)
  • PN-Counter (positive-negative counter)
  • LWW-Register (last write wins register)
  • G-Set (grow-only set)
  • OR-Set (observed-remove set, with LWW add bias)
  • OR-Map (observed-remove map, with LWW add or update bias)
  • List (causal-tree list)
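As a rough illustration of how one of the simpler types above behaves, the following sketch shows a last-write-wins register. It assumes caller-supplied logical timestamps and breaks ties by comparing actor identifiers. The type `SketchLWWRegister` and all of its members are hypothetical and are not taken from this library.

```swift
// Hypothetical sketch of an LWW-Register; not this package's API.
// Assumes logical timestamps supplied by the caller and a Comparable actor ID for ties.
struct SketchLWWRegister<ActorID: Hashable & Comparable, Value> {
    private(set) var value: Value
    private(set) var timestamp: UInt64
    private(set) var writer: ActorID
    let selfId: ActorID

    init(_ value: Value, actor: ActorID, timestamp: UInt64) {
        self.value = value
        self.timestamp = timestamp
        self.writer = actor
        self.selfId = actor
    }

    // Record a local write with the given logical time.
    mutating func set(_ newValue: Value, at time: UInt64) {
        value = newValue
        timestamp = time
        writer = selfId
    }

    // Keep whichever write has the greater (timestamp, writer) pair.
    // The writer comparison makes concurrent writes with equal timestamps resolve
    // the same way on every replica.
    mutating func merge(_ other: Self) {
        if (other.timestamp, other.writer) > (timestamp, writer) {
            value = other.value
            timestamp = other.timestamp
            writer = other.writer
        }
    }
}

var r1 = SketchLWWRegister("draft", actor: "a", timestamp: 1)
var r2 = SketchLWWRegister("draft", actor: "b", timestamp: 1)
r1.set("from a", at: 5)
r2.set("from b", at: 5)

// Merging in either direction converges on the same value.
r1.merge(r2)
r2.merge(r1)
```

Both replicas end up holding "from b" here, because the timestamps tie and "b" compares greater than "a". A real implementation would also need a clock strategy that keeps local timestamps increasing, which this sketch leaves to the caller.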

For more information on CRDTs, the Wikipedia page on CRDTs is quite good. I'd also suggest the website CRDT.tech as a wonderful collection of further resources. The implementations within this library were heavily based on algorithms described in Conflict-free Replicated Data Types by Nuno Preguiça, Carlos Baquero, and Marc Shapiro (2018), and heavily influenced/sourced from the package ReplicatingTypes, created by Drew McCormack, used under license (MIT).

What's Different about this Package

The two most notable changes from Drew's code are:

  • consistently exposing the type used to identify the collaboration instance (be that person, process, or machine) as a generic type
  • adding explicit delta-state transfer mechanisms so that you don't need to transfer the entirety of a CRDT instance to another location in order to merge the data.

Like the ReplicatingTypes package, this package is available under the MIT license for you to use as you like, asking only for recognition that it was sourced.

If your goal is creating local-first software, this implementation is a start, but (in my opinion) incomplete for those needs. In particular, it includes none of the serialization optimizations that would reduce the space the instances need when serialized in their entirety for storage. It also includes none of the optimizations that other libraries (for example Automerge or Yjs) use to reduce the memory overhead of longer-form collaborative text interactions.

These limitations may change in the future, and contributions are welcome.

Alternative Packages and Libraries

Other Swift implementations of CRDTs:

Two very well established CRDT libraries used for collaborative text editing:

Optimizations

Articles discussing tradeoffs, algorithm details, and performance, specifically for sequence based CRDTs:

Benchmarks

Running the benchmark library:

swift run -c release crdt-benchmark library run Benchmarks/results.json --library Benchmarks/Library.json --cycles 5 --mode replace-all
swift run -c release crdt-benchmark library render Benchmarks/results.json --library Benchmarks/Library.json --output Benchmarks

Current Benchmarks

There are also stubbed benchmarks using package-benchmark under the ExternalBenchmarks directory. These additional benchmarks are primarily one-dimensional and require an additional library (jemalloc) to be installed in order to run. If you just want to explore, the .devContainer setting in this repository includes that library, so it's easy to try this out from within VSCode and Docker. To explore the one-dimensional external benchmarks:

cd ExternalBenchmarks
swift package benchmark