GitHub - nurturenature/jepsen-causal-consistency: Jepsen Tests for Causal Consistency

Jepsen Tests for Causal Consistency

Designed for testing local first systems, CRDTs, and distributed syncing.

Jepsen has an established history of testing databases.

These tests have focused on stronger levels of consistency, e.g. snapshot-isolation, linearizability, and serializability.

This project explores using Jepsen to test for Causal Consistency, with Strong Convergence, and atomic transactions (Monotonic Atomic View).

Uses Elle, Jepsen's Transactional Consistency Checker

Adopts Adya's Consistent View(PL-2+) as the base consistency model

Level PL-2+ ensures that a transaction is placed after all transactions that causally affect it, i.e., it provides a notion of “causal consistency”.

-- Adya, Weak Consistency

Modifies Elle's consistency model graph

Causal Consistency requires process order.

But Adya doesn't include process order and anomalies with many models, importantly to us, Consistent View(PL-2+).

Elle has already added a strong-session, process order and process anomalies, to most stronger models.

So we complete Elle's existing consistency models by adding a strong-session-consistent-view:

adds process graph and process variants of anomalies
fills in the gap between existing stronger and weaker forms of strong-session consistency models

Building the Directed Acyclic Graph of Causal Dependencies

We combine:

process order
w->r order, the write of [k v] happens before all reads of [k v]
w->w order, a write of [k v] is ordered after the writes of all previously observed [k v] in the process
- (writes follow reads)
r->w order, infer that all reads that don't observe [k v] happen before the write of [k v]
- ordering edges are only created for one process at a time
- each process requires checking itself against the full graph for cycles
  - multiple cycle checks are expensive
- necessary for causal to reflect a per process view of other processes
r->r order, inferred through combining process + wr + ww + rw
- use of process + wr + ww + rw is
  - better at catching anomalies across diverse keys in transactions
  - provides more explicit explanation of cycle dependencies

Adya Anomalies Expressed

Read Your Writes

[:G-single-item-process]

[{:process 0, :type :ok, :f :txn, :value [[:w :x 0]],   :index 1}
 {:process 0, :type :ok, :f :txn, :value [[:r :x nil]], :index 3}]

G-single-item-process
Let:
  T1 = {:index 3, :time -1, :type :ok, :process 0, :f :txn, :value [[:r :x nil]]}
  T2 = {:index 1, :time -1, :type :ok, :process 0, :f :txn, :value [[:w :x 0]]}

Then:
  - T1 < T2, because T1's read of [:x nil] did not observe T2's write of [:x 0] (r->w).
  - However, T2 < T1, because process 0 executed T2 before T1: a contradiction!

Monotonic Writes

[:G-single-item-process]

[{:process 0, :type :ok, :f :txn, :value [[:w :x 0]], :index 1}
 {:process 0, :type :ok, :f :txn, :value [[:w :x 1]], :index 3}
 {:process 1, :type :ok, :f :txn, :value [[:r :x #{1}]], :index 5}
 {:process 1, :type :ok, :f :txn, :value [[:r :x #{0 1}]], :index 7}]

G-single-item-process
Let:
  T1 = {:index 5, :time -1, :type :ok, :process 1, :f :txn, :value [[:r :x #{1}]]}
  T2 = {:index 1, :time -1, :type :ok, :process 0, :f :txn, :value [[:w :x 0]]}
  T3 = {:index 3, :time -1, :type :ok, :process 0, :f :txn, :value [[:w :x 1]]}

Then:
  - T1 < T2, because T1's read of [:x #{1}] did not observe T2's write of [:x 0] (r->w).
  - T2 < T3, because process 0 executed T2 before T3.
  - However, T3 < T1, because T3's write of [:x 1] was read by T1 (w->r): a contradiction!

Monotonic Reads

[:G-single-item-process]

[{:process 0, :type :ok, :f :txn, :value [[:w :x 0]], :index 1}
 {:process 1, :type :ok, :f :txn, :value [[:w :x 1]], :index 3}
 {:process 2, :type :ok, :f :txn, :value [[:r :x #{0}]], :index 5}
 {:process 2, :type :ok, :f :txn, :value [[:r :x #{0 1}]], :index 7}
 {:process 2, :type :ok, :f :txn, :value [[:r :x #{1}]], :index 9}]

G-single-item-process
Let:
  T1 = {:index 9, :time -1, :type :ok, :process 2, :f :txn, :value [[:r :x #{1}]]}
  T2 = {:index 1, :time -1, :type :ok, :process 0, :f :txn, :value [[:w :x 0]]}
  T3 = {:index 7, :time -1, :type :ok, :process 2, :f :txn, :value [[:r :x #{0 1}]]}

Then:
  - T1 < T2, because T1's read of [:x #{1}] did not observe T2's write of [:x 0] (r->w).
  - T2 < T3, because T2's write of [:x 0] was read by T3 (w->r).
  - However, T3 < T1, because process 2 executed T3 before T1: a contradiction!

Writes Follow Reads

[:G-single-item-process :G-single-item ]

[{:process 0, :type :ok, :f :txn, :value [[:w :x 0]], :index 1}
 {:process 1, :type :ok, :f :txn, :value [[:r :x #{0}]], :index 3}
 {:process 1, :type :ok, :f :txn, :value [[:w :y 0]], :index 5}
 {:process 2, :type :ok, :f :txn, :value [[:r :y #{0}]], :index 7}
 {:process 2, :type :ok, :f :txn, :value [[:r :x nil]], :index 9}]

G-single-item-process
Let:
  T1 = {:index 9, :time -1, :type :ok, :process 2, :f :txn, :value [[:r :x nil]]}
  T2 = {:index 1, :time -1, :type :ok, :process 0, :f :txn, :value [[:w :x 0]]}
  T3 = {:index 5, :time -1, :type :ok, :process 1, :f :txn, :value [[:w :y 0]]}
  T4 = {:index 7, :time -1, :type :ok, :process 2, :f :txn, :value [[:r :y #{0}]]}

Then:
  - T1 < T2, because T1's read of [:x nil] did not observe T2's write of [:x 0] (r->w).
  - T2 < T3, because T2's write of [:x 0] was observed by process 1 before it executed T3 (wfr).
  - T3 < T4, because T3's write of [:y 0] was read by T4 (w->r).
  - However, T4 < T1, because process 2 executed T4 before T1: a contradiction!

writes follow reads G-single-item-process

write -> read

[:G1c-process, :G0]

[{:process 0, :type :ok, :f :txn, :value [[:r :x #{0}]], :index 1, :time -1}
 {:process 0, :type :ok, :f :txn, :value [[:w :x 0]], :index 3, :time -1}]

G1c-process
Let:
  T1 = {:index 1, :time -1, :type :ok, :process 0, :f :txn, :value [[:r :x #{0}]]}
  T2 = {:index 3, :time -1, :type :ok, :process 0, :f :txn, :value [[:w :x 0]]}

Then:
  - T1 < T2, because process 0 executed T1 before T2.
  - However, T2 < T1, because T2's write of [:x 0] was read by T1 (w->r): a contradiction!

Internal

[:internal]

[{:process 0, :type :ok, :f :txn, :value [[:w :x 0]], :index 1}
 {:process 1, :type :ok, :f :txn, :value [[:r :x #{0}] [:w :x 1] [:r :x #{0}]], :index 3}
 {:process 2, :type :ok, :f :txn, :value [[:w :x 2] [:r :x #{0 1}]], :index 5}]

{:internal
 ({:op {:process 1, :type :ok, :f :txn, :value [[:r :x #{0}] [:w :x 1] [:r :x #{0}]], :index 3, :time -1},
   :mop [:r :x #{0}],
   :expected #{0 1}}
  {:op {:process 2, :type :ok, :f :txn, :value [[:w :x 2] [:r :x #{0 1}]], :index 5, :time -1},
   :mop [:r :x #{0 1}],
   :expected #{2}})}
:not #{:read-atomic}

G1a (aborted read)

[:G1a]

[{:process 0, :type :ok,   :f :txn, :value [[:w :x 0]], :index 1}
 {:process 0, :type :fail, :f :txn, :value [[:w :x 1]], :index 3}
 {:process 1, :type :ok,   :f :txn, :value [[:r :x #{0 1}]], :index 5}]

{:G1a
 ({:writer {:process 0, :type :fail, :f :txn, :value [[:w :x 1]], :index 3, :time -1},
   :readers #{{:process 1, :type :ok, :f :txn, :value [[:r :x #{0 1}]], :index 5, :time -1}},
   :read-of-failed [:x #{1}]})}
:not #{:read-committed}

G1b (intermediate read)

[:G1b]

[{:process 0, :type :ok, :f :txn, :value [[:w :x 0] [:w :x 1]], :index 1}
 {:process 1, :type :ok, :f :txn, :value [[:r :x #{0}]], :index 3}]

{:G1b
 ({:writer {:process 0, :type :ok, :f :txn, :value [[:w :x 0] [:w :x 1]], :index 1},
   :readers #{{:process 1, :type :ok, :f :txn, :value [[:r :x #{0}]], :index 3}},
   :missing [:x #{1}]})}
:not #{:read-committed}

Type Sanity

All read/write objects in the history are checked for type sanity.

For a grow only set:

keys are integers
set elements are integers

Throws an exception as a type sanity violation invalidates the test.

Issues, Impedance, and Friction with Adya

language has evolved over time
not a 1-to-1 mapping between the definition of Causal Consistency, its components, and Adya's models

Lost Update Anomaly

Lost update is a violation of Consistent View yet is a valid Causal history.

The update isn't lost, it's eventually and consistently merged.

; Adya's Hlost: r1 (x0, 10) r2(x0 , 10) w2(x2 , 15) c2 w1(x1 , 14) c1
;               [x0 << x2 << x1 ]
[{:process 1 :type :invoke :value [[:r :x nil] [:w :x 14]] :f :txn}
 {:process 2 :type :invoke :value [[:r :x nil] [:w :x 15]] :f :txn}
 {:process 2 :type :ok     :value [[:r :x 10]  [:w :x 15]] :f :txn}
 {:process 1 :type :ok     :value [[:r :x 10]  [:w :x 14]] :f :txn}]

Challenges of Checking Causal Consistency

Clients can exhibit a variety of behaviors that hide replication failures. Their local view remains consistent until they are forced to merge with the system as a whole.

Isolationist

only reads/writes from/to local database
does not replicate local changes
does not listen to other client's updates

Selfish

does not replicate local changes
does listen to other client's updates

Cliquish

only replicates to a subset of clients
only listens to a subset of other client's updates

A final read followed by an additional check for Strong Convergence is necessary to expose these anomalies.

Strong Convergence

Workload:

generate a random mixture of reads and writes across all clients
let database quiesce
each client does a final read of all keys

Check:

all nodes have ok final reads
final reads contain all ok writes
no unexpected read values
final reads are equal for all nodes

Fault Injection

Jepsen faults are real faults:

kill (-9) the sync service on each node
- clients continue to read/write to the database
- sync service restarted

Less of a fault, and more indicative of normal behavior.

In a local first environment, clients will be coming and going in all manner at all times.

But What About On Verifying Causal Consistency (POPL'17) ?!?

Paraphrasing from the abstract:

checking whether one single execution is causally consistent is NP-complete

verifying whether all the executions of an implementation are causally consistent is undecidable

for a read-write memory abstraction, these negative results can be circumvented if the implementations are data independent, use differentiated histories

Bouajjani, A., Enea, C., Guerraoui, R., & Hamza, J. (2017). On verifying causal consistency. ACM SIGPLAN Notices, 52(1), 626–638. doi:10.1145/3093333.3009888

The authors also introduce a new vocabulary for different levels of causal consistency and "bad patterns" that define them.

Working through the examples in the paper, it appears that using a grow only set with a differentiated history also provides the efficiency gains exclusively using a much simpler graphing convention.

It also reduces the newly introduced consistency levels and "bad patterns" into the more common colloquial definitions and anomalies of Causal Consistency:

Examples from the paper as a grow only set:

read your writes
- (a) CM but not CCv
- (c) CC but not CM nor CCv
writes follow reads
- (b) CCv but not CM
writes follow reads or monotonic reads
- (e) not CC (nor CM, nor CCv)
making an argument that "sequentially consistent" is outside the common meaning of Causal Consistency
- (d) CC, CM and CCv but not sequentially consistent

Experiences of trying to write a checker for a last write wins register using only graphing confirms the paper's assertions that it is hard and expensive.

Elle Consistency Model Graph Changes

Look for strong-session-PL-2+:

Opts to Configure Elle for Causal Consistency

(def causal-opts
  {:consistency-models [:strong-session-consistent-view] ; Elle's strong-session with Adya's Consistent View(PL-2+)
   :anomalies          [:internal]                       ; basic hygiene to read your writes in a transaction
   :anomalies-ignored  [:lost-update]                    ; `lost-update`s are causally Ok, but they are PL-2+, Adya 4.1.3 ?!?
   })

Testing ElectricSQL

The test will be developed using ElectricSQL:

transactional causal+ consistency
local first
active/active SQLite3/PostgreSQL CRDT based sync
strong research team

Some oh so preliminary results.

Name		Name	Last commit message	Last commit date
Latest commit History 228 Commits
.github		.github
doc		doc
docker		docker
electricsql		electricsql
local-sqlite3		local-sqlite3
lxd		lxd
src/causal		src/causal
test/causal		test/causal
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
project.clj		project.clj

License

nurturenature/jepsen-causal-consistency

Folders and files

Latest commit

History

Repository files navigation

Jepsen Tests for Causal Consistency

Uses Elle, Jepsen's Transactional Consistency Checker

Adopts Adya's Consistent View(PL-2+) as the base consistency model

Modifies Elle's consistency model graph

Building the Directed Acyclic Graph of Causal Dependencies

Adya Anomalies Expressed

Read Your Writes

Monotonic Writes

Monotonic Reads

Writes Follow Reads

write -> read

Internal

G1a (aborted read)

G1b (intermediate read)

Type Sanity

Issues, Impedance, and Friction with Adya

Lost Update Anomaly

Challenges of Checking Causal Consistency

Isolationist

Selfish

Cliquish

Strong Convergence

Fault Injection

But What About On Verifying Causal Consistency (POPL'17) ?!?

Elle Consistency Model Graph Changes

Opts to Configure Elle for Causal Consistency

Testing ElectricSQL

About

Topics

Resources

License

Stars

Watchers

Forks

Languages