Proposal for multi-writer consistency and CRDT for directories #531

Open · wants to merge 7 commits into master

Conversation


@ccxcz ccxcz commented Jan 13, 2019

The document is still a work in progress, but it should be in good enough shape for a meaningful discussion. So let's get everyone's opinions in here.



@codecov-io

Codecov Report

Merging #531 into master will decrease coverage by <.01%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master     #531      +/-   ##
==========================================
- Coverage   85.29%   85.28%   -0.01%     
==========================================
  Files         152      152              
  Lines       28110    28110              
  Branches     4019     4019              
==========================================
- Hits        23977    23975       -2     
- Misses       3444     3446       +2     
  Partials      689      689
Impacted Files                        Coverage Δ
src/allmydata/mutable/servermap.py    93.7% <0%> (-0.49%) ⬇️
src/allmydata/web/filenode.py         94.24% <0%> (+0.3%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 495156d...15243a8.

@exarkun exarkun (Member) left a comment

Have had these comments sitting in my browser for at least a month. Guess they're not doing much good there.

----------

Currently Tahoe-LAFS doesn't guarantee any consistency on writes to mutable
files or directories by multiple writer nodes. This is an obstacle to
@exarkun exarkun (Member) commented

This comment is mostly about me convincing myself I know what's going on here but perhaps it will be useful to others as well.

https://tahoe-lafs.readthedocs.io/en/tahoe-lafs-1.12.1/write_coordination.html is where this non-guarantee is documented.

Unfortunately it doesn't give any details about why this is the case. My understanding is that there are two aspects to the issue:

  • Two different versions of the ciphertext shares that can be used to reconstruct the plaintext become mixed together when considered across an entire "grid". For zfec parameters n (shares required to reconstruct) and k (shares written in total), and 0 < m < k: A writes shares [0..m) of vN on storage servers [0..m), B writes shares [0..k) of vN+1 on servers [0..k), then A writes shares [m..k) of vN on servers [m..k). If k < 2n there can be fewer than n shares of either vN or vN+1 on the grid, in which case all data is lost; if k >= 2n there may be enough shares to reconstruct both vN and vN+1, and the data a client reads will depend on which storage servers it ends up ("randomly") talking to. (See the sketch after this list.)
  • The plaintext data that represents (e.g.) a directory can be accidentally rolled back if there are uncoordinated writers (A reads, B reads, B updates, A updates).
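
Here is a minimal sketch of the interleaving described in the first bullet. It is plain Python, not Tahoe-LAFS code; the function and its parameters are invented for illustration, with `required` standing in for n and `total` for k in the notation above.

```python
# Toy model of the interleaved mutable-share writes described above.
# NOT Tahoe-LAFS code: "required" plays the role of n (shares needed to
# reconstruct) and "total" the role of k (shares written across the grid).

def interleaved_write(total, required, m):
    """A writes shares [0, m) of vN, B writes shares [0, total) of vN+1,
    then A finishes by writing shares [m, total) of vN."""
    shares = {}
    for i in range(m):             # A's first batch of vN shares
        shares[i] = "vN"
    for i in range(total):         # B overwrites every share with vN+1
        shares[i] = "vN+1"
    for i in range(m, total):      # A's remaining vN shares clobber B's
        shares[i] = "vN"

    counts = {"vN": 0, "vN+1": 0}
    for version in shares.values():
        counts[version] += 1
    # Which versions still have enough shares to be reconstructed?
    return {version: count >= required for version, count in counts.items()}

# With 10 shares written, 6 required, and A interrupted after 5 shares,
# neither version is left reconstructible:
print(interleaved_write(total=10, required=6, m=5))
# -> {'vN': False, 'vN+1': False}
```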

The former of these is the more "basic" or "low-level" issue, because the mutable directory update code depends on being able to reconstruct the plaintext, and the former problem interferes with that ability. If you do happen to avoid problems at that layer, then you may encounter further problems at the "higher level" of the directory update logic - which tries to implement a kind of test-then-write logic along with a retry loop and a conflict resolution policy (roughly "add both", but read the code if you want a complete understanding, I guess).
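
For what it's worth, this is roughly the shape that test-then-write loop takes. It is a sketch only, not the actual Tahoe-LAFS directory code, and every name in it is made up.

```python
# A sketch only - NOT the actual Tahoe-LAFS directory-update code; the
# function and callback names here are invented for illustration.

def update_directory(read_version, write_if_unchanged, apply_changes,
                     max_retries=5):
    """Test-then-write with a retry loop and an 'add both' flavour of merge.

    read_version() -> (version_id, entries dict)
    write_if_unchanged(version_id, entries) -> True on success, False if the
        directory changed underneath us (the 'test' in test-then-write).
    apply_changes(entries) -> new entries dict with our edits applied; on a
        conflicting add it keeps both entries rather than overwriting one.
    """
    for _ in range(max_retries):
        version, entries = read_version()
        merged = apply_changes(dict(entries))
        if write_if_unchanged(version, merged):
            return merged
        # Another writer got in first: loop, re-read, and re-apply our edits
        # on top of their update.
    raise RuntimeError("too much write contention; giving up")
```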

So, that is all to say, yes, Tahoe-LAFS indeed makes no guarantees in this space, even for directory updates, even though it contains some code to try to make concurrent directory updates not break as often.

@exarkun exarkun (Member) commented

But actually I'm not sure about the former point. The way server selection works for mutable share updates might be different from what I expect; I should probably go look at it.

Labels: None yet
Projects: None yet
4 participants