
Escaping centralization #25

Open
marcelklehr opened this issue Mar 24, 2016 · 2 comments

@marcelklehr
Member

Centralization is bad.
Gulf currently fosters centralization via master-slave linking.
A solution would be to support peer-to-peer linking.

(Also see the following paper on the topic: https://smartech.gatech.edu/bitstream/handle/1853/6634/GIT-CC-98-15.pdf)

Options for p2p support:

  1. Only accept OT algorithms that satisfy transform property 2
  2. Transition from OT to NoOT/CRDT
  3. Elect a master.
  4. Use an ordering strategy

1. Transform Property 2

An OT algorithm satisfies transform property 2 when it handles pseudo-ties (also called the false-tie puzzle). A tie is when two people insert something at the same position. Ties are handled in standard (non-TP2) OT algorithms by using a tie-breaking indicator, e.g. site IDs. Pseudo-ties are ties that result only from transforming operations. Pseudo-ties are only a problem if the order in which operations are transformed and applied is not coherent/deterministic, which is usually the case in peer-to-peer scenarios. See the OT FAQ for more on this.
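To make the tie-breaking part concrete (this illustrates ordinary tie-breaking, not TP2 itself), here is a minimal sketch of how two concurrent inserts at the same position could be resolved by comparing site IDs. The names are made up for illustration and are not gulf's API:

```typescript
// Minimal sketch: inclusion-transform insert `a` against a concurrent insert `b`,
// breaking a genuine tie (same position) by site ID.
interface Insert { pos: number; text: string; siteId: number }

function transformInsert(a: Insert, b: Insert): Insert {
  if (b.pos < a.pos || (b.pos === a.pos && b.siteId < a.siteId)) {
    // `b` lands before `a` (or wins the tie), so shift `a` to the right.
    return { ...a, pos: a.pos + b.text.length };
  }
  return a; // `a` keeps its position; `b` will be shifted on the other site instead.
}
```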

Algorithms that feature TP2 solve the false-tie puzzle. The most widely accepted solution seems to be tombstones: if you delete something, you don't remove it from the model, you just mark it as deleted, leaving a tombstone. The problem is that you have to keep every tombstone in memory. There's no easy way to gradually prune older tombstones from the model and re-insert them in case they're needed. Also, the transformation logic (what to transform against what) is much more complex than it is for simple OT.
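For a feel of what tombstones mean for the data model, here is a toy sketch (assumed names, not tied to any particular TP2 algorithm) of a text where deletes only mark characters:

```typescript
// Toy tombstone sequence: a delete never removes the character, it only marks it,
// so later operations can still be interpreted against the "deleted" content.
interface Cell { char: string; deleted: boolean }

class TombstoneText {
  private cells: Cell[] = [];

  insert(visiblePos: number, char: string): void {
    this.cells.splice(this.internalIndex(visiblePos), 0, { char, deleted: false });
  }

  delete(visiblePos: number): void {
    const i = this.internalIndex(visiblePos);
    if (i < this.cells.length) this.cells[i].deleted = true; // leave a tombstone behind
  }

  toString(): string {
    return this.cells.filter(c => !c.deleted).map(c => c.char).join('');
  }

  // Map a position in the visible text onto the full sequence, tombstones included.
  private internalIndex(visiblePos: number): number {
    let visible = 0;
    for (let i = 0; i < this.cells.length; i++) {
      if (!this.cells[i].deleted) {
        if (visible === visiblePos) return i;
        visible++;
      }
    }
    return this.cells.length;
  }
}
```

The memory cost mentioned above shows up here as the cells array never shrinking: every deleted character stays in it indefinitely.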

2. Commutative Replicated Data Types

These claim to be the logical successors of OT; they go by names like NoOT, Treedoc or Logoot. The rationale is that you want append-only, commutative operations that don't need to be transformed against their history, because they can be applied in any order.

This is usually achieved by also adopting tombstones, as well as a similar concept for inserts: instead of keeping only the result of the inserts (e.g. the flat text document), the document becomes a tree of inserts. This means that history is intrinsic to the operations, as they need to remember the path to the leaf in the tree that they want to access. The whole document remembers all changes, which means you can trivially decide whether an operation is based on one that is not yet known.
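A rough, Logoot-flavoured sketch of that idea (assumed names, and deliberately ignoring how fresh identifiers are allocated between neighbours): each character carries an immutable path-like identifier, the document is just the set of entries sorted by identifier, and remote operations therefore commute:

```typescript
// Each character gets an immutable position identifier: a path of (digit, siteId)
// pairs. The document is the set of (id, char) entries sorted by id, so remote
// inserts and tombstone-deletes can be applied in any order without transformation.
type Id = Array<[digit: number, siteId: number]>;

function compareIds(a: Id, b: Id): number {
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const [da, sa] = a[i] ?? [-1, -1]; // shorter paths sort first
    const [db, sb] = b[i] ?? [-1, -1];
    if (da !== db) return da - db;
    if (sa !== sb) return sa - sb;
  }
  return 0;
}

interface Entry { id: Id; char: string; deleted: boolean }

class LogootishDoc {
  private entries: Entry[] = [];

  // Inserts commute: applying remote inserts in any order yields the same document.
  applyInsert(id: Id, char: string): void {
    const at = this.entries.findIndex(e => compareIds(id, e.id) < 0);
    const entry = { id, char, deleted: false };
    if (at === -1) this.entries.push(entry);
    else this.entries.splice(at, 0, entry);
  }

  // Delete is a tombstone, so it also commutes (and tolerates re-delivery).
  applyDelete(id: Id): void {
    const hit = this.entries.find(entry => compareIds(entry.id, id) === 0);
    if (hit) hit.deleted = true;
  }

  toString(): string {
    return this.entries.filter(e => !e.deleted).map(e => e.char).join('');
  }
}
```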

On the other hand, here the same problem arises: The whole document can become immensely memory-consuming and there's no easy way to compact it without losing history, although some authors have tried to remedy this by devising sophisticated compaction protocols.

(Todo: Read this: https://cs.kaist.ac.kr/upload_files/report/1254967150.pdf)

3. Elect a master

Electing a master that transforms incoming operations against missed ones enforces order and keeps OT control algorithms simple. In fact, the necessary transformations are equivalent to a three-way merge, as git performs every day.
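As a sketch of what that master does (hypothetical names, not gulf's actual server code): it keeps the canonical history and catches every incoming operation up on whatever it missed before appending and broadcasting it:

```typescript
// The master's job, in miniature: transform each incoming operation against every
// operation it has not seen yet, then append it to the canonical history.
interface Op { baseRev: number; data: unknown }

class Master {
  private history: Op[] = [];

  // `transform(a, b)` is whatever inclusion transformation the OT type provides.
  constructor(private transform: (a: Op, b: Op) => Op) {}

  receive(op: Op): Op {
    for (const missed of this.history.slice(op.baseRev)) {
      op = this.transform(op, missed); // catch `op` up on everything it missed
    }
    op = { ...op, baseRev: this.history.length };
    this.history.push(op);
    return op; // this canonical operation is broadcast to all peers
  }
}
```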

The downside of having a master is that it represents a single point of failure (from which it is simple to recover, though), a possible bottleneck, and an easy point for applying censorship (thus the electing parties need to trust the master, which is not always possible).

4. Use an ordering strategy

The crucial problem of OT in p2p scenarios (as you have seen above) is the arbitrary, non-deterministic order of operations, which is determined by the network: you apply edits when they arrive. So, a solution would be to find a way to impose a global order on operations in p2p scenarios.

Post-commit ordering

This strategy applies an order after the edits have been sent out (an ordering key may have been assigned earlier, but the edits aren't actually ordered until they are received).
The expense here is that the transformation logic (i.e. the OT control layer; gulf in this case) becomes more complex (the greater the latency, the greater the complexity).

Using deterministic ordering

A simple solution is to order edits by a conveniently chosen identifier, a timestamp if you will. The timestamping algorithm must ensure a causal and global (deterministic) order. A naive attempt would be datetime-siteID, but the problem is of course the accuracy of the datetime value: wall clocks on different machines disagree. This can be solved with version vectors as timestamps. The drawback of this approach is that it is only an after-the-fact ordering: when an older edit is received, all edits that have happened in the meantime have to be undone, transformed against the old edit, and reapplied (if you assume non-TP2 OT). This can be remedied by allowing peers to reject old edits, forcing the reconnecting peer to rebase. However, all peers need to consistently reject the old edit, which is hard to achieve.
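Just to make the naive datetime-siteID key concrete, here is a sketch of that comparator (assumed field names); it is deterministic on every peer, but only as trustworthy as the wall clock that produced the timestamp:

```typescript
// Naive "datetime-siteID" ordering key: timestamp first, site ID breaks exact ties.
interface EditKey { timestamp: number; siteId: string }

function compareKeys(a: EditKey, b: EditKey): number {
  if (a.timestamp !== b.timestamp) return a.timestamp - b.timestamp;
  return a.siteId < b.siteId ? -1 : a.siteId > b.siteId ? 1 : 0;
}
```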

Pre-commit ordering

Electing an ordering peer

A naive solution would be to elect a peer to order all operations (note that it is not necessary to transform them in any way, as 'Electing a master' suggests). The problem here is again that the ordering peer becomes a bottleneck and that trust needs to be placed in it.
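A sketch of such an ordering peer (hypothetical names): unlike the master in option 3, it never transforms anything; it only stamps operations with a sequence number and relays them:

```typescript
// Ordering peer: assigns sequence numbers and relays operations unchanged.
// Peers apply the relayed operations strictly in `seq` order.
class Sequencer {
  private nextSeq = 0;

  order<T>(op: T): { seq: number; op: T } {
    return { seq: this.nextSeq++, op };
  }
}
```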

Using a distributed queue algorithm

A simple and common approach is the distributed queue dual bus, but I'm not sure if it can resist manipulation by a malicious peer.

@marcelklehr
Member Author

Exploring Vector clocks

Every time we apply or create an operation, we increment our own clock. When we apply an operation by somebody else, we also update the other clocks accordingly. The ID of an operation is the current vector clock.

Legend:
A -> B: x means A sends x to B
a + b means IT(a, b) // IT transform a against b
a - b means ET(a, b) // ET transform a against b
state + a means apply(a, state)
state - a means apply(invert(a), state)
  A: clock:A0|B0|C0 history:0
  B: clock:A0|B0|C0 history:0
  C: clock:A0|B0|C0 history:0
A creates A1|B0|C0
  A: clock:A1|B0|C0 history:0,A1|B0|C0
B creates A0|B1|C0
  B: clock:A0|B1|C0 history:0,A0|B1|C0
B -> C: A0|B1|C0
C: state + A0|B1|C0
  C: clock:A0|B1|C1 history:0,A0|B1|C0
C creates A0|B1|C2
  C: clock:A0|B1|C2 history:0,A0|B1|C0,A0|B1|C2
A -> B: A1|B0|C0 // VC ordering says: A0|B1|C0 and A1|B0|C0 are independent, hence we order by site ID.
B: state - A0|B1|C0 + A1|B0|C0 + (A0|B1|C0 + A1|B0|C0)
  B: clock:A1|B2|C0 history:0,A1|B0|C0,A0|B1|C0
A -> C: A1|B0|C0 // VC ordering says: A1|B0|C0 is independent from A0|B1|C0 and A0|B1|C2, hence we order by site ID
C: state - A0|B1|C2 - A0|B1|C0 + A1|B0|C0 + (A0|B1|C0 + A1|B0|C0) + (A0|B1|C2 + ( A1|B0|C0 + (A0|B1|C0 + A1|B0|C0) ))
  C: clock:A1|B1|C3 history:0,A1|B0|C0,A0|B1|C0,A0|B1|C2
C -> A: A0|B1|C2
A: // Does this assume A1|B0|C0 already? Clearly not (A0 in the operation's ID). It assumes B1, but we're not there yet.
  A: clock:A1|B0|C0 history:0,A1|B0|C0 waiting:A0|B1|C2
B -> A: A0|B1|C0 // A0|B1|C0 and A1|B0|C0 are independent, order by site ID.
A: state + A0|B1|C0
  A: clock:A2|B1|C0 history:0,A1|B0|C0,A0|B1|C0 waiting:A0|B1|C2 // Oh, look! Now we can apply A0|B1|C2
A: state + A0|B1|C2
  A: clock:A3|B1|C2 history:0,A1|B0|C0,A0|B1|C0,A0|B1|C2
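A small sketch of the clock bookkeeping this trace uses (assumed names, not an existing gulf module): creating or applying an operation bumps our own counter, applying a remote operation also merges the sender's clock into ours, and a remote operation is only applied once everything it assumes has been seen (cf. A putting A0|B1|C2 into `waiting` above):

```typescript
// Vector-clock bookkeeping matching the rules above.
type Clock = Record<string, number>;

class Site {
  clock: Clock = {};

  constructor(public id: string, sites: string[]) {
    for (const s of sites) this.clock[s] = 0;
  }

  // Create a local operation; the returned clock is the operation's ID.
  create(): Clock {
    this.clock[this.id]++;
    return { ...this.clock };
  }

  // Apply a remote operation: merge its ID into our clock, then bump our own entry.
  apply(opId: Clock): void {
    for (const site of Object.keys(opId)) {
      this.clock[site] = Math.max(this.clock[site] ?? 0, opId[site]);
    }
    this.clock[this.id]++;
  }

  // A remote op is ready when, apart from the sender's own entry, it only assumes
  // operations we have already seen; otherwise it goes into the waiting list.
  isReady(opId: Clock, sender: string): boolean {
    return Object.keys(opId).every(
      site => site === sender || opId[site] <= (this.clock[site] ?? 0),
    );
  }
}
```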

Side note: It might work even better if we only increment our clock when changing the document ourselves. Then we can even detect messages overtaking others. I have posted a question on CS StackExchange to make sure there's no gotcha to this tweak.

Update: I've corrected some transformations.

@marcelklehr
Member Author

My feeling was correct. This adaptation is valid for changing data in a distributed manner. It's called version vectors.
