
Repair Scylla vs Cassandra

Tzach Livyatan edited this page Jan 18, 2016 · 1 revision

At a very high level, C* repair works as follows:

  • the source node calculates a Merkle tree per vnode
  • the result (root) is sent to each target node holding a replica of this vnode
  • the differences are calculated, and if required, the nodes stream the differing data to each other
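The steps above can be sketched as a toy Merkle comparison. This is a minimal illustration, not Cassandra's actual implementation; the names `build_tree` and `diff_leaves` are invented for this example, and a real repair descends the tree from the root instead of comparing all leaves directly.

```python
# Toy sketch of Merkle-tree-based repair (illustrative, not Cassandra's API).
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaf_hashes):
    """Build a Merkle tree bottom-up; returns a list of levels, root level last."""
    levels = [list(leaf_hashes)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        nxt = [h(prev[i] + prev[i + 1]) if i + 1 < len(prev) else prev[i]
               for i in range(0, len(prev), 2)]
        levels.append(nxt)
    return levels

def diff_leaves(tree_a, tree_b):
    """Return indices of leaves whose hashes differ (trees of equal shape)."""
    if tree_a[-1] == tree_b[-1]:          # roots match: replicas are in sync
        return []
    return [i for i, (a, b) in enumerate(zip(tree_a[0], tree_b[0])) if a != b]

# Two replicas of 4 partitions; replica B holds stale data in partition 2.
replica_a = [b"p0:v1", b"p1:v1", b"p2:v2", b"p3:v1"]
replica_b = [b"p0:v1", b"p1:v1", b"p2:v1", b"p3:v1"]
tree_a = build_tree([h(p) for p in replica_a])
tree_b = build_tree([h(p) for p in replica_b])
print(diff_leaves(tree_a, tree_b))  # only partition 2 needs streaming
```

The key property repair relies on is that a single root comparison answers "are these replicas identical?" without shipping the data itself; only mismatching subtrees need further inspection.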

C* has two variants of repair: parallel and sequential.

  • In the first, the process above is executed from the source node to all target nodes - in parallel
  • In the second, the process above is executed from the source node to target nodes one after the other - sequentially

In particular, in the parallel variant all Merkle trees are calculated at the same time. The motivation for running in sequential mode is to mitigate the risk of loading all nodes in the cluster at once. This matters because building a Merkle tree is a heavy operation.
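The load trade-off between the two modes can be seen in a toy scheduling sketch. The timings are simulated with sleeps, purely as an assumption standing in for the expensive tree build; nothing here is measured from Cassandra.

```python
# Toy contrast of sequential vs parallel repair scheduling.
import time
from concurrent.futures import ThreadPoolExecutor

NODES = ["replica-1", "replica-2", "replica-3"]

def build_merkle(node):
    time.sleep(0.1)          # stand-in for the expensive Merkle tree build
    return f"root({node})"

def sequential_repair():
    # one replica at a time: total time is longer, but peak load is one build
    return [build_merkle(n) for n in NODES]

def parallel_repair():
    # all replicas at once: faster overall, but every node is loaded together
    with ThreadPoolExecutor(max_workers=len(NODES)) as pool:
        return list(pool.map(build_merkle, NODES))

start = time.time(); sequential_repair(); seq = time.time() - start
start = time.time(); parallel_repair(); par = time.time() - start
print(f"sequential ~{seq:.1f}s, parallel ~{par:.1f}s")
```

The same roots come back either way; the modes differ only in how the cluster-wide load is spread over time.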

Scylla has a similar process:

  • the source node splits each vnode into small partition ranges
  • a checksum is calculated per range
  • the checksums are sent back to the source node
  • the differences are calculated, and if required, the nodes stream the differing ranges to each other
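The steps above can be sketched as a direct per-range checksum comparison. This is a minimal illustration under assumed helpers (`split_ranges`, `ranges_to_stream` are invented names), not Scylla's internals.

```python
# Toy sketch of range-checksum repair (illustrative, not Scylla's internals).
import hashlib

def range_checksum(rows):
    """Checksum one partition range (order-sensitive streaming hash)."""
    d = hashlib.sha256()
    for key, value in rows:
        d.update(key.encode()); d.update(value.encode())
    return d.hexdigest()

def split_ranges(rows, size):
    """Split a vnode's sorted rows into small ranges of `size` partitions."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]

def ranges_to_stream(source, target, size):
    """Compare per-range checksums; return indices of mismatched ranges."""
    src = [range_checksum(r) for r in split_ranges(source, size)]
    tgt = [range_checksum(r) for r in split_ranges(target, size)]
    return [i for i, (a, b) in enumerate(zip(src, tgt)) if a != b]

source = [(f"k{i}", "v1") for i in range(8)]
target = [(f"k{i}", "v1") for i in range(8)]
target[5] = ("k5", "v0")                         # one stale row on the target
print(ranges_to_stream(source, target, size=2))  # only the range holding k5
```

Note there is no tree at all: each small range maps to a single checksum, which is exactly the "bottom layer" comparison described below.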

Scylla repair always works in parallel: all target nodes are repaired at the same time.

Compared to Cassandra, Scylla repair is much simpler:

  • No calculation of the full Merkle tree, just the "bottom layer" ranges.
  • No saving of Merkle trees, and no need for a coarse range division.
  • The Scylla I/O scheduler makes sure repair will not affect CQL or other operations, so running a parallel repair is safe at run time.

Comparing range checksums, rather than Merkle trees, might result in streaming more data between nodes: a single stale row forces streaming its whole range. In practice, this is mitigated by choosing the right range size, and does not have a significant effect.
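A rough worst-case calculation shows why range size matters. The row counts here are invented for illustration; the point is only the shape of the trade-off, not real Scylla figures.

```python
# Toy worst-case over-streaming estimate: one stale row forces streaming
# its whole range, so smaller ranges waste less (figures are assumptions).
ROWS = 1_000_000          # rows in a vnode (assumed figure)
STALE = 10                # stale rows, assumed scattered across ranges

for range_size in (100_000, 10_000, 1_000):
    n_ranges = ROWS // range_size
    # worst case: every stale row lands in a distinct range
    streamed = min(STALE, n_ranges) * range_size
    print(f"range_size={range_size:>7}: stream up to {streamed:>9} rows")
```

With 1,000-row ranges, the worst case streams 10,000 rows instead of 1,000,000 with 100,000-row ranges; shrinking the range bounds the overhead, at the cost of computing more checksums.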
