ByteOrderedPartitioner

ByteOrderedPartitioner (BOP) is a legacy partitioner. It is included in Scylla only for users who already rely on BOP and cannot migrate to the Murmur3Partitioner; new cluster deployments should use Murmur3Partitioner. BOP orders partitions lexicographically by the raw bytes of the partition key: the byte array value of the key itself decides which nodes store the partition.

The following example shows how partition keys are ordered with ByteOrderedPartitioner in Scylla:

CREATE KEYSPACE ks WITH replication = {'class':'SimpleStrategy', 'replication_factor':1};
CREATE TABLE ks.cf (a text PRIMARY KEY, b int);
INSERT INTO ks.cf (a, b) VALUES ('c1', 0);
INSERT INTO ks.cf (a, b) VALUES ('a1', 0);
INSERT INTO ks.cf (a, b) VALUES ('b1', 0);
INSERT INTO ks.cf (a, b) VALUES ('z', 0);
INSERT INTO ks.cf (a, b) VALUES ('g1', 0);
INSERT INTO ks.cf (a, b) VALUES ('1', 0);
INSERT INTO ks.cf (a, b) VALUES ('1000', 0);
INSERT INTO ks.cf (a, b) VALUES ('2', 0);

cqlsh> SELECT * FROM ks.cf;

 a    | b
------+---
    1 | 0
 1000 | 0
    2 | 0
   a1 | 0
   b1 | 0
   c1 | 0
   g1 | 0
    z | 0

This is the partition key to token mapping:

partition key -> token
    1               0x31
 1000            0x31303030
    2               0x32
   a1              0x6131
   b1              0x6231
   c1              0x6331
   g1              0x6731
    z               0x7a

The query result shows the rows ordered lexicographically by the raw bytes of the partition key, as expected; note that '1000' sorts before '2' because tokens compare byte by byte, not numerically.
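
For illustration, the following minimal Python sketch (not Scylla's actual code) reproduces the mapping above, assuming the token is simply the UTF-8 byte array of the text key:

# Minimal sketch: a BOP token is just the raw bytes of the partition key.
# Assumes text keys are encoded as UTF-8; this is not Scylla's actual code.
for key in ['1', '1000', '2', 'a1', 'b1', 'c1', 'g1', 'z']:
    token = key.encode('utf-8')   # raw byte array of the key
    print(f'{key:>5} -> 0x{token.hex()}')

Running it prints the same key-to-token mapping as the table above.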

BOP is not recommended because:

  1. Load balancing issue

BOP uses the byte array of the partition key to decide which node stores a partition. Partition keys are usually not distributed evenly over the byte space, so the data will not be distributed evenly across nodes, and some nodes will host more data than others. In theory, administrators can adjust node tokens manually to get a better distribution. In practice, the tokens must track the actual workload, which is hard to do and imposes extra maintenance work. With multiple tables in a cluster, different tables may have different partition keys and different data distributions, making it even harder to pick tokens that balance all the tables at once (the sketch after this list illustrates the skew).

  2. Hot spot issue

When an application writes a block of data with a sequential range of partition keys, all of that data is likely hosted by a single node, creating a hot spot.
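
To make both problems concrete, here is a minimal Python sketch, not Scylla's implementation: the three-node ring, its node names, and its tokens are invented for illustration, and the ownership rule is a simplification of the real ring.

# Hypothetical 3-node ring with hand-picked byte-array tokens.
# Simplified rule: a node owns all tokens up to and including its own.
nodes = [('node1', b'\x40'), ('node2', b'\x80'), ('node3', b'\xff')]

def owner(key):
    token = key.encode('utf-8')       # BOP token: the raw key bytes
    for name, node_token in nodes:
        if token <= node_token:
            return name
    return nodes[0][0]                # wrap around the ring

# Sequential keys share a byte prefix, so every write hits one node:
for key in ['user00', 'user01', 'user02', 'user03', 'user04']:
    print(key, '->', owner(key))      # 'u' is 0x75, so all go to node2

All five writes land on node2, and keys whose first byte is above 0x80 are the only ones node3 will ever see; fixing the imbalance requires moving node tokens by hand.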
