ByteOrderedPartitioner
ByteOrderedPartitioner (BOP) is a legacy partitioner. It is included in Scylla only for users who already use BOP and cannot migrate to Murmur3Partitioner. New clusters should use Murmur3Partitioner. BOP orders partitions lexicographically by the raw bytes of the partition key, and that raw byte array determines which nodes store the partition.
The following example shows how partition keys are ordered with ByteOrderedPartitioner in Scylla:
CREATE KEYSPACE ks WITH replication = {'class':'SimpleStrategy', 'replication_factor':1};
CREATE TABLE ks.cf (a text PRIMARY KEY, b int);
INSERT INTO ks.cf (a, b) VALUES ('c1', 0);
INSERT INTO ks.cf (a, b) VALUES ('a1', 0);
INSERT INTO ks.cf (a, b) VALUES ('b1', 0);
INSERT INTO ks.cf (a, b) VALUES ('z', 0);
INSERT INTO ks.cf (a, b) VALUES ('g1', 0);
INSERT INTO ks.cf (a, b) VALUES ('1', 0);
INSERT INTO ks.cf (a, b) VALUES ('1000', 0);
INSERT INTO ks.cf (a, b) VALUES ('2', 0);
cqlsh> SELECT * from ks.cf;
 a    | b
------+---
    1 | 0
 1000 | 0
    2 | 0
   a1 | 0
   b1 | 0
   c1 | 0
   g1 | 0
    z | 0
This is the partition key to token mapping:
partition key -> token
1                0x31
1000             0x31303030
2                0x32
a1               0x6131
b1               0x6231
c1               0x6331
g1               0x6731
z                0x7a
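The query result is ordered by the raw bytes of the partition key, as expected. Note that '1000' sorts before '2' because the comparison is byte-wise, not numeric.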
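The key-to-token mapping can be reproduced outside the database. The following is a minimal Python sketch, not Scylla code: it assumes that a text partition key is encoded as UTF-8 and that the token is exactly those bytes. The helper name bop_token is hypothetical.

```python
# Partition keys from the example, in insertion order.
keys = ["c1", "a1", "b1", "z", "g1", "1", "1000", "2"]


def bop_token(key: str) -> bytes:
    """Hypothetical helper: under byte ordering the token is the raw key bytes.

    Assumes text keys are encoded as UTF-8.
    """
    return key.encode("utf-8")


# Python compares bytes objects lexicographically, byte by byte,
# which is the same ordering rule described for BOP.
ordered = sorted(keys, key=bop_token)

for key in ordered:
    print(f"{key:<16} 0x{bop_token(key).hex()}")
```

Sorting by the encoded bytes yields the same row order as the SELECT output, including '1000' before '2'.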
BOP is not recommended for the following reasons:
- Load balancing issue
BOP uses the byte array of the partition key to decide which node stores the partition. Partition keys are normally not evenly distributed, so data is not evenly distributed across nodes, and some nodes host more data than others. In theory, administrators can manually adjust node tokens to improve data distribution. In practice, this means moving node tokens to follow the actual workload, which is hard to do and adds maintenance work. When a cluster has multiple tables, different tables can have different partition keys and different data distributions, which makes it even harder to choose tokens that balance all of them.
- Hot spot issue
When an application writes a block of data with sequential partition keys, all of that data is likely to be hosted by one node, which creates a hot spot.
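Both issues can be illustrated with a small simulation. The Python sketch below is a toy model, not Scylla's placement logic. It rests on several assumptions: a hypothetical four-node ring whose tokens split the first key byte evenly, a made-up batch of sequential keys (user00000, user00001, ...), and MD5 as a stand-in for a hashing partitioner. Murmur3Partitioner uses a different hash, but the effect on distribution is similar.

```python
import bisect
import hashlib
from collections import Counter

NUM_NODES = 4

# Hypothetical ring: node i owns all keys up to and including its token.
# The tokens split the range of the first key byte into four even parts.
byte_tokens = [bytes([b]) for b in (0x40, 0x80, 0xC0, 0xFF)]


def byte_ordered_node(key: bytes) -> int:
    """Place a key by comparing its raw bytes with the node tokens."""
    # First node whose token is >= key; wrap around past the last token.
    return bisect.bisect_left(byte_tokens, key) % NUM_NODES


def hashed_node(key: bytes) -> int:
    """Place a key by hash value (MD5 used here only as a stand-in)."""
    h = int.from_bytes(hashlib.md5(key).digest()[:8], "big")
    # Map the 64-bit hash onto NUM_NODES equal ranges.
    return (h * NUM_NODES) >> 64


# A block of sequential partition keys, as an application might write.
keys = [f"user{i:05d}".encode("utf-8") for i in range(10_000)]

byte_counts = Counter(byte_ordered_node(k) for k in keys)
hash_counts = Counter(hashed_node(k) for k in keys)

print("byte-ordered placement:", dict(byte_counts))
print("hashed placement:      ", dict(hash_counts))
```

In this model every key starts with the byte 0x75 ('u'), so byte-ordered placement sends the whole batch to a single node, while hashed placement spreads it across all four. Rebalancing the byte-ordered ring would require choosing tokens that fit this specific key pattern, which is the manual maintenance burden described above.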