Major data structures
Botond Dénes edited this page Nov 3, 2017 · 2 revisions
A cluster is not really a data structure; it is a set of cooperating Scylla nodes. Logically, a cluster contains all of the data. Physically, the data is spread across the shards of its nodes. A cluster contains:
- Nodes
- Keyspaces: each cluster manages a number of keyspaces (which are themselves containers of tables)
A node has a 1:1 correspondence with a server. It has one IP address by which it is known. It is a unit of failure: a node can fail and recover, or be taken offline and brought back online. A node contains shards and physical storage.
- Shards: each core in a node is represented by a shard
A shard has a 1:1 correspondence with a processor core. Each shard is responsible for a number of vnodes, each representing a fraction of the possible keys stored in the cluster. A shard contains:
- Thrift server: listens for connections on the thrift protocol
- Native server: listens for connections on the native protocol
- Metadata: replicas of all metadata in the system about keyspaces and tables (their names and structures)
- Memtables: pending dirty data in the portion of the key space (set of vnodes) belonging to the shard.
- Ring: a description of the cluster topology (list of nodes, vnode<->node relationship, vnode<->key space relationship)
- Commitlog: disk files containing a log of data in memtables
- Sstables
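
The containment hierarchy described above (cluster, node, shard, and the per-shard structures) can be sketched as a toy model. This is only an illustration in Python, not Scylla's actual C++ code: every name here (`toy_token`, `TOKEN_SPACE`, `build_cluster`, `locate`, `write`) is hypothetical, the hash is not Scylla's partitioner, and the round-robin assignment of vnodes to nodes and shards is an assumption made to keep the example small. For simplicity the ring is held once on the cluster object, whereas the list above places it in each shard.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from hashlib import md5

TOKEN_SPACE = 1024  # toy token space; an assumption for this sketch


def toy_token(key: str) -> int:
    """Map a key to a token. Illustrative hash only, not Scylla's partitioner."""
    return int.from_bytes(md5(key.encode()).digest()[:4], "big") % TOKEN_SPACE


@dataclass
class Shard:
    """One per processor core; responsible for a set of vnodes."""
    core: int
    vnodes: set = field(default_factory=set)      # vnode tokens owned by this shard
    memtable: dict = field(default_factory=dict)  # pending dirty data
    commitlog: list = field(default_factory=list) # log of the data in the memtable
    sstables: list = field(default_factory=list)  # left empty in this sketch


@dataclass
class Node:
    """One per server; known by a single IP address."""
    ip: str
    shards: list


@dataclass
class Ring:
    """Topology: vnode<->node and vnode<->key space relationships."""
    tokens: list = field(default_factory=list)  # sorted vnode tokens
    owner: dict = field(default_factory=dict)   # vnode token -> Node

    def vnode_for(self, key: str) -> int:
        # Assumption: a vnode covers the keys whose token is <= its own token,
        # wrapping around to the first vnode past the end of the ring.
        i = bisect_left(self.tokens, toy_token(key))
        return self.tokens[i % len(self.tokens)]


@dataclass
class Cluster:
    nodes: list
    keyspaces: dict  # keyspace name -> table names
    ring: Ring


def build_cluster(ips, cores_per_node, vnodes_per_node):
    """Create nodes with one shard per core and spread vnodes round-robin."""
    nodes = [Node(ip, [Shard(c) for c in range(cores_per_node)]) for ip in ips]
    ring = Ring()
    total = len(nodes) * vnodes_per_node
    for n in range(total):
        token = n * TOKEN_SPACE // total  # evenly spaced, already sorted
        node = nodes[n % len(nodes)]
        shard = node.shards[(n // len(nodes)) % cores_per_node]
        shard.vnodes.add(token)
        ring.tokens.append(token)
        ring.owner[token] = node
    return Cluster(nodes, {}, ring)


def locate(cluster, key):
    """Resolve key -> vnode -> node -> shard."""
    vnode = cluster.ring.vnode_for(key)
    node = cluster.ring.owner[vnode]
    shard = next(s for s in node.shards if vnode in s.vnodes)
    return node, shard


def write(cluster, key, value):
    """Record a write in the owning shard's commitlog and memtable."""
    node, shard = locate(cluster, key)
    shard.commitlog.append((key, value))
    shard.memtable[key] = value
    return node, shard
```

For example, `build_cluster(["10.0.0.1", "10.0.0.2"], cores_per_node=2, vnodes_per_node=4)` yields two nodes with two shards each, and `write(cluster, "user:42", "alice")` lands the value in the memtable and commitlog of exactly one shard. The sketch ignores replication, flushing memtables to sstables, and the protocol servers.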