
SSTables C* 2.2 Format

hagrid-the-developer edited this page Feb 13, 2018 · 12 revisions

The Cassandra 2.2 format is very similar to the 2.1 format. It differs in file names (la format) and in a fix for an issue with parallel flushes of the same SSTable (lb format); that issue is related to the commit log.

la format

Related C* issue: https://issues.apache.org/jira/browse/CASSANDRA-7443. This change appeared in version 2.2.0.

The goal of this change was to add support for different file names and for more SSTable formats. The first table format, big (BigTable), was introduced. The format appears to represent the storage engine, so in theory there can be more storage engines.

Old file names had the form keyspace-table-version-generation-componentname (for example xyz-events-ia-1-Data.db). New file names have the form version-generation-format-componentname (for example lb-1-big-Data.db).
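The two naming schemes can be illustrated with a small sketch. This is not Cassandra's or Scylla's code: the helper names (format_old_name, format_new_name, parse_new_name) are hypothetical, and the parser assumes that no field contains a hyphen and ignores special cases such as temporary files.

```python
from typing import NamedTuple


class NewStyleName(NamedTuple):
    """Fields of a 2.2-style (la/lb) SSTable component file name."""
    version: str      # e.g. "la" or "lb"
    generation: int   # e.g. 1
    fmt: str          # e.g. "big"
    component: str    # e.g. "Data.db"


def format_old_name(keyspace: str, table: str, version: str,
                    generation: int, component: str) -> str:
    # Pre-2.2 layout: keyspace-table-version-generation-componentname
    return f"{keyspace}-{table}-{version}-{generation}-{component}"


def format_new_name(version: str, generation: int, fmt: str,
                    component: str) -> str:
    # 2.2 layout: version-generation-format-componentname
    return f"{version}-{generation}-{fmt}-{component}"


def parse_new_name(file_name: str) -> NewStyleName:
    # Assumption: version, generation and format contain no hyphen,
    # so the first three hyphens separate the four fields.
    parts = file_name.split("-", 3)
    if len(parts) != 4:
        raise ValueError(f"not a new-style SSTable file name: {file_name!r}")
    version, generation, fmt, component = parts
    return NewStyleName(version, int(generation), fmt, component)


if __name__ == "__main__":
    print(format_old_name("xyz", "events", "ia", 1, "Data.db"))
    print(format_new_name("lb", 1, "big", "Data.db"))
    print(parse_new_name("lb-1-big-Data.db"))
```

Note that the keyspace and table no longer appear in the new file name itself, so a reader of new-style names has to obtain them from elsewhere (the sketch does not attempt this).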

All source references below are to C* 2.2.x.

BigFormat’s sources are in src/java/org/apache/cassandra/io/sstable/format/big. The base type for SSTable formats is in src/java/org/apache/cassandra/io/sstable/format/SSTableFormat.java.

The file src/java/org/apache/cassandra/io/sstable/Descriptor.java implements the SSTable descriptor, which stores the SSTable’s format, version, column family, keyspace, whether the table is temporary, and so on. The method Descriptor.appendFileName() generates the file name.

The file src/java/org/apache/cassandra/io/sstable/format/Version.java contains various abstract methods that describe properties of a particular SSTable format and version (e.g. whether it has new file names), along with helper methods such as equals() and validate(), the latter for validating version strings.

lb format

Related C* issue: https://issues.apache.org/jira/browse/CASSANDRA-9669.

There was a subtle issue related to the commit log that shows up in the following scenario:

  1. A memtable of an SSTable is being flushed.
  2. In the meantime, another memtable is created and C* starts to flush it too.
  3. Flush of the second memtable finishes before the flush of the first.
  4. Server crashes before the first memtable’s flush is finished.
  5. The commit log (CL) records in the first commit log are then not replayed and are lost.
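The scenario above can be mimicked with a toy model. This is a deliberately simplified sketch, not Cassandra's replay code: it assumes commit log records are numbered positions, that each flushed SSTable records the highest position it covers, and that replay skips everything up to the largest such position. All names (records_to_replay, replay_position, and so on) are hypothetical.

```python
def records_to_replay(commit_log, flushed_sstables):
    """Return the commit log positions that a restart would replay.

    Toy rule (assumption): the highest replay position found in any flushed
    SSTable acts as a watermark, and only later records are replayed.
    """
    watermark = max(
        (sstable["replay_position"] for sstable in flushed_sstables),
        default=-1,
    )
    return [pos for pos in commit_log if pos > watermark]


# Ten commit log records; the first memtable holds 0..4, the second 5..9.
commit_log = list(range(10))
first_memtable = [0, 1, 2, 3, 4]
second_memtable = [5, 6, 7, 8, 9]

# Steps 1-4: both flushes start, only the second one finishes, then a crash.
flushed_sstables = [{"name": "second", "replay_position": max(second_memtable)}]

replayed = records_to_replay(commit_log, flushed_sstables)
lost = [pos for pos in first_memtable if pos not in replayed]

if __name__ == "__main__":
    # Step 5: records of the first memtable sit below the watermark,
    # so they are neither on disk in an SSTable nor replayed.
    print("replayed:", replayed)
    print("lost:", lost)
```

In this model the loss comes entirely from using the out-of-order SSTable's position as a watermark; with no flushed SSTable at all, every record would be replayed.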

Scylla's commit log no longer uses SSTable replay positions as watermarks for replay, only truncation positions, so this issue should already be fixed in Scylla.

We also don't read C* commit log files, so it is not necessary to process the lb format in a special way.
