-
Notifications
You must be signed in to change notification settings - Fork 421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: add metric for sstable compression types #3623
*: add metric for sstable compression types #3623
Conversation
888cc95
to
8b5f741
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add this to the FileMetadata
and the manifest? The default can be "unknown" but we can start populating it for new tables.
Reviewable status: 0 of 10 files reviewed, all discussions resolved (waiting on @nicktrav)
@RaduBerinde We could, but then we'd also have to shop it around with external/shared files. I felt that it's an emergent property of the blocks in the same way as a lot of other properties etc., are, so it's best to leave it there for semantic separation from the stuff that's usually in FileMetadata. I'm open to doing that if we think that's the preferred approach, though |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, just a thought.
Reviewable status: 0 of 10 files reviewed, 3 unresolved discussions (waiting on @itsbilal and @nicktrav)
metrics.go
line 609 at r1 (raw file):
humanize.Bytes.Uint64(m.VirtualSize())) w.Printf("Local tables size: %s\n", humanize.Bytes.Uint64(m.Table.Local.LiveSize)) w.Printf("Compression types: unknown: %d snappy: %d zstd: %d\n",
I think we should hide the 0 counts to make it more readable. I'd also format it as 1 snappy, 2 zstd, 3 unknown
.
table_stats.go
line 1164 at r1 (raw file):
vptr.zstd++ default: vptr.unknown++
Are we conflating NoCompression into unknown? I think we should either count it separately, or don't count it at all.
table_stats.go
line 1185 at r1 (raw file):
return } compression := v.Levels[level].Annotation(compressionTypeAnnotator{}).(*compressionTypes)
This annotator functionality is pretty cool!
Previously, RaduBerinde wrote…
By the way, I think we should explicitly put |
Previously, we didn't have a metric tracking the compression type(s) of sstables. This change adds that, as a table stat that's collected through the LevelMetadata as an annotator field. This unfortunately had to be a table stat as the compression type isn't known until we open a reader and look at the properties block. Pebble side of cockroachdb/cockroach#123952.
8b5f741
to
79e07c9
Compare
@RaduBerinde - re-pushed to address the formatting and the "no compression" case. One thing I'm noticing is that if we restart a DB, all the files will have compression type unknown, even after we do the async stats loading. That is unfortunate, as we'll only update the compression types when a table is rewritten via a compaction. I'll look at fixing that. |
@nicktrav are we seeing that in practice? The cacheable return value from the annotator should be false until the stats are marked as valid, so we shouldn’t be caching the unknowns in the tree. Sorry for not wrapping this up Wednesday! |
I think we are. For example, in this patch, look at I think the issue is that in A simple solution is to move the compression type up into that struct, but I'm having some issues getting the randomized tests to pass. |
Pushed a fix for getting the compression stats populated in an existing DB that is opened. I think we need this, else our plans to monitor the transition from one compression type to another would only take into account new sstables (flushed, compacted, ingested, etc.). Using @RaduBerinde - mind taking another pass? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 15 files reviewed, 4 unresolved discussions (waiting on @itsbilal and @nicktrav)
sstable/properties.go
line 133 at r3 (raw file):
ComparerName string `prop:"rocksdb.comparator"` // The compression options used to compress blocks. CompressionOptions string `prop:"rocksdb.compression_options"`
CompressionName and CompressionOptions belong together. I think it's ok if both live in CommonProperties
Currently, the compression type for a table is known when the SST is new \- i.e. flushed, compacted, ingested. However, when opening an existing DB, the compression types will show as "unknown". Allow the async, background stats loading job that runs on DB open to infer the compression type of existing tables by lifting the compression field from `sstable.Properties` up into `sstable.CommonProperties`. This allows `loadTableStats` to use the field and populate the `manifest.TableStats` struct.
0662fc9
to
4b8dec1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 15 files reviewed, 2 unresolved discussions (waiting on @itsbilal and @RaduBerinde)
metrics.go
line 609 at r1 (raw file):
Previously, RaduBerinde wrote…
I think we should hide the 0 counts to make it more readable. I'd also format it as
1 snappy, 2 zstd, 3 unknown
.
Done.
sstable/properties.go
line 133 at r3 (raw file):
Previously, RaduBerinde wrote…
CompressionName and CompressionOptions belong together. I think it's ok if both live in
CommonProperties
Done.
TFTR! |
Previously, we didn't have a metric tracking the compression type(s) of sstables. This change adds that, as a table stat that's collected through the LevelMetadata as an annotator field. This unfortunately had to be a table stat as the compression type isn't known until we open a reader and look at the properties block.
Pebble side of cockroachdb/cockroach#123952.