New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bad estimation for temporary sstables (containing GCed data only) causes worst case of 2x space amplification for filters #18283
Labels
Backport candidate
backport/5.2
Issues that should be backported to 5.2 branch once they'll be fixed
backport/5.4
Issues that should be backported to 5.4 branch once they'll be fixed
backport/6.0
Milestone
Comments
raphaelsc
added
backport/5.2
Issues that should be backported to 5.2 branch once they'll be fixed
backport/5.4
Issues that should be backported to 5.4 branch once they'll be fixed
backport/6.0
labels
Apr 17, 2024
/cc @denesb |
lkshminarayanan
added a commit
to lkshminarayanan/scylladb
that referenced
this issue
Apr 29, 2024
…sstables When a compaction strategy uses garbage collected sstables to track expired tombstones, do not use complete partition estimates for both the regular output sstables and the temporary sstables. Instead split the partition estimate between the two based on the droppable tombstone ratio estimate. Fixes scylladb#18283 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
lkshminarayanan
added a commit
to lkshminarayanan/scylladb
that referenced
this issue
Apr 29, 2024
…sstables When a compaction strategy uses garbage collected sstables to track expired tombstones, do not use complete partition estimates for both the regular output sstables and the temporary sstables. Instead split the partition estimate between the two based on the droppable tombstone ratio estimate. Fixes scylladb#18283 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
lkshminarayanan
added a commit
to lkshminarayanan/scylladb
that referenced
this issue
May 3, 2024
When a compaction strategy uses garbage collected sstables to track expired tombstones, do not use complete partition estimates for them, instead, use a fraction of it based on the droppable tombstone ratio estimate. Fixes scylladb#18283 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
lkshminarayanan
added a commit
to lkshminarayanan/scylladb
that referenced
this issue
May 3, 2024
When a compaction strategy uses garbage collected sstables to track expired tombstones, do not use complete partition estimates for them, instead, use a fraction of it based on the droppable tombstone ratio estimate. Fixes scylladb#18283 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
lkshminarayanan
added a commit
to lkshminarayanan/scylladb
that referenced
this issue
May 6, 2024
When a compaction strategy uses garbage collected sstables to track expired tombstones, do not use complete partition estimates for them, instead, use a fraction of it based on the droppable tombstone ratio estimate. Fixes scylladb#18283 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
lkshminarayanan
added a commit
to lkshminarayanan/scylladb
that referenced
this issue
May 8, 2024
When a compaction strategy uses garbage collected sstables to track expired tombstones, do not use complete partition estimates for them, instead, use a fraction of it based on the droppable tombstone ratio estimate. Fixes scylladb#18283 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
lkshminarayanan
added a commit
to lkshminarayanan/scylladb
that referenced
this issue
May 9, 2024
When a compaction strategy uses garbage collected sstables to track expired tombstones, do not use complete partition estimates for them, instead, use a fraction of it based on the droppable tombstone ratio estimate. Fixes scylladb#18283 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
lkshminarayanan
added a commit
to lkshminarayanan/scylladb
that referenced
this issue
May 9, 2024
When a compaction strategy uses garbage collected sstables to track expired tombstones, do not use complete partition estimates for them, instead, use a fraction of it based on the droppable tombstone ratio estimate. Fixes scylladb#18283 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
mergify bot
pushed a commit
that referenced
this issue
May 11, 2024
When a compaction strategy uses garbage collected sstables to track expired tombstones, do not use complete partition estimates for them, instead, use a fraction of it based on the droppable tombstone ratio estimate. Fixes #18283 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> (cherry picked from commit 36b98d8) # Conflicts: # compaction/compaction.cc
mergify bot
pushed a commit
that referenced
this issue
May 11, 2024
When a compaction strategy uses garbage collected sstables to track expired tombstones, do not use complete partition estimates for them, instead, use a fraction of it based on the droppable tombstone ratio estimate. Fixes #18283 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> (cherry picked from commit 36b98d8)
lkshminarayanan
added a commit
to lkshminarayanan/scylladb
that referenced
this issue
May 13, 2024
When a compaction strategy uses garbage collected sstables to track expired tombstones, do not use complete partition estimates for them, instead, use a fraction of it based on the droppable tombstone ratio estimate. Fixes scylladb#18283 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> Closes scylladb#18465 (cherry picked from commit d39adf6)
lkshminarayanan
added a commit
to lkshminarayanan/scylladb
that referenced
this issue
May 13, 2024
When a compaction strategy uses garbage collected sstables to track expired tombstones, do not use complete partition estimates for them, instead, use a fraction of it based on the droppable tombstone ratio estimate. Fixes scylladb#18283 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> Closes scylladb#18465 (cherry picked from commit d39adf6)
denesb
pushed a commit
that referenced
this issue
May 14, 2024
When a compaction strategy uses garbage collected sstables to track expired tombstones, do not use complete partition estimates for them, instead, use a fraction of it based on the droppable tombstone ratio estimate. Fixes #18283 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> Closes #18465 (cherry picked from commit d39adf6) Closes #18656
denesb
pushed a commit
that referenced
this issue
May 14, 2024
When a compaction strategy uses garbage collected sstables to track expired tombstones, do not use complete partition estimates for them, instead, use a fraction of it based on the droppable tombstone ratio estimate. Fixes #18283 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> Closes #18465 (cherry picked from commit d39adf6) Closes #18659
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Backport candidate
backport/5.2
Issues that should be backported to 5.2 branch once they'll be fixed
backport/5.4
Issues that should be backported to 5.4 branch once they'll be fixed
backport/6.0
We're incorrectly feeding same estimation for regular output sstables in compaction as for temporary sstables containing GCed-data only. That's a way of saying that the output can twice as much keys as input, which is clearly wrong. The output is segregated into either regular output or temporary.
The consequence is that during compaction filters can take twice as much ram space.
What we should do is that regular output will have input_key_estimation * estimated_live_data_ratio. And temporary will have input_key_estimation * (1 - estimated_live_data_ratio).
The text was updated successfully, but these errors were encountered: