Does Cassandra trigger a compaction (STCS or LCS) based on the number of deletions in an sstable? In LCS, as far as I know, Cassandra compacts sstables into the next level only when a level is full. But a deletion record is usually small. If only the sstable size is considered when deciding whether a level is full, it may take a long time for a tombstone to be reclaimed.

I know RocksDB can trigger compaction based on the number of deletions in sstables, which helps to reduce tombstones.

1 Answer

Yes, Cassandra's compaction can be triggered by the number of deletions (a.k.a. tombstones).

Have a look at the common options shared by all compaction strategies, and specifically this parameter:

tombstone_threshold

How much of the sstable should be tombstones for us to consider doing a single sstable compaction of that sstable.

See doc here: https://cassandra.apache.org/doc/latest/cassandra/operating/compaction/index.html
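As a rough illustration of how this option could be set, here is a minimal Python sketch that renders an `ALTER TABLE` statement carrying the tombstone-related compaction sub-options. The helper name `tombstone_compaction_cql`, the keyspace `my_ks` and the table `my_table` are made up for the example, and the values are not recommendations. Note that `ALTER TABLE ... WITH compaction` replaces the whole compaction map, so the strategy class has to be included as well.

```python
def tombstone_compaction_cql(
    keyspace,
    table,
    strategy="LeveledCompactionStrategy",
    tombstone_threshold=0.2,
    tombstone_compaction_interval=86400,
):
    """Render a CQL statement that sets tombstone-related compaction options.

    Hypothetical helper: it only builds a string, it does not talk to a cluster.
    """
    options = {
        "class": strategy,
        # Ratio of droppable tombstones above which a single-sstable
        # compaction is considered for that sstable.
        "tombstone_threshold": str(tombstone_threshold),
        # Minimum sstable age in seconds before it is considered.
        "tombstone_compaction_interval": str(tombstone_compaction_interval),
    }
    rendered = ", ".join(f"'{key}': '{value}'" for key, value in options.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{rendered}}};"


# Example keyspace/table names are placeholders.
statement = tombstone_compaction_cql("my_ks", "my_table", tombstone_threshold=0.1)
print(statement)
```

The printed statement can be pasted into cqlsh after replacing the placeholder names.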

Jeremy
  • Thanks Jeremy for the answer. The option is used for single-sstable compaction only, so it can only reclaim tombstones in the highest level, right? In my case there are 2 files: the older sstable 1 is 100M and holds 25600 records, and the newer sstable 2 is 2M and holds 25600 deletion records. I would like a compaction to be triggered to reclaim the space. tombstone_threshold will not work in this case, right? – user20621111 Jan 19 '23 at 00:02
  • You are right, it will only trigger a single-sstable compaction, and in the case you described all the droppable tombstones are in ranges that overlap with other sstables. So I do not think there is a good way to do this with Cassandra. Depending on the compaction strategy you are using, have a look at its other parameters to make compaction triggering more aggressive, such as min_threshold for STCS. – Jeremy Jan 19 '23 at 21:57
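
To make the two-sstable scenario from the comments concrete, here is a toy Python model. It is not Cassandra's actual code: the `SSTable` class and both functions are hypothetical, and it assumes gc_grace_seconds has already elapsed for every tombstone. It only illustrates why a high tombstone ratio in sstable 2 does not help while the data those tombstones shadow still lives in sstable 1, which is not part of the single-sstable compaction.

```python
from dataclasses import dataclass, field


@dataclass
class SSTable:
    """Toy stand-in for an sstable: just sets of partition keys."""
    name: str
    live_keys: set = field(default_factory=set)
    tombstone_keys: set = field(default_factory=set)


def tombstone_ratio(sstable):
    """Fraction of entries in the sstable that are tombstones."""
    total = len(sstable.live_keys) + len(sstable.tombstone_keys)
    return len(sstable.tombstone_keys) / total if total else 0.0


def purgeable_tombstones(target, others):
    """Tombstones that a compaction of `target` alone could drop.

    A tombstone must be kept while any sstable outside the compaction
    still holds data for the same key, otherwise that data would reappear.
    """
    still_shadowing = set()
    for other in others:
        still_shadowing |= target.tombstone_keys & other.live_keys
    return target.tombstone_keys - still_shadowing


keys = set(range(25600))
sstable_1 = SSTable("sstable-1", live_keys=set(keys))       # older, 25600 records
sstable_2 = SSTable("sstable-2", tombstone_keys=set(keys))  # newer, 25600 deletions

ratio = tombstone_ratio(sstable_2)
purgeable = purgeable_tombstones(sstable_2, [sstable_1])
print(ratio, len(purgeable))
```

In this model sstable 2 is well above any tombstone_threshold, yet none of its tombstones can be dropped on its own; space is only reclaimed once both sstables take part in the same compaction.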