According to the documentation, Cassandra removes tombstones during compaction. It seems that certain conditions must be met before a tombstone can actually be removed:
- gc_grace_seconds has passed since the deletion.
- The compaction includes both the SSTable that contains the tombstone and the other SSTables that contain the records the tombstone targets.
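To make my understanding of these two conditions concrete, here is a minimal Python sketch. It is only a toy model, not Cassandra's actual code: `SSTable`, `oldest_write`, and `can_purge_tombstone` are names I made up, timestamps are plain seconds, and I treat the tombstone's write timestamp as its deletion time.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class SSTable:
    """Toy stand-in for an SSTable (hypothetical, not a Cassandra class)."""
    name: str
    # partition key -> oldest write timestamp stored for that key, in seconds
    oldest_write: Dict[str, int] = field(default_factory=dict)


def can_purge_tombstone(key: str, tombstone_ts: int, now: int,
                        gc_grace_seconds: int,
                        sstables_outside: Iterable[SSTable]) -> bool:
    """Return True if the tombstone could be dropped by the current compaction."""
    # Condition 1: gc_grace_seconds must have elapsed since the deletion.
    if now - tombstone_ts < gc_grace_seconds:
        return False
    # Condition 2: no SSTable left out of this compaction may still hold
    # data for the key that is older than (shadowed by) the tombstone.
    for sstable in sstables_outside:
        ts = sstable.oldest_write.get(key)
        if ts is not None and ts <= tombstone_ts:
            return False
    return True


DAY = 86_400  # seconds per day; the values below are made up for illustration
old = SSTable("old", {"A": 0})
print(can_purge_tombstone("A", 32 * DAY, 60 * DAY, 10 * DAY, [old]))  # False
print(can_purge_tombstone("A", 32 * DAY, 60 * DAY, 10 * DAY, []))     # True
```

In this model the tombstone survives whenever shadowed data sits in an SSTable that is not part of the compaction, even after gc_grace_seconds.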
Now let's assume the following situation:
- The table's primary key consists only of the partition key.
- The table uses SizeTieredCompactionStrategy.
- A record for partition A was created on 2023-01-01 and lives in sstable_1, which is 200MB in size.
- Partition A was deleted on 2023-02-02, and its tombstone lives in sstable_2, which is 20MB in size.
- Time passed and the tombstone expired (that is, gc_grace_seconds has elapsed).
- A compaction was triggered, but it did not merge these two SSTables (sstable_1, sstable_2) together because of the difference in size. Instead, the compaction merged sstable_2 with other SSTables of similar size.
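To show why I expect sstable_1 and sstable_2 to end up in different compactions, here is a rough Python sketch of size-tiered bucketing as I understand it. This is a simplification, not the real implementation: the function `stcs_buckets` is my own, the defaults mirror what I believe are the `bucket_low`, `bucket_high`, and `min_sstable_size` defaults (0.5, 1.5, 50MB), `min_threshold` is not modeled, and sstable_3 to sstable_5 with their sizes are hypothetical.

```python
from typing import Dict, List


def stcs_buckets(sizes_mb: Dict[str, float], bucket_low: float = 0.5,
                 bucket_high: float = 1.5,
                 min_sstable_size_mb: float = 50) -> List[List[str]]:
    """Group SSTables into buckets of similar size (simplified sketch)."""
    buckets: List[List[tuple]] = []  # each bucket is a list of (name, size)
    # Walk the SSTables from smallest to largest.
    for name, size in sorted(sizes_mb.items(), key=lambda kv: kv[1]):
        for bucket in buckets:
            avg = sum(s for _, s in bucket) / len(bucket)
            # Similar size: within [bucket_low, bucket_high] of the bucket average.
            similar = avg * bucket_low < size < avg * bucket_high
            # Very small SSTables are all grouped together.
            both_small = size < min_sstable_size_mb and avg < min_sstable_size_mb
            if similar or both_small:
                bucket.append((name, size))
                break
        else:
            # No existing bucket fits, so start a new one.
            buckets.append([(name, size)])
    return [[name for name, _ in bucket] for bucket in buckets]


sizes = {
    "sstable_1": 200,  # holds the record for partition A
    "sstable_2": 20,   # holds the tombstone for partition A
    "sstable_3": 18,   # hypothetical neighbours of similar size
    "sstable_4": 22,
    "sstable_5": 25,
}
for bucket in stcs_buckets(sizes):
    print(bucket)
```

With these sizes, sstable_2 is grouped with the three small SSTables, while sstable_1 sits alone in its own bucket, so the tombstone and the data it shadows are never compacted together.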
In this situation, I guess the tombstone of partition A will stay alive, because the older data remains in another SSTable even though gc_grace_seconds has passed. Is that right?
I have one more question. If that is right, it means Cassandra is able to tell whether another SSTable holds data older than the tombstone. So why doesn't Cassandra use this ability to merge these two SSTables (sstable_1, sstable_2) regardless of the compaction strategy?