2

I have a Cassandra cluster where gc_grace_seconds is set to 10 days. Automatic compaction is enabled and runs as configured, but I suspect that it is not clearing tombstones that are already older than gc_grace_seconds (10 days). I am planning to run a major compaction on that table, so my questions are:

1) Should I run the major compaction without changing gc_grace_seconds (10 days)?

2) Should I run the major compaction after changing gc_grace_seconds to 0?

3) If I change gc_grace_seconds to 0, does that apply only to future data, or also to existing data that was deleted under the 10-day gc_grace_seconds?

Thanks in advance.

LetsNoSQL
  • What makes you want to clear tombstones? Are you having a read latency issue because of them? I.e. treat the root cause, not the symptom. Tombstones aren't bad in themselves; it is having too many of them that causes performance problems. – markc Jan 17 '19 at 13:22
  • Yes, we are facing a read latency issue. – LetsNoSQL Jan 18 '19 at 07:57
  • Can you add to your question: the table schema, sample cfstats output for this table, whether you see tombstone warning messages in the logs, and a sample cqlsh trace? – markc Jan 18 '19 at 10:12
  • you might also want to have a look at this blog: https://academy.datastax.com/support-blog/cleaning-tombstones-datastax-dse-and-apache-cassandra – markc Jan 18 '19 at 10:14

2 Answers

1

First of all, you shouldn't set gc_grace_seconds to 0, except on a single-node cluster. Whatever period gc_grace_seconds is set to, you must run repair at least once within every such period; otherwise you risk data resurrection. That happens when one node in the cluster missed a deletion and the other nodes then drop their tombstones, so a later repair treats the surviving copy as new data and does not realize it had already been deleted. If you ever set gc_grace_seconds to 0, any data that you previously deleted may be resurrected on the next repair if it still exists on one of the replicas (because that replica missed the deletion due to some temporary problem).

So yes, the correct approach would be to run a major compaction with the original gc_grace_seconds of 10 days (and be sure to do a repair at least once every 10 days).

But you need to consider why you want to run a major compaction at all. Whether or not minor compaction can get rid of old tombstones (older than 10 days) depends on a lot of factors, such as whether you recently made other modifications to the same partition that these tombstones sit in. Unless the tombstones are causing you major problems (lots of disk space, slower reads, etc.), a major compaction might not be worthwhile. Major compaction is not free, and (at least with the size-tiered compaction strategy) afterwards all the data sits in one file, which will take even longer until it can be compacted again.
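To make the "lot of factors" point a bit more concrete, here is a rough Python model of the overlap check, not Cassandra's actual code. It assumes that a compaction only drops an expired tombstone when no sstable outside that compaction could still hold older data for the same partition. The names `SSTable` and `can_purge` are made up for this illustration, and real Cassandra also looks at memtables and other details that are left out here.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    """Hypothetical, heavily simplified stand-in for one sstable."""
    partitions: set      # partition keys present in this file
    min_timestamp: int   # oldest write timestamp anywhere in this file


def can_purge(tombstone_ts, partition, expired, sstables_outside_compaction):
    """Return True if a compaction may drop the tombstone (simplified model)."""
    if not expired:
        # Still within gc_grace_seconds: always kept.
        return False
    for sstable in sstables_outside_compaction:
        # An sstable that is not part of this compaction might still hold
        # data that the tombstone shadows, so the tombstone has to stay.
        if partition in sstable.partitions and sstable.min_timestamp <= tombstone_ts:
            return False
    return True
```

In this model a major compaction is simply the case where `sstables_outside_compaction` is empty, which is why it can drop expired tombstones that minor compactions keep carrying along.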

Nadav Har'El
  • Thanks Nadav. Yes, gc_grace_seconds=0 is harmful for a Cassandra cluster, and major compaction is not recommended either. But if I run a major compaction on a cluster where gc_grace_seconds is already set to 10 days, will it clear all the tombstones, or do I need to set it to 0 before running the major compaction? Which of the two is more helpful, and why? – LetsNoSQL Jan 19 '19 at 02:53
1

1) Should I run major compaction without changing gc_grace_seconds 10 days?

Yes. If it is set to 0, tombstones can be dropped before they have been propagated to the other nodes in the cluster, which causes inconsistent data.

3) If I am changing gc_grace_seconds 0 so is it applicable for future data or already existing data with days gc_grace_seconds as well?

If you change gc_grace_seconds, it will apply to future data as well as to the current data.
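A small Python sketch of why the change affects existing tombstones too. It assumes, as a simplification, that a tombstone only records when the delete happened and that the cutoff is worked out from the table's current gc_grace_seconds at the moment compaction runs. The function name `is_past_gc_grace` and the constant `DAY` are made up for this illustration and are not part of any Cassandra API.

```python
DAY = 24 * 60 * 60  # seconds in one day


def is_past_gc_grace(deleted_at, now, gc_grace_seconds):
    """True if a tombstone written at `deleted_at` is old enough to be dropped.

    The cutoff uses whatever gc_grace_seconds the table has when the
    compaction runs, not the value that was set when the delete happened.
    """
    gc_before = now - gc_grace_seconds
    return deleted_at < gc_before


deleted_at = 0  # the moment the delete was issued

# Compaction on day 2 with gc_grace_seconds still at 10 days: kept.
print(is_past_gc_grace(deleted_at, now=2 * DAY, gc_grace_seconds=10 * DAY))

# Compaction on day 2 after changing gc_grace_seconds to 0: eligible to be dropped.
print(is_past_gc_grace(deleted_at, now=2 * DAY, gc_grace_seconds=0))

# Compaction on day 11 with the original 10 days: eligible to be dropped.
print(is_past_gc_grace(deleted_at, now=11 * DAY, gc_grace_seconds=10 * DAY))
```

The second case is exactly the risky one: the tombstone becomes droppable before a repair has had a chance to spread it to every replica.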

If you want to clear tombstones by compacting, there are two options:

1) nodetool compact -s keyspace table

This compacts the table and, because of the -s (split output) flag, writes several sstables of roughly 50%, 25%, 12.5% and so on of the total size, rather than one large file.

2) nodetool compact --user-defined path/to/sstable

This compacts only the sstable given on the command line and clears the expired tombstones from it.

Nama
  • As you mentioned, changing gc_grace_seconds applies to both current and future deletes. A simple question: I deleted data in Cassandra with gc_grace_seconds set to 10 days. Cassandra created tombstones and is waiting for gc_grace_seconds to expire. Now I run a major compaction on the 2nd day after the deletion (8 days still remain before gc_grace_seconds expires). 1) What will the compaction do? Will it clear the tombstones for data deleted 2 days ago or not? 2) Should I change gc_grace_seconds to 0 before running the major compaction? – LetsNoSQL Feb 28 '19 at 03:51
  • Compaction keeps the most recently updated version of the data and discards all older updates (since sstables are immutable, they still contain the other updates of the same data). Even if you run compaction within gc_grace_seconds, the tombstones will not be deleted. Tombstones are only deleted once they have expired, i.e. once gc_grace_seconds has passed. So if you run compaction on the 2nd day, the tombstones will not be deleted; they will be deleted if you run compaction after 10 days. You can change gc_grace_seconds to 0, but in a multi-node cluster that leads to inconsistent data, so it is not advisable. – Nama Feb 28 '19 at 05:24
  • What happens if I change gc_grace_seconds to 0 on the 2nd day and then run a major compaction? Will it clear all the tombstones for data deleted 2 days ago? – LetsNoSQL Feb 28 '19 at 08:48