5

I have a program that reads and writes intensively (equal amounts of reads and writes; of the writes, 4/5 are updates and 1/5 are inserts). Is SizeTiered compaction better than Leveled?

Also, most of the data has a TTL of 7 days and the rest has a TTL of 1 day. In this case, is the Time Window strategy preferred?

SilentCanon

2 Answers

8

Time Window isn't a good fit because your workload includes updates. Size-tiered performs best, at the cost of more disk space usage. See the compaction strategy selection table here: https://www.scylladb.com/webinar/on-demand-webinar-best-practices-for-data-modeling/

Usually STCS is the best default.
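
For readers who want to try a strategy, here is a minimal sketch of how the table-level setting could be generated. It only builds the CQL text and does not connect to a cluster. The keyspace `my_ks`, the table `events`, and the helper name `alter_compaction_cql` are hypothetical placeholders, not anything from the question.

```python
def alter_compaction_cql(keyspace: str, table: str, strategy: str, **options) -> str:
    """Build an ALTER TABLE statement that sets a table's compaction strategy.

    `strategy` is the compaction class name; `options` are extra
    compaction sub-options passed through unchanged.
    """
    settings = {"class": strategy, **options}
    body = ", ".join(f"'{key}': '{value}'" for key, value in settings.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{body}}};"


# Hypothetical keyspace/table names, used only for illustration.
stcs = alter_compaction_cql("my_ks", "events", "SizeTieredCompactionStrategy")
print(stcs)
```

The same helper accepts `LeveledCompactionStrategy` or `TimeWindowCompactionStrategy` as the class name, so the strategies discussed here can be swapped on a test table and compared under the real workload.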

dor laor
1

LeveledCompactionStrategy is your best bet with updates like that, especially with a mixed read workload.

Chris Lohfink
  • 6
    LCS traditionally has issues with workloads that have both updates and TTLs: if your tombstones end up in the last level, the update-heavy nature of the workload will keep most of the SSTable promotions in the lower levels, and you may not be able to ever expire your tombstones. Also for 50% writes (regardless of whether this is updates or inserts), LCS will likely be very expensive. – Glauber Costa Aug 09 '19 at 19:50
  • Just in case anyone comes here with Apache Cassandra (since this is tagged cassandra): LCS (and TWCS), for Cassandra at least, is _particularly for_ heavy updates and TTLs (TWCS more for time series and TTL). STCS is the one with the problem you mention, i.e. 1 TB SSTables with obsolete data not getting compacted because updated data exists in almost all SSTables, so reads are very expensive. Single-SSTable tombstone compactions sometimes can't purge the TTLs/tombstones in the last level due to possible shadowing in other SSTables. Scylla may be different, of course, and I defer to the experts on it. – Chris Lohfink Aug 12 '19 at 19:13