I'm running a DSE 4.6.5 cluster (Cassandra 2.0.14.352). Following DataStax's guidelines, on every machine I separated the data directory from the commit log/saved caches directories:

  • data is on blazing-fast drives
  • commit log and saved caches are on the system drives: 2 HDDs in RAID 1

Monitoring the disks with OpsCenter while performing intensive writes, I see no issue with the former; however, the queue size on the latter (commit log) averages around 300 to 400 requests, with spikes up to 700. Of course, latency is also fairly high on these drives ...

Is this affecting the performance of my cluster? Would you recommend putting the commit log and saved caches on an SSD, separate from the system disks?

Thanks.

Edit - Adding tpstats from one of the nodes:

[root@dbc4 ~]# nodetool tpstats
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0          15938         0                 0
RequestResponseStage              0         0      154745533         0                 0
MutationStage                     1         0      306973172         0                 0
ReadRepairStage                   0         0            253         0                 0
ReplicateOnWriteStage             0         0              0         0                 0
GossipStage                       0         0         340298         0                 0
CacheCleanupExecutor              0         0              0         0                 0
MigrationStage                    0         0              0         0                 0
MemoryMeter                       1         1          36284         0                 0
FlushWriter                       0         0          23419         0               996
ValidationExecutor                0         0              0         0                 0
InternalResponseStage             0         0              0         0                 0
AntiEntropyStage                  0         0              0         0                 0
MemtablePostFlusher               0         0          27007         0                 0
MiscStage                         0         0              0         0                 0
PendingRangeCalculator            0         0              7         0                 0
CompactionExecutor                8        10           7400         0                 0
commitlog_archiver                0         0              0         0                 0
HintedHandoff                     0         1            222         0                 0

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
PAGED_RANGE                  0
BINARY                       0
READ                         0
MUTATION                 49547
_TRACE                       0
REQUEST_RESPONSE             0
COUNTER_MUTATION             0

Edit 2 - sar output:

04:10:02 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
04:10:02 PM     all     22.25     26.33      1.93      0.48      0.00     49.02
04:20:01 PM     all     23.23     26.19      1.90      0.49      0.00     48.19
04:30:01 PM     all     23.71     26.44      1.90      0.49      0.00     47.45
04:40:01 PM     all     23.89     26.22      1.86      0.47      0.00     47.55
04:50:01 PM     all     23.58     26.13      1.88      0.53      0.00     47.88
Average:        all     21.60     26.12      1.71      0.56      0.00     50.01

1 Answer


Monitoring disks with OpsCenter while performing intensive writes, I see no issue with the first,

Cassandra persists writes in memory (the memtable) and in the commit log (on disk).

When the memtables grow past a size threshold, or when you trigger it manually, Cassandra writes their contents to disk as SSTables (it flushes the memtables).
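
The flush thresholds and commit log settings live in cassandra.yaml. A quick way to inspect them, assuming a DSE package-install path (adjust for your layout):

# setting names are from the Cassandra 2.0 cassandra.yaml
grep -E 'memtable_total_space_in_mb|commitlog_directory|commitlog_sync' \
    /etc/dse/cassandra/cassandra.yaml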

To make sure your setup is capable of handling your workload, try manually flushing all your memtables with

nodetool flush

on a node, or just a specific keyspace/table with

nodetool flush [keyspace] [columnfamily]

At the same time, monitor your disks' I/O, for example as sketched below.
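
A minimal sketch, assuming hypothetical keyspace/table names and the sysstat tools installed:

# flush in the background, then watch per-device I/O while it runs
nodetool flush my_keyspace my_table &
iostat -x 1 30    # high avgqu-sz/await/%util on the commit log device signals a bottleneck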

If you see high I/O wait, you can either spread the workload by adding more nodes or switch the drives to better ones with higher throughput.

Keep an eye on dropped mutations (they can come from other nodes sending writes/hints) and on blocked flush writers.
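
One way to keep watch, a sketch using nodetool and standard tools:

# poll thread pool stats every 5 seconds; FlushWriter "All time blocked"
# and the dropped MUTATION counter are the lines to watch
watch -n 5 "nodetool tpstats | egrep 'FlushWriter|MUTATION'"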

I see the queue size from the latter (commit log) averaging around 300 to 400 with spikes up to 700 requests.

Those will probably be your writes to the commit log. Is your hardware serving anything else? Is it software RAID? Do you have swap disabled?
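
Quick ways to answer the last two questions on a Linux box:

cat /proc/swaps     # header only (no entries) means no active swap
cat /proc/mdstat    # lists Linux software RAID (md) arrays, if any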

Cassandra works best alone :) So yes, at the very least put the commit log on a separate (it can be smaller) disk.
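
A sketch of relocating the commit log to a dedicated disk; the mount point and service name here are assumptions, so adjust for your setup:

nodetool drain                          # flush memtables; the node stops accepting writes
service dse stop                        # assumed DSE package service name
mkdir -p /mnt/ssd/commitlog
chown cassandra:cassandra /mnt/ssd/commitlog
# point commitlog_directory in cassandra.yaml at /mnt/ssd/commitlog, then:
service dse start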

Luís Correia
  • Thanks for your answer. Hardware is dedicated to Cassandra. Flushing manually seems to work fine, but I already see a lot of flushing in the logs anyway. Regarding CPU iowait, it does not move much (around 0.5%). When load is higher, I do sometimes see messages about mutations: `INFO [ScheduledTasks:1] 2015-04-15 16:21:43,229 MessagingService.java (line 875) 349 MUTATION messages dropped in last 5000ms` I'm not sure what this means ... – StormySeaSailor Apr 15 '15 at 14:43
  • About the causes of dropped mutations, take a look at [this guide](http://www.pythian.com/blog/guide-to-cassandra-thread-pools/#MutationStage). It walks through the events that originate mutations and what to monitor. Be sure to cross-reference client requests with Cassandra responses (e.g. the time of an insert with the logs of the node that dropped the mutations). – Luís Correia Apr 16 '15 at 14:11