How can we find large partitions on our Cassandra cluster before they are reported in system.log? We are facing performance issues because of them. Can anyone help? We run Cassandra 2.0.11 and 2.1.16.
2 Answers
You can look at the output of nodetool tablestats (or nodetool cfstats in older versions of Cassandra). For every table it includes the line Compacted partition maximum bytes together with other information, as in this example, where the maximum partition size is about 268 MB:
Table: table_name
SSTable count: 2
Space used (live): 147638509
Space used (total): 147638509
.....
Compacted partition minimum bytes: 43
Compacted partition maximum bytes: 268650950
Compacted partition mean bytes: 430941
Average live cells per slice (last five minutes): 8256.0
Maximum live cells per slice (last five minutes): 10239
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1
.....
But nodetool tablestats reports information for the current node only, so you'll need to run it on every node of the cluster.
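Since the check has to be repeated on every node, a small script can help. Below is a minimal sketch in Python that scans the text printed by nodetool cfstats / tablestats and lists the tables whose Compacted partition maximum bytes exceeds a threshold. The function names and the 100 MiB default threshold are my own assumptions, and the regular expressions assume the Keyspace: / Table: line format shown above, which may differ between Cassandra versions. It only tells you which tables hold large partitions, not the partition keys.

```python
import re
import subprocess

# Patterns assume lines such as "Keyspace: ks", "Table: name" and
# "Compacted partition maximum bytes: 123" (layout may vary by version).
KEYSPACE_RE = re.compile(r"^\s*Keyspace\s*:\s*(\S+)")
TABLE_RE = re.compile(r"^\s*Table\s*:\s*(\S+)")
MAX_RE = re.compile(r"^\s*Compacted partition maximum bytes\s*:\s*(\d+)")


def large_partition_tables(stats_text, threshold_bytes=100 * 1024 * 1024):
    """Return (keyspace, table, max_bytes) tuples above the threshold, largest first."""
    keyspace = table = None
    hits = []
    for line in stats_text.splitlines():
        m = KEYSPACE_RE.match(line)
        if m:
            keyspace, table = m.group(1), None
            continue
        m = TABLE_RE.match(line)
        if m:
            table = m.group(1)
            continue
        m = MAX_RE.match(line)
        if m and table is not None:
            size = int(m.group(1))
            if size > threshold_bytes:
                hits.append((keyspace, table, size))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


def fetch_stats(host, command="cfstats"):
    """Run nodetool against one node (assumes nodetool is on PATH and JMX is reachable)."""
    result = subprocess.run(
        ["nodetool", "-h", host, command],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

To use it, call large_partition_tables(fetch_stats(host)) for each host in your own list of node addresses, passing command="tablestats" on versions that have it.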
Update: You can find the largest partitions using other tools:
- https://github.com/tolbertam/sstable-tools has a describe command that shows the largest/widest partitions. This command will also be available in Cassandra 4.0.
- For DataStax products, the DSBulk tool supports counting partitions.

- Thanks. Yes, but we need the keys of those partitions. Is there a command to find them? – LetsNoSQL Dec 28 '18 at 11:44
- Thanks Alex. I have seen it, but it seems to be available only for 3.x and, as I mentioned, we have 2.x. We also need the keys of the wide or large partitions so that we can handle them before they reach compaction. – LetsNoSQL Dec 28 '18 at 17:33
- Okay, will do. Thanks – LetsNoSQL Dec 29 '18 at 03:16
Try nodetool tablehistograms -- <keyspace> <table>. This command provides statistics about a table, including read/write latency, partition size, column count, and number of SSTables.
Below is example output:
Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                           (micros)      (micros)         (bytes)
50%             0.00          73.46          0.00       223875792       61214
75%             0.00          88.15          0.00       668489532      182785
95%             0.00         152.32          0.00      1996099046      654949
98%             0.00         785.94          0.00      3449259151     1358102
99%             0.00         943.13          0.00      3449259151     1358102
Min             0.00          24.60          0.00            5723           4
Max             0.00        5839.59          0.00      5960319812     1955666
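This gives a proper picture of the partition size distribution for the table. In the output above, for example, the 95th percentile partition size is about 2.0 GB (1996099046 bytes), the 98th and 99th percentiles are about 3.45 GB, and the maximum is about 5.96 GB.
Hope this helps you figure out the performance issue.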
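The Partition Size column is in raw bytes, which is hard to read at a glance. As a small illustration, here is a sketch in Python that converts the byte values from the sample output above into decimal units (1 KB = 1000 bytes). The helper name human_size is my own, not part of nodetool.

```python
def human_size(num_bytes):
    """Format a byte count using decimal units (1 KB = 1000 bytes)."""
    units = ["B", "KB", "MB", "GB", "TB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value drops below 1000,
        # or at the largest unit we know about.
        if value < 1000 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1000


# Partition Size values copied from the sample tablehistograms output.
partition_sizes = {
    "50%": 223875792,
    "75%": 668489532,
    "95%": 1996099046,   # 2.00 GB
    "99%": 3449259151,   # 3.45 GB
    "Max": 5960319812,   # 5.96 GB
}

for percentile, size in partition_sizes.items():
    print(f"{percentile:>4}  {human_size(size)}")
```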

- Thanks, but we need the keys of the particular partitions so that we can delete large partitions based on their keys. – LetsNoSQL Dec 28 '18 at 11:42
- Have you found a way to do this? I am also trying to find the key of the largest partition. – developthou Nov 18 '20 at 14:33