
As we know from Cassandra's documentation [Link to doc], the partitioner should distribute data evenly across the nodes to avoid read hotspots. Cassandra offers several partitioning algorithms for this: Murmur3Partitioner, RandomPartitioner, and ByteOrderedPartitioner.

Murmur3Partitioner is Cassandra's default partitioner. It hashes the partition key into a 64-bit token in the range -2^63 to +2^63-1. My question is this: different data sets have different partition keys. For example, one table might use a uuid as its partition key, another might use first name and last name, another a timestamp, and yet another a city name.
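To make the hashing step concrete, here is a minimal Python sketch of the idea: a deterministic hash maps each partition key to a token in the signed 64-bit range. MD5 is used purely as a stand-in hash, not Cassandra's actual Murmur3 implementation, and the sample key values are made up.

```python
import hashlib

# Toy illustration of how a partitioner maps partition keys onto the token
# ring: a deterministic hash into the signed 64-bit range -2**63 .. 2**63 - 1.
# MD5 is a stand-in here, NOT Cassandra's Murmur3 implementation.
TOKEN_MIN, TOKEN_MAX = -2**63, 2**63 - 1

def token_for(partition_key: str) -> int:
    """Hash the partition key into the signed 64-bit token range."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)

for key in ["5f2b3c1e-uuid-example", "John Doe", "2022-01-17T14:13:00", "Chicago"]:
    token = token_for(key)
    assert TOKEN_MIN <= token <= TOKEN_MAX
    print(f"{key!r} -> token {token}")
```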

Now assume a data set with city as the partition key, let's say:

- Node 1 stores Houston data
- Node 2 stores Chicago data
- Node 3 stores Phoenix data, and so on...

Now suppose that at some point our data set receives far more entries for Chicago than for the other cities. Node 2 will then hold the majority of the records in the database and will become a hotspot. In this scenario, how will Cassandra manage to distribute the data evenly across these nodes?

khush

1 Answer


In short, it doesn't. The partitioner is a deterministic hash function, so the same partition key value always produces the same hash value, and therefore the same position on the ring, every time. If you design a data model where 80% of the data has the same partition key, then 80% of the data will sit on the same 3 nodes (assuming RF 3).
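As a rough illustration of that determinism, here is a toy Python sketch, assuming a three-node ring and a simplified modulo placement instead of Cassandra's real ring walk and replication strategy; the hash is a stand-in, not Murmur3:

```python
import hashlib

def token_for(key: str) -> int:
    # Stand-in deterministic hash (not real Murmur3).
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big", signed=True)

nodes = ["node1", "node2", "node3"]

def node_for(key: str) -> str:
    # Simplified placement: real Cassandra walks the ring and applies the
    # replication strategy; a plain modulo is used here only for illustration.
    return nodes[token_for(key) % len(nodes)]

# The same partition key maps to the same node on every insert, so a skewed
# key like "Chicago" keeps hitting the same replicas.
assert node_for("Chicago") == node_for("Chicago")
print("Chicago ->", node_for("Chicago"))
```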

Using a partition key with high cardinality prevents this, because the values hash to many different tokens and therefore many different locations on the ring. A relatively low-cardinality value such as city is not a good partition key in any scenario beyond a very small data set.
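A toy simulation can show the difference in spread, again using a stand-in hash and a simplified three-node placement rather than Cassandra's actual token allocation:

```python
import hashlib
import uuid
from collections import Counter

def token_for(key: str) -> int:
    # Stand-in deterministic hash (not real Murmur3).
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big", signed=True)

nodes = ["node1", "node2", "node3"]

def node_for(key: str) -> str:
    return nodes[token_for(key) % len(nodes)]

# Low cardinality, heavily skewed: 100,000 rows, only three distinct cities.
cities = ["Chicago"] * 80_000 + ["Houston"] * 10_000 + ["Phoenix"] * 10_000
print("city key:", Counter(node_for(c) for c in cities))

# High cardinality: a unique uuid per row spreads the rows across all nodes.
uuids = [str(uuid.uuid4()) for _ in range(100_000)]
print("uuid key:", Counter(node_for(u) for u in uuids))
```

With the city key, 80% of the rows pile onto whichever node owns Chicago's token; with uuids, the counts come out roughly even.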

The onus is on the developer to design a data model that uses suitably high-cardinality partition keys on the larger data sets to avoid hotspots.
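One common way to do that when the natural key is low-cardinality is to add a synthetic bucket to the partition key (e.g. PRIMARY KEY ((city, bucket), event_id) in CQL) so that a hot city is spread over several partitions. A minimal sketch of the bucket derivation follows; the bucket count of 16 and the event_id column are purely illustrative assumptions, not a recommendation.

```python
import hashlib

BUCKETS = 16  # assumed bucket count, purely for illustration

def bucket_for(event_id: str) -> int:
    # Derive a stable bucket from a value that is already part of the row,
    # here a hypothetical event id.
    return int.from_bytes(hashlib.md5(event_id.encode()).digest()[:4], "big") % BUCKETS

def partition_key(city: str, event_id: str) -> tuple:
    # Corresponds to a CQL table with PRIMARY KEY ((city, bucket), event_id).
    return (city, bucket_for(event_id))

print(partition_key("Chicago", "event-0001"))
print(partition_key("Chicago", "event-0002"))  # usually a different bucket
```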

Andrew
  • What will happen if, after adding data, it is identified that the current partitioning key is not giving good performance? The current data model is creating hotspots and not giving high read throughputs. Do we need to redesign the whole model again and move records from one node to another, and won't there be a significant downtime required to handle such bad data model design? – khush Jan 17 '22 at 14:13
  • Ideally, identify this problem in advance. In terms of a new data model, you can potentially double write to both the old and new table, while migrating the historic data in the background - and only read from the old table until that data migration has completed. – Andrew Jan 17 '22 at 14:19
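For the dual-write approach Andrew describes in his comment, a minimal sketch using the DataStax Python driver might look like the following; the keyspace, table, and column names are hypothetical, and a real migration would also need the background back-fill and a read cut-over once it completes.

```python
from cassandra.cluster import Cluster

# Minimal sketch of dual-writing to the old and new tables during a data
# model migration. Keyspace, table and column names are hypothetical.
cluster = Cluster(["127.0.0.1"])
session = cluster.connect("my_keyspace")

insert_old = session.prepare(
    "INSERT INTO events_by_city (city, event_id, payload) VALUES (?, ?, ?)")
insert_new = session.prepare(
    "INSERT INTO events_by_id (event_id, city, payload) VALUES (?, ?, ?)")

def write_event(city, event_id, payload):
    # Write every new row to both tables while historic data is back-filled
    # into the new table in the background.
    session.execute(insert_old, (city, event_id, payload))
    session.execute(insert_new, (event_id, city, payload))

# Reads keep going to the old table until the back-fill has completed,
# then the application switches over to the new table.
```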