Is the performance impacted if I provide only the partition key while querying a table containing both partition key and clustering key?

For example, for a table with partition key p1 and clustering key c1, would

SELECT * FROM table1 where p1 = 'abc';

be less efficient than

SELECT * FROM table1 where p1 = 'abc' and c1 >= 'some range start value' and c1 <= 'some range end value';

My goal is to fetch all rows with p1 = 'abc'.

tourniquet_grab
  • 1
    It depends on how wide the partition is for the given partition key. If the 'abc' partition has 1000 rows, it has to fetch all of those rows into memory, to the driver node, etc. However, if your goal is to fetch all the rows, why have a predicate on the clustering column at all? It would be unnecessary. – Praneeth Gudumasu Jul 30 '18 at 11:50
  • 1
    Just to clarify, it will only read a number of rows equal to the fetch size (which can be set per query). It doesn't bring the entire partition into memory. The clustering key can still be useful for ordering within a partition, even when reading all of it. – Chris Lohfink Jul 30 '18 at 14:50
  • @PraneethGudumasu You are right. I don't need a predicate on the clustering column. However, I was wondering whether providing the start and end of the clustering key range would improve performance. – tourniquet_grab Jul 31 '18 at 14:25

1 Answer

The main cost of going to a particular row versus a particular partition is the extra work of deserializing the clustering key index at the beginning of the partition. The following post is a bit old and based on Thrift, but the gist of it remains true:

http://thelastpickle.com/blog/2011/07/04/Cassandra-Query-Plans.html (note: row level bloom filter was removed)

When reading from the beginning of a partition you can save a little work, which will improve latency.

I wouldn't worry too much about it as long as your queries do not span multiple partitions. In that case you will generally only have issues if the partitions grow to hundreds of MB or several GB in size.
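
To make the difference concrete, here is a rough toy model in Python. It is only a sketch and not how Cassandra is implemented: the partition is a plain sorted list, the names `PARTITION_ABC`, `read_whole_partition` and `read_slice` are made up for illustration, and the `bisect` calls merely stand in for the extra clustering index lookup described above. It shows that a slice covering the whole partition returns the same rows as the partition-key-only query, just with an extra lookup step first, and that rows come back one page of `fetch_size` at a time either way.

```python
from bisect import bisect_left, bisect_right

# Toy model (an assumption for illustration): one partition as a list of
# (clustering_key, value) pairs, kept sorted by clustering key.
PARTITION_ABC = [("c%03d" % i, "row-%d" % i) for i in range(10)]
KEYS = [ck for ck, _ in PARTITION_ABC]


def read_whole_partition(fetch_size):
    """Models WHERE p1 = 'abc': start at the first row, yield pages of rows."""
    for start in range(0, len(PARTITION_ABC), fetch_size):
        yield PARTITION_ABC[start:start + fetch_size]


def read_slice(lower, upper, fetch_size):
    """Models WHERE p1 = 'abc' AND c1 >= lower AND c1 <= upper.

    The two bisect calls stand in for the extra index lookup needed to
    locate the slice bounds before any rows are returned.
    """
    lo = bisect_left(KEYS, lower)
    hi = bisect_right(KEYS, upper)
    for start in range(lo, hi, fetch_size):
        yield PARTITION_ABC[start:min(start + fetch_size, hi)]


full = [row for page in read_whole_partition(fetch_size=4) for row in page]
bounded = [row for page in read_slice("c000", "c009", fetch_size=4) for row in page]
print(full == bounded)  # prints True: same rows, the slice just did extra lookup work
```

If the goal is every row in the partition, the bounded version buys nothing in this model, which matches the advice to leave the clustering predicate off.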

Chris Lohfink