If I have a single partition with 100'000 deleted rows in one cluster, followed by a second cluster in the same partition with no deleted rows, will the performance of
SELECT * FROM example_table WHERE partition = 'that_partition' AND cluster = 'the_second_cluster';
be affected by the tombstones present in the_first_cluster?
I expect that if retrieving rows with a WHERE clause takes constant time, Cassandra will simply jump past all of the tombstones to the second cluster. However, I don't understand how the WHERE clause locates the correct rows, so I don't know whether this is the case, and I couldn't find anything online that explains it.
// Example table
CREATE TABLE example_table (
    partition TEXT,
    cluster TEXT,
    value BLOB,
    PRIMARY KEY (partition, cluster)
);
// Example layout of rows in a table
partition      | cluster            | value
that_partition | the_first_cluster  | some_value1  // Deleted, a tombstone
that_partition | the_first_cluster  | some_value2  // Deleted, a tombstone
... 99'997 more similar tombstone rows
that_partition | the_first_cluster  | some_value   // Deleted, a tombstone
that_partition | the_second_cluster | some_valueA  // Not a tombstone
that_partition | the_second_cluster | some_valueB  // Not a tombstone
... no tombstones in the_second_cluster