As others have already pointed out, Cassandra does not support filtering while skipping parts of the clustering key. While it is tempting to see this as a limitation, it helps to take a deeper look at why the restriction exists.
First of all, the `ALLOW FILTERING` clause already puts stress on all of the Cassandra nodes in a cluster. Because the query does not specify the partition key, every node has to process it by loading data from disk and discarding the records that don't match the provided criteria. As far as I understand, because of the way Cassandra stores data in files, a node can load just a subset of that data based on the clustering key provided in the query, but only if either all components of the clustering key are specified or only one or more of the trailing ones are omitted.
If a query "skips" parts of the clustering key, like in your example, each node would have to load pretty much everything from the file system and scan it sequentially for matches. You can imagine the consequences, even if the number of records actually matched by the filter is negligible.
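To make the prefix rule concrete, here is a minimal sketch. The table `events`, its columns, and the key layout are assumptions made for illustration only, not your actual schema.

```cql
-- Hypothetical table: names and key layout are assumptions.
CREATE TABLE events (
    a int,
    b int,
    c int,
    payload text,
    PRIMARY KEY ((a), b, c)   -- partition key: a; clustering key: (b, c)
);

-- Partition key plus a leading part of the clustering key:
-- the matching rows form one contiguous slice of a single partition.
SELECT * FROM events WHERE a = 1 AND b = 2;
SELECT * FROM events WHERE a = 1 AND b = 2 AND c = 3;

-- "Skips" b: rows with c = 3 are scattered across the partition,
-- so the whole partition has to be read and filtered.
-- Depending on the Cassandra version, this is either rejected outright
-- or accepted only with ALLOW FILTERING.
SELECT * FROM events WHERE a = 1 AND c = 3 ALLOW FILTERING;
```

Rows inside a partition are sorted by `b` first and then by `c`, which is why only restrictions on a leading part of the clustering key translate into a narrow read.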
This post explains in more detail the impact of `ALLOW FILTERING`, while this one dives deeper into the CQL `WHERE` clause in general.
Possible Solution
I'm sure that knowing about this restriction does not solve your problem of being able to query by the `c` component of the clustering key. As far as I can tell, revising the data model usually provides a better solution.
If you find yourself looking for data by `c` often, add one more table in which `c` becomes the partition key. Not only will you get all of the benefits of caching and limited data loading, but you will also limit your query to only one node. The improvement in execution time often outweighs any savings in disk space that you might get from trying to tailor a filtering query.
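As a rough sketch of that idea, assume the original table has the columns `a`, `b`, `c` and `payload`; the table name `events_by_c` and the choice of clustering columns are hypothetical and only illustrate the shape of the second table.

```cql
-- Hypothetical query table: the same data, partitioned by c.
CREATE TABLE events_by_c (
    c int,
    a int,
    b int,
    payload text,
    PRIMARY KEY ((c), a, b)   -- partition key: c; clustering key: (a, b)
);

-- The lookup by c now names the partition key, so no ALLOW FILTERING is needed.
SELECT * FROM events_by_c WHERE c = 3;
```

The trade-off is that every write has to go to both tables, so the application takes on the job of keeping them in sync in exchange for cheap reads.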