Say I have the following table and secondary indices defined:
CREATE TABLE ItemUpdates (
time timestamp,
item_name text,
item_context text,
item_descr text,
tags map<text, int>,
PRIMARY KEY ((time, item_name, item_context))
);
CREATE INDEX ItemUpdateByName
ON ItemUpdates(item_name);
CREATE INDEX ItemUpdateByContext
ON ItemUpdates(item_context);
CREATE INDEX ItemUpdateByTag
ON ItemUpdates(KEYS(tags));
General background information on the data model: an item has a unique name within a context, so (item_name, item_context) is a natural key for items. Tags have some value associated with them.
A natural query in my application is "show me all updates on item X with a certain tag". This translates to:
SELECT * FROM ItemUpdates
WHERE item_name='x'
AND item_context='a'
AND tags CONTAINS KEY 't';
When I try some queries I notice that although the cluster uses the Murmur3Partitioner, the results come ordered by time. This makes sense when you consider that Cassandra stores secondary indices as wide rows, and that colums are ordered by their name.
(1) Does Cassandra always return rows sorted by partition key when selecting on a(n) (set of) indexed column(s)?
The reason I find this interesting is that other natural queries in my application include:
- fetch all updates on item X, since date D
- fetch the 300 most recent updates on item X
What surprises me is that adding a clause ORDER BY time DESC
to my select statement on ItemUpdates results in an error message "ORDER BY with 2ndary indexes is not supported."
(2) (How) can I do a range query on the partition key when I narrow the query by selecting on an indexed column?