7

If I define a table like this using cql:

CREATE TABLE scores (
 name text,
 age int,
 score int,
 date timestamp,
 PRIMARY KEY (name, age, score)
);

And do a SELECT in cqlsh like this:

select * from mykeyspace.scores;

The result displayed seems to always be sorted by 'age', then 'score' automatically in ascending order regardless of input-data ordering (as expected, return rows are not sorted by the partition key 'name'). I have the following questions:

  1. Does SELECT automatically sort return rows by the clustering keys?
  2. If yes, what's the purpose of using the ORDER BY clause when using SELECT?
  3. If no, how do I get the return rows to sort by the clustering keys since cql doesn't allow ORDER BY on a select *?
Anower Perves
  • 754
  • 8
  • 15
ptmoy2
  • 311
  • 1
  • 4
  • 13

1 Answers1

10

Your clustering columns define the order (in your case age then score)

http://cassandra.apache.org/doc/cql3/CQL.html#createTableStmt

On a given physical node, rows for a given partition key are stored in the order induced by the clustering columns, making the retrieval of rows in that clustering order particularly efficient (see SELECT).

http://cassandra.apache.org/doc/cql3/CQL.html#selectStmt

The ORDER BY option allows to select the order of the returned results. It takes as argument a list of column names along with the order for the column (ASC for ascendant and DESC for descendant, omitting the order being equivalent to ASC). Currently the possible orderings are limited (which depends on the table CLUSTERING ORDER):

  • if the table has been defined without any specific CLUSTERING ORDER, then the allowed orderings are the order induced by the clustering columns and the reverse of that one.
  • otherwise, the orderings allowed are the order of the CLUSTERING ORDER option and the reversed one.
Alexis Wilke
  • 19,179
  • 10
  • 84
  • 156
Mikhail Stepura
  • 3,374
  • 20
  • 16
  • 2
    Mikhail, I already read those documents. They don't really answer my questions. Given how I've defined my table, is the sorting automatic? If yes, why would I still need the ORDER BY clause in the SELECT statement (I guess I would still need it if I needed to sort in a descending order)? – ptmoy2 Feb 22 '14 at 06:27
  • 2
    Yes, the sorting is "automatic". Your data is physically stored in the sorted order, defined either by our clustering columns, or y CLUSTERING ORDER (if present). The ORDER BY allows you to switch between ASC and DESC – Mikhail Stepura Feb 22 '14 at 18:17
  • 1
    Thanks Mikhail. So since the data is sorted in ascending order automatically already, there's really no reason to use ORDER BY in the SELECT unless I want to sort in descending order - correct? – ptmoy2 Feb 23 '14 at 04:15
  • Data is stored in the order you defined, and using ORDER BY you only can reverse that order – Mikhail Stepura Feb 23 '14 at 05:01