0

So I have a table that looks along the lines of:

CREATE TABLE schema.table (
  partitionkey int,
  clusterkey text,
  value bigint,
  PRIMARY KEY ((partitionkey), clusterkey)
);

We rely on this table to sort correctly by the clusterkey for pagination.
Problem is: when returning results back from cassandra, it looks like they are sorted based on their ASCII value rather than a logical A-Z sorting. - which makes sense programatically but not logically for a person viewing it.

So a query of:
SELECT clusterkey FROM schema.table WHERE partitionkey = 1 ORDER BY clusterkey ASC;
get the results on the left, when I expect something like the one on the right

---------(current)---                ----------(wanted)---
          clusterkey                           clusterkey 
---------------------                ---------------------
       Other capital                      a should be top 
     Starts capital1                        Other capital 
     Starts capital2                      Starts capital1 
     Starts capital3                      starts capital1 
     Starts capital4                      Starts capital2 
        YYYYYYYYYYYY                      Starts capital3
        ZZZZZZZZZZZZ                      Starts capital4 
     a should be top                         YYYYYYYYYYYY 
     starts capital1                         ZZZZZZZZZZZZ 
        zzzzzzzzzzzz                         zzzzzzzzzzzz

I understand that we could change the data to be all lowercase / UPPERCASE to sort properly but this would change the look of the data. - Which is definitely not wanted.

Is there an option to change the method of the current clustering order?
- or another way to sort logically?

ChiMo
  • 581
  • 2
  • 13
  • 32

2 Answers2

1

No, there is no way to get it sorted case-insensitive except making the clustering column lower/upper-cased. If you concerned about look of the data, you can always keep not modified text in the additional column (yes, it will duplicate the data), and show it to user, instead of this clustering column.

Alex Ott
  • 80,552
  • 8
  • 87
  • 132
1

Clustering columns are sorted in the lexicographic order which means uppercase characters have precedence over lowercase.

Case-insensitive ordering is not supported by Cassandra in current releases but there is an enhancement requested in Cassandra community Provide a locale/collation-aware text comparator so far it didn't get much attention from the developer community.

Gangadhar Kairi
  • 470
  • 4
  • 11