I've been researching this for some time and found that problems with ordering data in Cassandra are not uncommon, but I still can't figure out why the results of my SELECT are not ordered the way I expect. Here is my table creation query:

CREATE TABLE library.query1 (
    id int,
    gender text,
    surname text,
    email text,
    addinfo text,
    endid int,
    name text,
    phone int,
    PRIMARY KEY ((id), gender, surname, email)
) WITH CLUSTERING ORDER BY (gender DESC, surname DESC, email DESC);

As the clustering order implies, I want my data ordered by gender, then surname, then email.

I then import the data via CSV, since it comes from PostgreSQL tables. Here's the SELECT I'm using:

SELECT id, gender, name, surname, phone, email
FROM library.query1;

Am I missing something in the query that is needed for the ordering to apply, or is my data model wrong?

Mateus Wolkmer
  • Your data is ordered only within each partition. In your case the partition key is id. – Alex Tbk Jun 24 '18 at 06:55
  • Can I include the other columns in the partition key? I've tried setting the primary key to (id, gender, surname, email), but that didn't seem to work either. I also found out I might have to include 'id' in a WHERE clause for the rows to be ordered, but that failed as well. – Mateus Wolkmer Jun 24 '18 at 07:07

1 Answer

You could partition by gender instead, so that, for example, all male users end up in one partition. The ordering then works within each partition:

CREATE TABLE library.query1 (
    id int,
    gender text,
    surname text,
    email text,
    addinfo text,
    endid int,
    name text,
    phone int,
    PRIMARY KEY (gender, surname, email)
) WITH CLUSTERING ORDER BY (surname DESC, email DESC);
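
To see the ordering, the query has to read a single partition. A minimal sketch, assuming the table above and a hypothetical gender value 'M' (the actual values in the data are not shown in the question):

```cql
-- Restricting the partition key returns rows in clustering order
-- (surname DESC, then email DESC, as declared in the table).
-- 'M' is an assumed value for illustration.
SELECT id, gender, name, surname, phone, email
FROM library.query1
WHERE gender = 'M';

-- The declared order can be reversed as a whole within one partition.
SELECT id, gender, name, surname, phone, email
FROM library.query1
WHERE gender = 'M'
ORDER BY surname ASC, email ASC;
```

Without a WHERE clause on the partition key, partitions come back in token order (a hash of the partition key), so rows are sorted inside each partition but the partitions themselves are not. Note also that id is no longer part of the primary key here, so two rows with the same gender, surname and email would overwrite each other.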
Alex Tbk
  • Yes, it actually worked. Can you explain why the rows come back ordered as desired when partitioning by 'gender', but nothing came back ordered at all when 'id' was the partition key? – Mateus Wolkmer Jun 24 '18 at 07:21
  • When id is the partition key, a separate partition is created for each id. Each partition then contains only one row, because every id is different, so there is nothing to sort inside it. That's why the clustering order had no visible effect. – Alex Tbk Jun 24 '18 at 07:58
  • Oh, I totally see it now. Thank you! – Mateus Wolkmer Jun 24 '18 at 08:02
  • This is not a very good design - you'll end up with very large partitions, because `gender` has only a few possible values. – Alex Ott Jun 24 '18 at 09:12
  • This is correct; my answer is meant as an example, so he can understand sorting within partitions. If we are talking about a lot of users, you could add another column such as registerWeek to the partition key to keep the partitions smaller. – Alex Tbk Jun 24 '18 at 09:24
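
The registerWeek idea from the last comment could look roughly like the sketch below. It assumes the column goes into the partition key, since a clustering column alone would not make the partition smaller. The table name `query1_by_week`, the column `register_week` and its format are hypothetical and not part of the original schema; id is added as the last clustering column so that rows with the same surname and email do not overwrite each other.

```cql
-- Hypothetical variant: bucket each gender by registration week
-- so that no single partition grows without bound.
CREATE TABLE library.query1_by_week (
    gender text,
    register_week text,   -- assumed bucket format, e.g. '2018-W25'
    surname text,
    email text,
    id int,
    name text,
    phone int,
    addinfo text,
    endid int,
    PRIMARY KEY ((gender, register_week), surname, email, id)
) WITH CLUSTERING ORDER BY (surname DESC, email DESC, id DESC);

-- Both partition key columns must be restricted; rows then come back
-- ordered by surname DESC, email DESC, id DESC within that bucket.
SELECT id, gender, name, surname, phone, email
FROM library.query1_by_week
WHERE gender = 'M' AND register_week = '2018-W25';
```

The trade-off is that ordering now holds per (gender, register_week) bucket only. A result sorted across several weeks would have to be merged on the client side.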