How select elements for every userId from table

Question

I have create table user in cassandra

 create table users (pcId int , userId int, friendId int,  age int, score int , name text, PRIMARY KEY (pcId, userId, friendId, score))

and I insert data:

 INSERT INTO users(pcid , userid , score , friendid ,  age ,  name) Values (1, 1, 2, 1, 12, 'l');
 INSERT INTO users(pcid , userid , score , friendid ,  age ,  name) Values (1, 1, 2, 2, 12, 'a');
 INSERT INTO users(pcid , userid , score , friendid ,  age ,  name) Values (1, 1, 2, 0, 15, 'p');
 INSERT INTO users(pcid , userid , score , friendid ,  age ,  name) Values (1, 2, 6, 1, 15, 'p');
 INSERT INTO users(pcid , userid , score , friendid ,  age ,  name) Values (1, 2, 7, 2, 15, 'p');

1 |      1 |        0 |     2 |  15 |    p
1 |      1 |        1 |     2 |  12 |    l
1 |      1 |        2 |     2 |  12 |    a
1 |      2 |        1 |     6 |  15 |    p
1 |      2 |        2 |     7 |  15 |    p

My question is: How can I select for every userid 2 friends(friendid, age, name, score) sorted by name?

My result should be:

1 |      1 |        2 |     2 |  12 |    a
1 |      1 |        1 |     2 |  12 |    l
1 |      2 |        1 |     6 |  15 |    p
1 |      2 |        2 |     7 |  15 |    p

add name in the primary key after userid, cassandra by default sort by clustering key column — Ashraful Islam, Dec 01 '16 at 18:09
@AshrafulIslam I can not add name in the primary Key because is an updatable field — kaio, Dec 02 '16 at 12:33

score 1 · Answer 1 · answered Dec 01 '16 at 18:46

1

If you are using Cassandra 3.6+ you can use PER PARTITION LIMIT 2 on your query:

SELECT * FROM users PER PARTITION LIMIT 2;

and it will give you the first two rows of each partition in the order specified by your clustering key inside your partition. You'll probably need to sort by your partition key if it is important for you.

That means you need to change your partition key to be your userid of course.

Here's the reference to the SELECT page.

answered Dec 01 '16 at 18:46

xmas79

5,060
2
14
35

I must have users by pcid in the same node so the partition Key must be pcid not userid. @xmas79 – kaio Dec 02 '16 at 12:36
@kaio I don't understand your requirement, and on the contrary, you should avoid that behavior in general... BTW, by leaving `pcid` as your only partition key you will be able to query by the remaining clustering key, in exactly the order you specified in the table definition **only **: `userId`, `friendId`, `score`. No way to get what you want from this table without recurring to application code. You can easily obtain what you want by adding my table definition to yours, denormalizing your data. You'll need to take care of populating/updating both tables. – xmas79 Dec 02 '16 at 14:40

Ashraful Islam · Answer 2 · 2016-12-03T17:42:07.703

You can use Materialized View
It's introduced in cassandra 3.0

In Cassandra 3.0 and later, a materialized view is a table that is built from another table's data with a new primary key and new properties. In Cassandra, queries are optimized by primary key definition. Standard practice is to create the table for the query, and create a new table if a different query is needed. Until Cassandra 3.0, these additional tables had to be updated manually in the client application. A materialized view automatically receives the updates from its source table.

Create a materialized view like this one.

CREATE MATERIALIZED VIEW users_friend AS
    SELECT *
    FROM test.users
    WHERE name IS NOT NULL AND 
    pcid IS NOT NULL AND 
    userid IS NOT NULL AND
    friendid IS NOT NULL AND 
    score IS NOT NULL
    PRIMARY KEY (pcid, userid, name, friendid, score)
    WITH CLUSTERING ORDER BY (userid ASC, name ASC, friendid ASC, score ASC)

Now if you insert data on users table, cassandra will copy data to users_friend
Now if you want friend list with order by their name use below query :

SELECT * FROM users_friend WHERE pcid = 1 and userid = 1

You will get the below output :

 pcid | userid | name | friendid | score | age
------+--------+------+----------+-------+-----
    1 |      1 |    a |        2 |     2 |  12
    1 |      1 |    l |        1 |     2 |  12
    1 |      1 |    p |        0 |     2 |  15

If you update userid = 1, friendid = 2 name

UPDATE users SET name = 'm' WHERE pcid = 1 AND userid = 1 and friendid = 2 and score = 2

You will get

 pcid | userid | name | friendid | score | age
------+--------+------+----------+-------+-----
    1 |      1 |    l |        1 |     2 |  12
    1 |      1 |    m |        2 |     2 |  12
    1 |      1 |    p |        0 |     2 |  15

Source : https://docs.datastax.com/en/cql/3.3/cql/cql_using/useCreateMV.html
More : https://cassandra-zone.com/materialized-views/

ok with materialized view we can order by name but the problem persist a gain because the sort must be for every userid have the list of friend — kaio, Dec 06 '16 at 08:20

How select elements for every userId from table

2 Answers2