0

Assuming all people records are identified by a UUID and all groups are identified by a UUID. What data model would you create when you need to commonly query the list of all people in a group, and the list of all groups a person belongs to. i.e.

create table membership (
    person_uuid uuid,
    group_uuid uuid,
    joined bigint,
    primary key (person_uuid, group_uuid));

The above would optimise for querying by person, and the below would optimise for querying by group.

create table membership (
    group_uuid uuid,
    person_uuid uuid,
    joined bigint,
    primary key (group_uuid, person_uuid));

Is there a neat way to handle so you can optimally query by person_uuid and by group_uuid without having to use "allow filtering", i.e.:

select group_uuid from membership where person_uuid=?
select person_uuid from membership where group_uuid=? allow filtering

Do you just go ahead and store two copies of the data for queries in both directions, this has atomicity issues though right?

Jay
  • 19,649
  • 38
  • 121
  • 184

1 Answers1

0

@Jacob

What you can do is create secondary index on the second clustering component of primary key to be able to query on it.

create table membership (
    person_uuid uuid,
    group_uuid uuid,
    joined bigint,
    primary key (person_uuid, group_uuid));

create index on membership(group_uuid);

Of course then you'll need to add allow filtering to the query but it will be much faster than without index.

If you choose to use 2 tables index data without using secondary index, you could use atomic batch when inserting data to guarantee atomicity

doanduyhai
  • 8,712
  • 27
  • 26