As jny indicated, data duplication, denormalization and query-based modeling are keys to building good Cassandra data models. If I wanted to take your tables above, and build a table to support address/contact queries based-on country, I could do it like this:
First, I'll create a user defined type for the contact's address.
aploetz@cqlsh:stackoverflow> CREATE TYPE contactAddress (
... street varchar,
... city varchar,
... zip_code varchar,
... country varchar);
Next, I'll create a table called UserContactsByCountry
to store user contact info, as well as any user contact addresses:
aploetz@cqlsh:stackoverflow> CREATE TABLE UserContactsByCountry (
... country varchar,
... user_id uuid,
... type varchar,
... email varchar,
... firstname varchar,
... lastname varchar,
... photo blob,
... imagelength int,
... note varchar,
... addresses map<text, frozen <contactAddress>>,
... PRIMARY KEY ((country),user_id));
A couple of things to note here:
- I am using
country
as a partitioning key for querying, and addding user_id
as a clustering key for uniqueness.
- Technically,
country
is being stored multiple in each row. Once as the partiiton key, and again with each address. Note that the country
partition key is the one which allows us to run our query.
- I assume that user contacts can have multiple addresses, so I'll store them in a map of type text (varchar), contactAddress (type created above).
Next, I'll insert three user contacts, each with two addresses, two from the USA and one from Great Britain.
aploetz@cqlsh:stackoverflow> INSERT INTO usercontactsbycountry (country, user_id, type, email, firstname, lastname, note, addresses)
VALUES ('USA',uuid(),'Tech','brycelynch@network23.com','Bryce','Lynch','Head of R&D at Network 23',{'work':{street:'101 Big Network Drive',city:'New York',zip_code:'10023',country:'USA'},'home':{street:'8192 N. 42nd St.',city:'New York',zip_code:'10025',country:'USA'}});
aploetz@cqlsh:stackoverflow> INSERT INTO usercontactsbycountry (country, user_id, type, email, firstname, lastname, note, addresses)
VALUES ('USA',uuid(),'Reporter','edisoncarter@network23.com','Edison','Carter','Reporter at Network 23',{'work':{street:'101 Big Network Drive',city:'New York',zip_code:'10023',country:'USA'},'home':{street:'76534 N. 62nd St.',city:'New York',zip_code:'10024',country:'USA'}});
aploetz@cqlsh:stackoverflow> INSERT INTO usercontactsbycountry (country, user_id, type, email, firstname, lastname, note, addresses)
VALUES ('GBR',uuid(),'Reporter','theorajones@network23.com','Theora','Jones','Controller at Network 23',{'work':{street:'101 Big Network Drive',city:'New York',zip_code:'10023',country:'USA'},'home':{street:'821 Wembley St.',city:'London',zip_code:'W11 2BQ',country:'GBR'}});
Now I can query that table for all user contacts in the USA:
aploetz@cqlsh:stackoverflow> SELECT * FROM usercontactsbycountry WHERE country ='USA';
country | user_id | addresses | email | firstname | imagelength | lastname | note | photo | type
---------+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+-------------+----------+---------------------------+-------+----------
USA | 2dee94e2-4887-4988-8cf5-9aee5fd0ea1e | {'home': {street: '8192 N. 42nd St.', city: 'New York', zip_code: '10025', country: 'USA'}, 'work': {street: '101 Big Network Drive', city: 'New York', zip_code: '10023', country: 'USA'}} | brycelynch@network23.com | Bryce | null | Lynch | Head of R&D at Network 23 | null | Tech
USA | b92612dd-dbaa-42f2-8ff2-d36b6c601aeb | {'home': {street: '76534 N. 62nd St.', city: 'New York', zip_code: '10024', country: 'USA'}, 'work': {street: '101 Big Network Drive', city: 'New York', zip_code: '10023', country: 'USA'}} | edisoncarter@network23.com | Edison | null | Carter | Reporter at Network 23 | null | Reporter
(2 rows)
There are probably other ways in which this could be modeled, but this is one that I hoped to use to help you understand some of the techniques available.