Below is the data which I need to represent in my Cassandra Column Family.

rowKey           e1_name                    e1_schemaname                   e1_last_modified_date
123             (ByteArray value)           abc (some string)               some_date

userId is the rowKey here, and e1_name, e1_schemaname and e1_last_modified_date are the columns that hold the values specific to e1. In the same way, I will have other groups of columns, such as the ones below.

 e2_name                    e2_schemaname                   e2_last_modified_date
(ByteArray value)           abc (some string)               some_date

 e3_name                    e3_schemaname                   e3_last_modified_date
(ByteArray value)           abc (some string)               some_date

Now I am wondering: is there any way to group e1_name, e1_schemaname and e1_last_modified_date into a single column e1 that stores all three as one composite value, instead of storing them separately?

If yes, can anyone help me design the column family for this? I will be using the Astyanax client to insert into the above column family.

arsenal

1 Answer

If I understood correctly, you can design the schema in the following way:

columns:

user_id, record_type, name, schemaname, last_modified_date

user_id is your row key (partition key)

record_type (e1, e2, and so on) can be the clustering key, provided e1, e2 are unique within a single user. Otherwise, your clustering key can be a composite that includes whichever additional fields are needed to make the key unique.

In the first case: PRIMARY KEY (user_id, record_type)

          e1                          e2
123 : {name, schemaname, date }  | {name, schemaname, date }
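
As a rough sketch, the first case could be written in CQL 3 as follows. The table name `user_records` and the column types are assumptions made for illustration (the question only says the value is a byte array, the schema name is a string, and the date is "some_date"), so adjust them to your data.

```cql
-- Hypothetical table name and types; column names follow the answer above.
CREATE TABLE user_records (
    user_id            text,       -- row (partition) key, e.g. '123'
    record_type        text,       -- clustering key: 'e1', 'e2', 'e3', ...
    name               blob,       -- the ByteArray value
    schemaname         text,
    last_modified_date timestamp,
    PRIMARY KEY (user_id, record_type)
);

-- Writing e1 again for the same user overwrites the previous e1 values,
-- because the primary key (user_id, record_type) is the same.
INSERT INTO user_records (user_id, record_type, name, schemaname, last_modified_date)
VALUES ('123', 'e1', 0xcafe, 'abc', '2013-09-21 23:44:00+0000');

-- Read all three values of e1 for one user.
SELECT name, schemaname, last_modified_date
FROM user_records
WHERE user_id = '123' AND record_type = 'e1';
```

With this layout, the three values for e1 are always read and written together under one clustering key, which is close to the "single composite column" asked for in the question.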

An example for the second case could be PRIMARY KEY (user_id, record_type, name)

          e1|name                    e2|name
123 : {schemaname, date }  | {schemaname, date }
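
A similar sketch for the second case, again with a hypothetical table name (`user_records_by_name`) and assumed column types. The only change is that `name` becomes part of the clustering key.

```cql
-- Hypothetical table name and types; only the primary key differs
-- from the first case.
CREATE TABLE user_records_by_name (
    user_id            text,
    record_type        text,
    name               blob,       -- now part of the clustering key
    schemaname         text,
    last_modified_date timestamp,
    PRIMARY KEY (user_id, record_type, name)
);

-- Returns every (e1, name) entry stored for the user, not just one.
SELECT name, schemaname, last_modified_date
FROM user_records_by_name
WHERE user_id = '123' AND record_type = 'e1';
```

Because `name` is part of the key here, writing e1 with a different `name` adds a new entry rather than replacing the existing one.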

Does that sound OK?

Finally, I'd say there is a very good official DataStax driver for Cassandra's native transport:

https://github.com/datastax/java-driver

viktortnk
  • In my example, I want to store a column `e1` and, inside it, its value. That value has three parts: the actual `e1` value, a schemaname string, and the last modified date. But in your example, I believe name is the value for the e1 column, right? – arsenal Sep 21 '13 at 23:44
  • Yeah, I supposed it to be ByteArray value from your example. – viktortnk Sep 22 '13 at 10:42
  • OK, so the problem with this is that every time I get a new value for `e1`, it will create a new column for it, and somehow I need to keep track of deleting the old columns related to `e1`. – arsenal Sep 22 '13 at 18:05
  • You can make a timestamp part of the primary key then: PRIMARY KEY (user_id, record_type, timeuuid). You can apply DESC clustering order to the timeuuid column, so querying for the current (most recent) e1 value will be more efficient. – viktortnk Sep 22 '13 at 22:32
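
The timeuuid variant from the last comment might look roughly like this in CQL 3. The table name `user_record_history`, the column name `modified_at`, and the column types are assumptions for illustration. The sketch also assumes the timeuuid can stand in for the last modified date.

```cql
-- Hypothetical names and types; modified_at plays the role of the
-- "timeuuid" column mentioned in the comment.
CREATE TABLE user_record_history (
    user_id     text,
    record_type text,
    modified_at timeuuid,
    name        blob,
    schemaname  text,
    PRIMARY KEY (user_id, record_type, modified_at)
) WITH CLUSTERING ORDER BY (record_type ASC, modified_at DESC);

-- Each write adds a new version; now() generates a fresh timeuuid.
INSERT INTO user_record_history (user_id, record_type, modified_at, name, schemaname)
VALUES ('123', 'e1', now(), 0xcafe, 'abc');

-- Newest version first, so LIMIT 1 returns the current e1 value.
SELECT name, schemaname, modified_at
FROM user_record_history
WHERE user_id = '123' AND record_type = 'e1'
LIMIT 1;
```

Older versions stay in the table under this design, so they still have to be deleted explicitly or written with a TTL if they should expire on their own.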