How to design a Cassandra schema so that additional columns can easily be added later?

Question

I have defined the table structure shown below:

    CREATE TABLE sensor_data (
        asset_id text,
        event_time timestamp,
        sensor_type int,
        temperature int,
        humidity int,
        voltage int,
        co2_percent int,
        PRIMARY KEY (asset_id, event_time)
    ) WITH CLUSTERING ORDER BY (event_time ASC);

This table captures data coming from sensors. Depending on the type of sensor (the sensor_type column), some columns will have a value and others will not. For example, temperature only applies to a temperature sensor, humidity only applies to a humidity sensor, and so on.

As I work with more and more sensors, my intention is to simply add columns using the ALTER TABLE command. Is this a correct strategy to follow, or are there better ways to design this table for future use?
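
As a minimal sketch of the strategy described above: adding a column for a new sensor type would look roughly like this. The pressure column and the sample values are hypothetical and only serve as illustration.

```cql
-- Hypothetical new sensor type: add one column for its reading.
-- Rows written before this change simply return null for the new column.
ALTER TABLE sensor_data ADD pressure int;

-- A reading from the new sensor fills only the columns that apply to it.
-- The asset id, timestamp, and sensor_type value are made-up sample data.
INSERT INTO sensor_data (asset_id, event_time, sensor_type, pressure)
VALUES ('asset-1', '2014-08-03 06:27:00+0000', 5, 1013);
```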

score 2 · Accepted Answer · edited May 23 '17 at 12:29

I answered a similar question a few hours ago: here

Assuming you're on Cassandra 2.x, your situation is easier to handle. To do what you need, I'd use a map:

CREATE TABLE sensor_data (
  asset_id text,
  event_time timestamp,
  sensor_type int,
  sensor_info map<text, int>,
  PRIMARY KEY (asset_id, event_time)
) WITH CLUSTERING ORDER BY (event_time ASC);

The advantage is that your schema stays the same even when new sensors come into your world. The disadvantage is that you can't retrieve a single entry from the collection; you always retrieve the collection in its entirety. If you're on Cassandra 2.1, secondary indexes on collections might help.
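
A rough sketch of how the map-based table might be used, assuming Cassandra 2.1 and the sensor_data table defined above. The asset id, timestamp, readings, and the index name sensor_info_keys_idx are made up for illustration.

```cql
-- Write a reading: the map key names the measurement, so a new kind of
-- sensor needs no schema change.
INSERT INTO sensor_data (asset_id, event_time, sensor_type, sensor_info)
VALUES ('asset-1', '2014-08-03 06:27:00+0000', 1, {'temperature': 21});

-- Add or overwrite a single map entry on an existing row.
UPDATE sensor_data
SET sensor_info['humidity'] = 40
WHERE asset_id = 'asset-1' AND event_time = '2014-08-03 06:27:00+0000';

-- Reading returns the whole map for each row, not one entry.
SELECT event_time, sensor_info FROM sensor_data WHERE asset_id = 'asset-1';

-- Cassandra 2.1: index the map keys, then filter on the presence of a key.
CREATE INDEX sensor_info_keys_idx ON sensor_data (KEYS(sensor_info));
SELECT * FROM sensor_data WHERE sensor_info CONTAINS KEY 'temperature';
```

Note that the index is created on the map column with KEYS(...), not on sensor_type, which is a plain int column and would be indexed without the KEYS wrapper.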

HTH, Carlo

answered Aug 03 '14 at 06:27

Carlo Bertuccini

That certainly helps and makes sense. However, when I try to create an index on sensor_type using the command CREATE INDEX my_idx ON sensor_info_table KEY(sensor_type); I get an error. – Subodh Nijsure Aug 03 '14 at 12:44
Indexing on collections is supported only from 2.1 -- what version are you using? – Carlo Bertuccini Aug 03 '14 at 12:45
I am using Cassandra from the package apache-cassandra-2.1.0-rc3 – Subodh Nijsure Aug 03 '14 at 12:53
CREATE INDEX my_idx ON sensor_info_table (KEY(sensor_type)); This should work – Carlo Bertuccini Aug 03 '14 at 13:12
I tried that and I get an error with CREATE INDEX my_idx ON sensor_info_table (KEY(sensor_type)); For what it's worth, my cqlsh shows version number cqlsh 5.0.1, if that makes a difference. – Subodh Nijsure Aug 03 '14 at 13:29
Sorry, my mistake ... KEYS, not KEY – Carlo Bertuccini Aug 03 '14 at 13:39
Yup, that works. Thank you so much, Carlo, and my apologies for not looking into the syntax for CREATE INDEX myself. – Subodh Nijsure Aug 03 '14 at 13:50
