I am designing a Cassandra column family (table) schema for the use case below, and I am not sure what the best design is. I will be using CQL with the DataStax Java driver.
Below is my use case and the sample schema I have designed so far:
SCHEMA_ID  RECORD_NAME  SCHEMA_VALUE       TIMESTAMP
1          ABC          some value         t1
2          ABC          some_other_value   t2
3          DEF          some value again   t3
4          DEF          some other value   t4
5          GHI          some new value     t5
6          IOP          some values again  t6
What I need from the above table is the following:
- The first time my application runs, I will ask for everything in the table, i.e. a full read of every row.
- Then, every 5 or 10 minutes, a background thread will check the table and ask only for the rows that have changed (the full row if anything in it changed). That is why I have a timestamp as one of the columns.
But I am not sure how to design the table and the query pattern so that both use cases are satisfied easily. I am thinking of using SCHEMA_ID as the primary key.
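To make the two access patterns concrete, here is a minimal sketch of the behaviour I want on the application side. It uses a plain in-memory map instead of Cassandra, so the class `SchemaPollSketch`, its methods, and the numeric timestamps are hypothetical names and values of mine and not part of the DataStax driver. The open question is which table design lets CQL answer the second call efficiently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the table; no Cassandra or driver calls are made.
final class SchemaPollSketch {

    // One row: SCHEMA_ID, RECORD_NAME, SCHEMA_VALUE, TIMESTAMP (epoch millis).
    static final class Row {
        final String schemaId;
        final String recordName;
        final String schemaValue;
        final long lastModified;

        Row(String schemaId, String recordName, String schemaValue, long lastModified) {
            this.schemaId = schemaId;
            this.recordName = recordName;
            this.schemaValue = schemaValue;
            this.lastModified = lastModified;
        }
    }

    // The map key models SCHEMA_ID as the primary key.
    private final Map<String, Row> table = new LinkedHashMap<>();

    // Highest timestamp already handed to the application (high-water mark).
    private long lastSeen = Long.MIN_VALUE;

    // Writing a row whose SCHEMA_ID is already present replaces the old row in this map.
    void upsert(Row row) {
        table.put(row.schemaId, row);
    }

    // Use case 1: on startup, return every row.
    List<Row> loadAll() {
        List<Row> rows = new ArrayList<>(table.values());
        advance(rows);
        return rows;
    }

    // Use case 2: on each poll, return only rows modified after the last poll.
    List<Row> loadChanged() {
        List<Row> changed = new ArrayList<>();
        for (Row row : table.values()) {
            if (row.lastModified > lastSeen) {
                changed.add(row);
            }
        }
        advance(changed);
        return changed;
    }

    // Move the high-water mark forward to the newest timestamp just returned.
    private void advance(List<Row> rows) {
        for (Row row : rows) {
            lastSeen = Math.max(lastSeen, row.lastModified);
        }
    }

    public static void main(String[] args) {
        SchemaPollSketch sketch = new SchemaPollSketch();
        sketch.upsert(new Row("1", "ABC", "some value", 100L));
        sketch.upsert(new Row("2", "ABC", "some_other_value", 200L));
        System.out.println(sketch.loadAll().size());     // prints 2 (full load)
        sketch.upsert(new Row("1", "ABC", "changed value", 300L));
        System.out.println(sketch.loadChanged().size()); // prints 1 (only the updated row)
    }
}
```

Note that the strict `>` comparison in this sketch would miss a row written with exactly the same timestamp as the high-water mark, so the real query would probably need some overlap between polls.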
Update:
If I use something like this, is there any problem with the approach?
CREATE TABLE TEST (SCHEMA_ID TEXT, RECORD_NAME TEXT, SCHEMA_VALUE TEXT, LAST_MODIFIED_DATE TIMESTAMP, PRIMARY KEY (SCHEMA_ID));
INSERT INTO TEST (SCHEMA_ID, RECORD_NAME, SCHEMA_VALUE, LAST_MODIFIED_DATE) VALUES ('1', 't26', 'SOME_VALUE', 1382655211694);
I ask because, in my use case, I don't want anybody to insert the same SCHEMA_ID more than once. SCHEMA_ID should be unique whenever a new row is inserted into this table. With your example (@omnibear), wouldn't it be possible for somebody to insert the same SCHEMA_ID twice? Am I correct?
Also, regarding the type that you added as an extra column: that type column can be record_name in my example.