I am storing historical timeseries data into Cassandra.
cassandra@cqlsh>CREATE TABLE data."InstrumentTimeSeries" (
key blob,
column1 bigint,
value blob,
PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE
AND bloom_filter_fp_chance = 0.01
AND comment = ''
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE'
AND caching = {
'keys' : 'ALL',
'rows_per_partition' : 'NONE'
}
AND compression = {
'sstable_compression' : ''
}
AND compaction = {
'class' : 'SizeTieredCompactionStrategy'
};
In column 'value' I am storing whole json after serializing that. Now I want to fetch records on base of some keys from json stored in value column.
- I can create a duplicate table by adding new clustering columns as per requirements. or
- I can think of Secondary INDEX
I have 20+ million of records in table.
What would be the best way to do that?