I'm currently using and researching about data modeling practices in cassandra. So far, I get that you need have a data modeling based on the queries executed. However, multiple select
requirements make data modeling even harder or impossible to handle it on 1 table. So, when you can't handle these requirements on 1 table, you need to insert 2-3 tables. In other words, you need to make multiple inserts on 1 operation.
Currently, I'm dealing with a data model of a campaign structure. I have a campaign table on cassandra with the following cql;
CREATE TABLE campaign_users
(
created_at timeuuid,
campaign_id int,
uid bigint,
updated_at timestamp,
PRIMARY KEY (campaign_id, uid),
INDEX(campaign_id, created_at)
);
In this model, I need to be able to make incremental exports given a timestamp only. In cassandra, there is allow filtering
mode that enables select
queries for secondary indexes. So, my cql statement for incremental export is the following;
select campaign_id, uid
from campaign_users
where created_at > minTimeuuid('2013-08-14 12:26:06+0000') allow filtering;
However, if allow filtering is used, there is a warning saying that the statement have unpredictable performance. So, is it a good practice relying on allow filtering
? What can be other alternatives ?