I would like your opinion on the data model for my project, a real-time recommendation system. It has several collections of recommendation algorithms, something like this:
collection1 { algorithm1, algorithm5, algorithm6 }
collection2 { algorithm5, algorithm6, algorithm7, algorithm8 }
etc.
For every algorithm in a collection I need to store data such as success, probability of being chosen, score and so on, every 2 minutes. I chose Cassandra as my data store because it works well with time series. The stored data will be displayed later in graphs and charts. Do you think my data model is OK? I designed it this way:
CREATE TABLE algorithm_by_collection_and_date (
    algorithm_id text,
    collection_id text,
    date text,
    event_time timestamp,
    score double,
    probability double,
    PRIMARY KEY ((algorithm_id, collection_id, date), event_time)
);
So it uses row partitioning: the date is added to the row key to limit the number of columns per algorithm in a collection to one day's worth.
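To make the bucketing concrete, here is a rough sketch in Python of how large one partition gets and how the partition key would be derived from an event timestamp. It is only an illustration, not part of my actual code: the function names, the UTC day boundary and the YYYY-MM-DD format of the date column are assumptions.

```python
from datetime import datetime, timezone

SAMPLE_INTERVAL_SECONDS = 120  # one measurement every 2 minutes
SECONDS_PER_DAY = 24 * 60 * 60


def rows_per_daily_partition(interval_seconds: int = SAMPLE_INTERVAL_SECONDS) -> int:
    """Number of event_time rows that land in one (algorithm, collection, date) partition."""
    return SECONDS_PER_DAY // interval_seconds


def partition_key(algorithm_id: str, collection_id: str, event_time: datetime) -> tuple:
    """Build the composite partition key; the day bucket is assumed to be a UTC 'YYYY-MM-DD' string."""
    day = event_time.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (algorithm_id, collection_id, day)


# Hypothetical sample: the last measurement of an arbitrary example day.
sample_time = datetime(2014, 3, 1, 23, 58, tzinfo=timezone.utc)

print(rows_per_daily_partition())  # 720
print(partition_key("algorithm5", "collection1", sample_time))
# ('algorithm5', 'collection1', '2014-03-01')
```

So with one measurement every 2 minutes, each partition would hold at most 720 rows, and a chart for one algorithm in one collection over N days would need to read N partitions.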
What do you think about this? Thanks, Jan