1

I am trying to model time series data with many sensors (> 50k) with cassandra. As I would like to do filtering on multiple sensors at the same time, I thought using the following (wide row) schema might be suitable:

CREATE TABLE data(
 time timestamp, 
 session_id int, 
 sensor text, 
 value float, 
 PRIMARY KEY((time, session_id), sensor)
);

If every sensor value was a column in an RDBMS, my query would ideally look like:

SELECT * FROM data WHERE sensor_1 > 10 AND sensor_2 < 2;

Translated to my cassandra schema, I assumed the query might look like:

SELECT * FROM data 
WHERE 
  sensor = 'sensor_1' AND 
  value > 10 AND 
  sensor = 'sensor_2' AND 
  value < 2;

I now have two problems:

  1. cassandra tells me that I can filter on the sensor column only once:

sensor cannot be restricted by more than one relation if it includes an Equal

  1. Obviously, the filter on value doesn't make sense at the moment. I wouldn't know how to express the relationship between sensor and value in the query in order to filter multiple columns in the same (wide) row.

I do know that a solution to the first question would be to use CQL's IN clause. This however doesn't solve the second problem.

Is this scenario even suitable for cassandra?

Many thanks in advance.

Community
  • 1
  • 1
Calardan
  • 127
  • 1
  • 10

1 Answers1

1

You could try to use IN clause here. So your query would be like this:

SELECT * FROM data 
WHERE time = <time> and session_id = <session id> 
    AND sensor IN ('sensor_1', 'sensor_2') 
    AND value > 10 AND value < 2
Arthur Landim
  • 384
  • 2
  • 9
  • But how would i make sure that the value for 'sensor_1' would be greater than 10 and the value for 'sensor_2' would be less than 2? It seems to me that actually any of the 50k sensor values could match that criteria. – Calardan Aug 09 '17 at 08:11
  • I understand your question now. You have to design your table with an select driven approach. If you share what information you want to get we can suggest a more suitable table design for you. – Arthur Landim Aug 10 '17 at 13:25
  • I want to find all sessions in which sensors have specific values at the same point in time. For example, I want to find the sessions in which the temperature was above 10 degrees and the humidity was less than 40% at the same time. – Calardan Aug 11 '17 at 13:47