Cause
You are getting this error because your query doesn't have a filter on primary key columns:
InvalidRequest: Error from server: code=2200 [Invalid query] \
message="Cannot execute this query as it might involve data filtering and thus may have \
unpredictable performance. If you want to execute this query despite the performance \
unpredictability, use ALLOW FILTERING"
Neither the trip date nor the trip time are primary key columns for the tables so it is not possible to query using these columns.
Warning
The ALLOW FILTERING
clause enables filtering on non-primary key columns by performing a full table scan, querying every single partition on all nodes so it is very expensive and unpredictable.
The ALLOW FILTERING
clause is only recommended for use when the query is restricted to a single partition.
Workaround
In order to query against non-primary key columns, you need to index the columns. To illustrate with an example, here's my table which has the trip id
as the primary key:
CREATE TABLE stackoverflow.trips_by_id (
id text PRIMARY KEY,
tripdate date,
triptime int
)
If I want to run queries using either tripdate
or triptime
, I need to index these columns with:
CREATE CUSTOM INDEX tripdate_idx ON stackoverflow.trips_by_id (tripdate);
CREATE CUSTOM INDEX triptime_idx ON stackoverflow.trips_by_id (triptime);
Now that I have indexed them, I can execute queries like:
SELECT ... FROM trips_by_id
WHERE tripdate = ?
AND triptime = ?
WARNING: Be aware that indexing has its own issues so be aware of the pros and cons. Have a look at When to use and not use an index for details.
Solution
Cassandra is designed for high throughput, high velocity online transaction (OLTP) use cases where you are retrieving data one partition at a time (queries filtered by partition key).
In contrast, your query is analytics (OLAP) in nature because you are not reading just one partition -- you are scanning through the whole table. As such, the best way to run analytics queries is to use Apache Spark with the Spark Cassandra connector. Cheers!
Please support the Apache Cassandra community by hovering over the cassandra tag then click on the Watch tag
button. Thanks!