Is ALLOW FILTERING in Cassandra efficient for the following query?

Question

I have a table like this:

CREATE TABLE IF NOT EXISTS Posts (
    idObject int,
    objectType text,
    idParent uuid,
    id uuid,
    idResolution uuid,
    PRIMARY KEY ((idObject, objectType, idParent), id)
);

Now have a look at the following query:

SELECT * FROM Posts WHERE idObject = 1 AND objectType = 'COURSE' AND idParent = 00000000-0000-0000-0000-000000000000 AND idResolution = 00000000-0000-0000-0000-000000000000 ALLOW FILTERING;

The partition key is fully specified here, so the filtering happens inside a single, known partition. Given that, will using ALLOW FILTERING still cause a performance problem?

Adding a secondary index on the idResolution column can be a good alternative to ALLOW FILTERING from a performance point of view. – Mikhail Baksheev, Dec 28 '16 at 15:32
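
A minimal sketch of what the comment above suggests, using the table and column from the question. The index name posts_idresolution_idx is hypothetical, and whether this actually beats ALLOW FILTERING for a given partition size is something to measure rather than assume.

```cql
-- Hypothetical index name; the table and column come from the question.
CREATE INDEX IF NOT EXISTS posts_idresolution_idx ON Posts (idResolution);

-- With the index in place, the same restriction no longer needs ALLOW FILTERING.
SELECT * FROM Posts
WHERE idObject = 1
  AND objectType = 'COURSE'
  AND idParent = 00000000-0000-0000-0000-000000000000
  AND idResolution = 00000000-0000-0000-0000-000000000000;
```

Because the full partition key is still supplied, the indexed lookup stays within that one partition.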

score 2 · Accepted Answer · answered Dec 28 '16 at 14:49

It depends on how many rows are in that particular partition, and on whether they are spread across multiple SSTable files. But as you said, this query is guaranteed to be limited to a single node, so it might be OK.

I'd test it out with cassandra-stress, just to be sure. That way you'll know if the query latency is acceptable to your application.

Aaron

Thanks @Aaron. Can you comment on the claim that this query will be limited to a single node? As I understand it, it should be, because the complete partition key is constrained and known. – Ashutosh Dec 29 '16 at 05:57

score 2 · Answer 2 · answered Dec 28 '16 at 20:52

For a large partition, you are probably better off using the DataStax driver paging API. https://docs.datastax.com/en/developer/java-driver/2.1/manual/paging/

With a huge partition, requesting an unbounded result set could cause problems on the application side. Be safe and page on.

Patrick McFadin

Thanks Patrick! So if I use paging, will the performance of this query be fine? – Ashutosh Dec 29 '16 at 05:59
The performance will be related to the page size you fetch and not the entire partition size. Given that, you should see much better performance. So that means you should pay close attention to your fetch size as a perf tuning consideration. – Patrick McFadin Jan 27 '17 at 21:07
