0

This is the query I used to create the table:

CREATE TABLE test.comments (msguuid timeuuid, page text, userid text, username text, msg text, timestamp int, PRIMARY KEY (msguuid));

I want to get the last 50 rows sorted by timestamp in descending order.

If I try something like: SELECT * FROM test.comments WHERE page = 'test' AND timestamp < 1496468332, I get this error:

Error from server: code=2200 [Invalid query] message="Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"

I'd prefer not to use allow filtering, I want the query to be as quick as possible.

I've looked at another stackoverflow question here Cassandra cql: how to select the LAST n rows from a table and tried the solution:

CREATE TABLE test.comments (msguuid timeuuid, page text, userid text, username text, msg text, timestamp int, PRIMARY KEY (msguuid)) WITH CLUSTERING ORDER BY (msguuid DESC);

But then I get this error: InvalidRequest: Error from server: code=2200 [Invalid query] message="Only clustering key columns can be defined in CLUSTERING ORDER directive"

I'm new to Cassandra, so forgive me if this has an obvious answer. I just can't seem to get it to work.

I would highly appreciate if anyone could help me.

J Del
  • 831
  • 2
  • 15
  • 31

2 Answers2

1

Instead of using index create a Materialized View

Create a materialized view with page as partition key and msguuid as clustering key order by desc.

CREATE MATERIALIZED VIEW test.comments_by_page AS
    SELECT *
    FROM test.comments
    WHERE page IS NOT NULL AND msguuid IS NOT NULL
    PRIMARY KEY (page, msguuid)
    WITH CLUSTERING ORDER BY (msguuid DESC);

Though you are using msguuid as current timestamp's timeuuid, you data will sorted by time desc.

To get last 50 row of a page use the below query :

SELECT * FROM comments_by_page WHERE page = 'test' LIMIT 50;

Check this link to know the performance of Materialized View over Index and when not to use : http://www.datastax.com/dev/blog/materialized-view-performance-in-cassandra-3-x

Ashraful Islam
  • 12,470
  • 3
  • 32
  • 53
1

In cassandra world try to model your table based on the queries it needs to satisfy. If the query always goes by where clause "page" and msguuid is only there for uniqueness redesign the table to something like as follows

CREATE TABLE test.comments (msguuid timeuuid, page text, userid text, username text, msg text, timestamp int, PRIMARY KEY (page, msguuid), WITH CLUSTERING ORDER BY (msguuid DESC));

Now the table is naturally order by msguuid and there isn't any requirement to have additional overhead of Materialized View.

dilsingi
  • 2,938
  • 14
  • 24
  • 1
    Thanks, I'm glad that there's a way to do it without the overhead. I marked yours as the answer instead. – J Del Jun 04 '17 at 03:49
  • @dilsingi My Solution will not not generate tombstone. MV only generate tombstone If the MV's primary key `page` and `msguuid` updated on the base table. For his case there is no chance to to update the value of `page` because it's his primary key and also `msguuid` will not be updated because it's current timestamp timeuuid. Learn More before advice anyone https://opencredo.com/everything-need-know-cassandra-materialized-views/ – Ashraful Islam Jun 04 '17 at 04:08
  • @AshrafulIslam As per the original table definition, it never had "page" as primary key. Please check. So it could have been updated and the MV would result in tombstones. – dilsingi Jun 04 '17 at 06:44
  • In that case what about your (@dilsingi) design ? If a page updated then all the msguuid and associated data of that page need to be deleted and re insert again – Ashraful Islam Jun 04 '17 at 06:49
  • @AshrafulIslam Its true that for any table in Cassandra that primary keys can't be modified. But why create a MV when the original table could satisfy the use case. So model the table around the queries and not vice-versa. – dilsingi Jun 04 '17 at 06:56
  • @dilsingi J Del show his main table, i though he don't want to change his main table. This is not the case, Why you mention my answer will generate tombstone ? – Ashraful Islam Jun 04 '17 at 07:48
  • It's true that I didn't have `page` set as a primary key in the original question. It should have been a primary key. I don't plan on ever updating `page` in this case. I apologise for the error. I've remarked `Ashraful Islam`'s answer as the solution. – J Del Jun 04 '17 at 21:44
  • @JDel Have removed my remarks around the tombstones in MV, as the original table itself should have the PAGE as part of the primary key. Still MV is an overhead, when the actual redesigned table alone can itself serve the purpose. – dilsingi Jun 05 '17 at 18:13