To anyone who uses the Python stream-framework package, I would like to know your thoughts on using Cassandra. I am currently trying to build a notification feed based on Cassandra that supports unread counts and marking the whole feed as read. It seems like the base stream-framework only supports Redis for a NotificationFeed.

1) To the authors of the framework, can this be done using Cassandra?

2) For anyone else, here is a trimmed down model of the notification feed I am working on:

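# Import path assumes the cqlengine bundled with cassandra-driver;
# older setups import from the standalone cqlengine package instead.
from cassandra.cqlengine import columns
from cassandra.cqlengine.models import Model

class NotificationActivity(Model):  # class name is illustrative
    # Partition key: one partition per feed
    feed_id = columns.Ascii(primary_key=True, partition_key=True)
    # Clustering key: newest activities first
    activity_id = columns.VarInt(primary_key=True, clustering_order='desc')
    created_at = columns.DateTime(required=False)
    group = columns.Ascii(required=False)
    updated_at = columns.DateTime(required=False)
    category_id = columns.Integer(required=False, index=True)
    read_at = columns.DateTime(required=False)
    seen_at = columns.DateTime(required=False)
    read = columns.Boolean(required=False, index=True)
    seen = columns.Boolean(required=False, index=True)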

Each activity in the feed has a read and a seen flag. For any individual activity, it's easy enough to find it via its primary key (the specific feed and the given activity ID) and update the column. However, in Cassandra 2.2+ there is no way to mark a whole feed's worth of activities as read in one statement, since an update requires the full primary key and cannot use a secondary index. (NOTE: In Cassandra 3.0 it seems like you can use the IN operator on the clustering key, so you might be able to do this in two steps: look up activity_id where read=False using the secondary index, then pass the results to a single update query using IN.)
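To make the two-step idea from the note concrete, here is a minimal sketch that only builds the CQL strings and parameters; it does not talk to a cluster. The table name `notification_feed`, the function names, and the chunk size are assumptions for illustration, and whether IN on the clustering key is accepted in an UPDATE depends on the Cassandra version, as noted above. The `%s` placeholders follow the cassandra-driver style for non-prepared statements.

```python
from datetime import datetime, timezone

TABLE = "notification_feed"  # hypothetical table name


def select_unread_statement(feed_id):
    """Step 1: find unread activity ids in one feed via the secondary index on `read`."""
    cql = f"SELECT activity_id FROM {TABLE} WHERE feed_id = %s AND read = false"
    return cql, (feed_id,)


def mark_read_statements(feed_id, activity_ids, read_at=None, chunk_size=100):
    """Step 2: build UPDATE ... IN statements, chunked to keep each IN list small."""
    read_at = read_at or datetime.now(timezone.utc)
    ids = list(activity_ids)
    statements = []
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        placeholders = ", ".join(["%s"] * len(chunk))
        cql = (
            f"UPDATE {TABLE} SET read = true, read_at = %s "
            f"WHERE feed_id = %s AND activity_id IN ({placeholders})"
        )
        statements.append((cql, (read_at, feed_id, *chunk)))
    return statements
```

Each returned pair could then be passed to something like `session.execute(cql, params)`. Note that the two steps are not atomic: an activity added between the SELECT and the UPDATE stays unread.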

I hope this makes sense and if not, I will provide any clarifications needed.

mrquintopolous

1 Answer

Notification feeds with Cassandra are not bundled with stream-framework, but they can be implemented by reusing the existing base classes. To do that, you need to implement the following classes:

  • BaseNotificationFeed
  • BaseListsStorage

and configure your ListsStorage implementation to be used by your notification feed (see here: https://github.com/tschellenbach/Stream-Framework/blob/aba914c71f527dcf43388937002075c851b47897/stream_framework/feeds/notification_feed/base.py#L11)

Regarding the implementation I have a few suggestions:

  • If you can, consider using RedisListsStorage
  • Consider storing unread and unseen activity ids as static columns

For instance:

unread_ids set<text> static
unseen_ids set<text> static
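As a rough illustration of how such static set columns could be used, the sketch below builds the CQL for adding new ids, clearing the unread set (marking everything as read), and counting unread ids client-side. It only produces statement strings and parameters. The table name `notification_feed` and all function names are assumptions, not part of stream-framework; the statements rely on static columns being updatable with only the partition key in the WHERE clause.

```python
TABLE = "notification_feed"  # hypothetical table name


def add_activity_ids_statement(feed_id, activity_ids):
    """Add new ids to both static sets; ids are stored as text, matching set<text>."""
    ids = {str(activity_id) for activity_id in activity_ids}
    cql = (
        f"UPDATE {TABLE} SET unread_ids = unread_ids + %s, "
        f"unseen_ids = unseen_ids + %s WHERE feed_id = %s"
    )
    return cql, (ids, ids, feed_id)


def mark_all_read_statement(feed_id):
    """Marking the whole feed as read is just emptying the static set."""
    cql = f"UPDATE {TABLE} SET unread_ids = {{}} WHERE feed_id = %s"
    return cql, (feed_id,)


def count_unread(unread_ids):
    """Count client-side; Cassandra returns an empty set as null (None in Python)."""
    return len(unread_ids or ())
```

The unread count here is computed from the set read back for the feed, so it stays cheap only while the set is reasonably small.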

Disclaimer: I am one of the maintainers of stream-framework.

Tommaso Barbugli
  • Thanks Tommaso! I just started reading about static columns and I was thinking about them regarding counts, but it makes sense that they would work to hold the list of unread / unseen activities. Then marking all as read is as simple as clearing the list. You said "consider using the RedisListsStorage" ... does that mean in conjunction with cassandra? That could work but would rather avoid having to have multiple dbs (I am still using redis but for ephemeral data). – mrquintopolous Jan 07 '16 at 17:44
  • yes I meant Redis for counters and Cassandra for feeds – Tommaso Barbugli Jan 07 '16 at 19:43