
For example, a user can subscribe to specific categories of films.

When a new film appears in Kafka, I must send information about it to the consumers who subscribe to that film's category.

How should I classify this: using partitions or topics? There can be more than 1000 categories.

1 Answer


Simply create a topic user-subscriptions that holds at least the userID and the categoryID. For each subscription, make sure to create a different record even for the same user. For example,

{"subscriptionID": 1, "userID": 1, "categoryID": 100}
{"subscriptionID": 2, "userID": 1, "categoryID": 26}
...

Now make sure to partition this topic by categoryID so that the records for the same category are placed in the same partition.
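To illustrate what partitioning by key means, here is a minimal in-memory sketch. It assumes the usual "hash of the key modulo the number of partitions" scheme. Kafka's default partitioner uses a murmur2 hash for keyed records; `zlib.crc32` stands in for it here, and `NUM_PARTITIONS` and `partition_for` are hypothetical names for illustration only.

```python
import zlib

NUM_PARTITIONS = 12  # hypothetical partition count, not a recommendation


def partition_for(category_id: int, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a record key to a partition: hash(key bytes) mod partition count.

    crc32 is only a stand-in here; Kafka's default partitioner uses murmur2.
    """
    key_bytes = str(category_id).encode("utf-8")
    return zlib.crc32(key_bytes) % num_partitions


subscriptions = [
    {"subscriptionID": 1, "userID": 1, "categoryID": 100},
    {"subscriptionID": 2, "userID": 1, "categoryID": 26},
    {"subscriptionID": 3, "userID": 2, "categoryID": 100},
]

# Group the records the way a keyed producer would spread them over partitions.
partitions = {}
for record in subscriptions:
    p = partition_for(record["categoryID"])
    partitions.setdefault(p, []).append(record)
```

Records with the same `categoryID` always land in the same partition. With more categories than partitions, several categories will share one partition, so the key alone does not give each category a partition of its own.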

Now, once your application identifies a new film of a particular category, you just need to go through the records of the partition that holds all the subscriptions for that category and, while consuming, notify the users about the new film using the userID.
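The notification step can be sketched without a broker as follows. This is only an in-memory illustration: in a real application the subscription records would come from a Kafka consumer (or Kafka Streams), the shape of the film record is an assumption, and `build_index` and `users_to_notify` are hypothetical helper names. The lookup is done by `categoryID` so that it still works if a partition holds subscriptions for more than one category.

```python
from collections import defaultdict


def build_index(subscription_records):
    """Build a categoryID -> set of userIDs index from subscription records."""
    index = defaultdict(set)
    for record in subscription_records:
        index[record["categoryID"]].add(record["userID"])
    return index


def users_to_notify(index, film):
    """Return the sorted user IDs subscribed to the film's category."""
    return sorted(index.get(film["categoryID"], set()))


subscriptions = [
    {"subscriptionID": 1, "userID": 1, "categoryID": 100},
    {"subscriptionID": 2, "userID": 1, "categoryID": 26},
    {"subscriptionID": 3, "userID": 2, "categoryID": 100},
]

index = build_index(subscriptions)

# Hypothetical film record; the field names are assumptions.
new_film = {"filmID": 7, "title": "Example", "categoryID": 100}

for user_id in users_to_notify(index, new_film):
    print(f"notify user {user_id}: new film {new_film['title']!r}")
```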

Giorgos Myrianthous
  • Do you mean a topic `user-subscriptions` with partitions inside it keyed by `categoryID`? In that case there can be more than 1000 partitions, and as far as I know that is not good for Kafka from a performance point of view, is it? –  May 16 '20 at 19:47
  • Also, about your second remark: should I create a topic `films`, and when a film is added to this topic, get the film's category? And then what next? –  May 16 '20 at 19:49
  • "You just need to go through the records of the partition": which records do you mean? –  May 16 '20 at 19:50
  • @AliceMessis I wouldn’t say so. Partitioning is the ultimate way to achieve high(er) performance in Kafka (assuming your overall configuration and architecture make sense). – Giorgos Myrianthous May 16 '20 at 19:51
  • @AliceMessis Go through the subscriptions for that category (i.e. the partition) in order to get all the user IDs that are subscribed to that category. – Giorgos Myrianthous May 16 '20 at 19:51
  • How do I do that? By using streams and putting the filtered data into a users topic partitioned by user_id? –  May 16 '20 at 19:52
  • @AliceMessis Kafka Streams or a simple Kafka consumer. – Giorgos Myrianthous May 16 '20 at 19:53
  • If I use Kafka Streams, where do I put the filtered data (users_id)? Or do I need to add the found films to another partition with key user_id? –  May 16 '20 at 19:55
  • @AliceMessis If you have one partition per category you don’t need to filter anything. Just consume all the messages in that partition. – Giorgos Myrianthous May 16 '20 at 20:01
  • Do you mean a partition per category for the films topic, `Films.100 -> [0,1,2...]`? Or for user_subscriptions, `user_subscriptions.100 -> [0,1,2...]`? –  May 16 '20 at 20:03
  • I don't understand how to organize notifying subscribers that a new film has arrived using streams, or where to store the filtered data (films) after I get the user_id. –  May 16 '20 at 20:09
  • @AliceMessis Sorry but I am not sure I follow. – Giorgos Myrianthous May 16 '20 at 22:18