As far as I understand, the orderingKey determines the partition to which a message is published within a particular topic. I also understand that affinity is provided for an ordering key, i.e. messages with a given ordering key are routed to the same subscriber instance. Now my question:

Does a subscription need to enable ordering, i.e. ensure in-order delivery of messages, for there to be affinity for an ordering key? My use case does not really require serial delivery of messages with a particular ordering key, but it does require affinity (all messages with the same ordering key are delivered to the same subscriber instance). In other words, I need a way to ensure that a given partition is only processed by a single subscriber, but I do not care about the order of messages within that partition.

muffe

1 Answer

Even with ordered delivery enabled, the affinity in Cloud Pub/Sub is best-effort. It is possible for keys to shift among different subscribers as long as there are no messages for that key currently outstanding. This best-effort affinity only exists for subscriptions with ordering enabled. For unordered subscriptions, the ordering key is essentially ignored and messages are delivered to subscribers arbitrarily. Currently, the best way to achieve affinity is to use attributes and filtering, where subscriptions examine the same attribute and look for different values.

Kamal Aboul-Hosn
  • Thank you! That answers my question :) Best effort is more than enough, as long as messages delivered within a short time frame mostly go to the same instance. That lets me batch process the messages of one partition in a single transaction and avoid transaction contention :) – muffe Mar 10 '22 at 11:54
  • Could you elaborate on what you mean by using attributes and filtering? If I understand you correctly, I assume you would need a separate subscription for each consumer in order to get affinity? – muffe Mar 10 '22 at 12:01
  • Correct, you would have a subscription per filter. On publishing messages, you attach an attribute based on how many subscribers you want to process the messages. So maybe you take an ID, mod it by 10 and set an attribute "partition" to that value. Then you have ten subscriptions, each with a filter attributes.partition = "". You have one consumer per subscription and so all messages for the same partition go to the same subscriber. This does break the separation of publisher and subscriber a bit because the publisher needs to be aware of how many subscribers/subscriptions exist. – Kamal Aboul-Hosn Mar 10 '22 at 14:01
  • @KamalAboul-Hosn using filter attributes works well when the number of subscribers is fixed (a Kubernetes StatefulSet, for example), but in an environment that needs to auto-scale the number of subscriber instances based on load, ordering keys seem to be the best way to achieve this. I have the same problem as "muffe": I want to reduce write contention in the database by having a 1-to-1 ratio between worker threads and exclusive write locks on the database record being modified as a result of consuming a Pub/Sub message – murungu Mar 03 '23 at 18:50
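
The attribute-and-filter approach described in the comments above could be sketched roughly as follows. This is only an illustration, not code from the answer: the helper names (`partition_for`, `message_attributes`, `subscription_filter`), the choice of CRC32 as the hash, and the constant `NUM_PARTITIONS` are assumptions. Only the attribute name "partition" and the "mod 10" idea come from the thread. The sketch computes the strings you would hand to Pub/Sub and does not call the client library itself.

```python
import zlib

# Assumption: ten partitions, matching the "mod it by 10" example in the thread.
# The publisher and the set of subscriptions must agree on this number.
NUM_PARTITIONS = 10


def partition_for(entity_id: str, num_partitions: int = NUM_PARTITIONS) -> str:
    """Map an ID to a stable partition label.

    CRC32 is used (an assumption) because it is stable across processes,
    unlike Python's built-in hash() for strings. Pub/Sub attribute values
    are strings, so the result is returned as a string.
    """
    return str(zlib.crc32(entity_id.encode("utf-8")) % num_partitions)


def message_attributes(entity_id: str) -> dict:
    """Attributes the publisher would attach to each message."""
    return {"partition": partition_for(entity_id)}


def subscription_filter(partition: str) -> str:
    """Filter expression for the one subscription that owns this partition."""
    return f'attributes.partition = "{partition}"'


# One filter per subscription; each subscription gets exactly one consumer.
filters = [subscription_filter(str(p)) for p in range(NUM_PARTITIONS)]
```

With the google-cloud-pubsub client, the attributes would typically be passed as keyword arguments to `publish`, and each filter string would be set when its subscription is created. As noted in the thread, the publisher has to know how many subscriptions exist, so changing `NUM_PARTITIONS` means recreating the subscriptions.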