
I have a pubsub topic with a number of pull subscriptions. I would like some mechanism where I can publish a message with a "priority" label that causes the message to jump as near to the front of the queue as possible.

I don't need any guaranteed ordering semantics, just a "best effort" prioritization mechanism.

Is anything like this possible with pubsub?

Alex Flint
  • GCP Pub/Sub is a subset of the logical functions of other Pub/Sub engines. For example, if we look at JMS as an alternative Pub/Sub technology, we find the notion of "Setting Message Priority Levels" (https://javaee.github.io/tutorial/jms-concepts004.html). However, such a concept isn't present in GCP Pub/Sub. – Kolban Jan 13 '20 at 04:37

2 Answers


No such mechanism exists within Google Cloud Pub/Sub, no. Such a feature really only becomes relevant if your subscribers are not able to keep up with the rate of publishing and consequently, a backlog is building up. If subscribers are keeping up and processing and acking messages quickly, then the notion of "priority" messages isn't really necessary.

If a backlog is building up and some messages need to be processed with higher priority, then one approach is to create a "high-priority" topic and subscription. The subscribers subscribe to this subscription as well as the "normal" subscription and prioritize processing messages from the "high-priority" subscription whenever they arrive.
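
For example, the publisher side could route each message to one of the two topics based on an application-level priority flag, rather than relying on a message attribute that Pub/Sub itself would never act on. A minimal sketch (the project and topic names are only illustrative):

from google.cloud import pubsub

publisher = pubsub.PublisherClient()
# Illustrative project/topic names; substitute your own
normal_topic = publisher.topic_path("my-project", "work-items")
priority_topic = publisher.topic_path("my-project", "work-items-high-priority")

def publish(data: bytes, high_priority: bool = False) -> str:
    # Route to the dedicated high-priority topic; Pub/Sub will not reorder
    # messages within a single topic based on attributes
    topic = priority_topic if high_priority else normal_topic
    return publisher.publish(topic, data).result()  # returns the message ID

Keeping the routing decision at the publisher means nothing extra is needed on the Pub/Sub side beyond the additional topic and subscription.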

Kamal Aboul-Hosn

Providing an example implementation of @Kamal's answer, to add more context to:

...prioritize processing messages from the "high-priority" subscription whenever they arrive

import logging
import threading

from google.cloud import pubsub
from google.cloud.pubsub_v1.types import FlowControl

logging.basicConfig(format="%(asctime)s %(message)s", level=logging.INFO)

# Fully qualified subscription paths: "projects/<project>/subscriptions/<name>"
priority_subscription = "projects/<project>/subscriptions/<priority-subscription>"
batch_subscription = "projects/<project>/subscriptions/<batch-subscription>"

# Condition variable guarding n_priority_messages, the number of priority
# messages currently being handled by this subscriber
c = threading.Condition()
n_priority_messages = 0

def handle_message(message):
    # Placeholder for application-specific processing; ack once the work is done
    message.ack()

# Handle a priority message immediately, tracking how many are in flight
def priority_callback(message):
    logging.info(f"PRIORITY received: {message.message_id}")
    global n_priority_messages
    c.acquire()
    n_priority_messages += 1
    c.release()
    handle_message(message)
    logging.info(f"PRIORITY handled: {message.message_id}")
    c.acquire()
    n_priority_messages -= 1
    if n_priority_messages == 0:
        c.notify_all()
    c.release()

# Handle a batch message only when no priority messages are in flight;
# otherwise keep extending its ack deadline and wait
def batch_callback(message):
    logging.info(f"BATCH received: {message.message_id}")
    done = False
    modify_count = 0
    global n_priority_messages
    while not done:
        c.acquire()
        priority_queue_is_empty = n_priority_messages == 0
        c.release()
        if priority_queue_is_empty:
            handle_message(message)
            logging.info(f"BATCH handled: {message.message_id}")
            done = True
        else:
            message.modify_ack_deadline(15)
            modify_count += 1
            logging.info(
                f"BATCH modifyed deadline: {message.message_id} - count: {modify_count}"
            )
            c.acquire()
            c.wait(timeout=10)
            c.release()

subscriber = pubsub.SubscriberClient()

subscriber.subscribe(
        subscription=batch_subscription,
        callback=batch_callback,
        # adjust according to latency/throughput requirements
        flow_control=FlowControl(max_messages=5)
)

pull_future = subscriber.subscribe(
        subscription=priority_subscription,
        callback=priority_callback,
        # adjust according to latency/throughput requirements
        flow_control=FlowControl(max_messages=2)
)

# Block the main thread; message callbacks run on the subscriber's background threads
pull_future.result()

Example output when there is a backlog of priority and batch messages:

...
2021-07-29 10:25:00,115 PRIORITY received: 2786647736421842
2021-07-29 10:25:00,338 PRIORITY handled: 2786647736421841
2021-07-29 10:25:00,392 PRIORITY received: 2786647736421843
2021-07-29 10:25:02,899 BATCH modified deadline: 2786667941800415 - count: 2
2021-07-29 10:25:03,016 BATCH modified deadline: 2786667941800416 - count: 2
2021-07-29 10:25:03,016 BATCH modified deadline: 2786667941800417 - count: 2
2021-07-29 10:25:03,109 BATCH modified deadline: 2786667941800418 - count: 2
2021-07-29 10:25:03,109 BATCH modified deadline: 2786667941800419 - count: 2
2021-07-29 10:25:03,654 PRIORITY handled: 2786647736421842
2021-07-29 10:25:03,703 PRIORITY received: 2786647736421844
2021-07-29 10:25:03,906 PRIORITY handled: 2786647736421843
2021-07-29 10:25:03,948 PRIORITY received: 2786647736421845
2021-07-29 10:25:07,212 PRIORITY handled: 2786647736421844
2021-07-29 10:25:07,242 PRIORITY received: 2786647736421846
2021-07-29 10:25:07,459 PRIORITY handled: 2786647736421845
2021-07-29 10:25:07,503 PRIORITY received: 2786647736421847
2021-07-29 10:25:10,764 PRIORITY handled: 2786647736421846
2021-07-29 10:25:10,807 PRIORITY received: 2786647736421848
2021-07-29 10:25:11,004 PRIORITY handled: 2786647736421847
2021-07-29 10:25:11,061 PRIORITY received: 2786647736421849
2021-07-29 10:25:12,900 BATCH modified deadline: 2786667941800415 - count: 3
2021-07-29 10:25:13,016 BATCH modified deadline: 2786667941800416 - count: 3
2021-07-29 10:25:13,017 BATCH modified deadline: 2786667941800417 - count: 3
2021-07-29 10:25:13,110 BATCH modified deadline: 2786667941800418 - count: 3
2021-07-29 10:25:13,110 BATCH modified deadline: 2786667941800419 - count: 3
2021-07-29 10:25:14,392 PRIORITY handled: 2786647736421848
2021-07-29 10:25:14,437 PRIORITY received: 2786647736421850
2021-07-29 10:25:14,558 PRIORITY handled: 2786647736421849
...
  • Well I wouldn't really recommend doing something this sophisticated. You might starve your non-priority queue with this approach, particularly as you start to handle more messages. Also, this approach may not work quite as expected when distributed over multiple servers. – Alex Flint Sep 16 '21 at 15:37
  • @AlexFlint I see your point about starving the non-priority queue. `batch_callback(message)` should release a message rather than modifying its deadline indefinitely. The subscriber described above is meant to be run on each pod of a horizontally scaling deployment. So ideally, the messages in the non-priority queue would be handled once the deployment scales up to a sufficient size. – Joshua Postel Sep 22 '21 at 20:10
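A minimal sketch of that nack-based variant (a hypothetical batch_callback_with_nack that could be passed to subscriber.subscribe in place of batch_callback; it reuses c, n_priority_messages, handle_message, and logging from above, and the 30-second bound is an arbitrary choice):

def batch_callback_with_nack(message, max_wait_seconds=30):
    logging.info(f"BATCH received: {message.message_id}")
    with c:
        # Wait (bounded) until no priority messages are being handled
        no_priority_in_flight = c.wait_for(
            lambda: n_priority_messages == 0, timeout=max_wait_seconds
        )
    if no_priority_in_flight:
        handle_message(message)
        logging.info(f"BATCH handled: {message.message_id}")
    else:
        # Release the message instead of extending its deadline indefinitely;
        # Pub/Sub will redeliver it later, possibly to a less busy subscriber
        message.nack()
        logging.info(f"BATCH nacked: {message.message_id}")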