3

Here's a theoretical question:

When I'm building an application using message queueing, I'm going to need multiple queues to support different data types for different purposes. Let's assume I have 20 queues (e.g. one to create new users, one to process new orders, one to edit user settings, etc.).

I'm going to deploy this to Windows Azure using the 'minimum' of 1 web role and 1 worker role.

How does one read from all those 20 queues in a proper way? This is what I had in mind, but I have little or no real-world practical experience with this:

Create a class that spawns 20 threads in the worker role 'main' class. Let each of these threads execute a method to poll a different queue, and let all those threads sleep between each poll (of course with a back-off mechanism that increases the sleep time).

This leads to 20 threads (or 21?) and 20 queues being actively polled, which wastes a lot of messages (each time you poll an empty queue, it is billed as a message).

How do you solve this problem?

knightpfhor
Leon Cullens

3 Answers

5

I read the other answers (very good answers) and wanted to put my own spin on this.

Sticking with Windows Azure Queues, as @Lucifure was describing: I really don't see the need for multiple queues except for two scenarios:

  • You want different priorities. The last thing you want is a high-priority message getting stuck behind hundreds of low-priority messages. Create a hi-pri queue for these.
  • The number of message reads+deletes is going to exceed the target of 500 transactions per second. In this case, create multiple queues, to spread the transaction volume across storage partitions (and a storage account will handle upwards of 5K transactions per second).

If you stick with a single queue (storage-based, not service bus), you can read blocks of messages at one time (up to 32). You can easily work up a format that helps you differentiate message type (maybe with a simple prefix). Then, just hand off the message to an appropriate thread for processing. Service Bus queues don't have multi-message reads, although they do allow for prefetch (which results in buffered messages being downloaded into a cache).
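The prefix idea can be sketched without any Azure SDK at all. The following is a minimal illustration in Python, assuming a pipe-delimited `TYPE|arg|arg` layout; the handler names, the `HANDLERS` table and the message format are all hypothetical, and a real worker would feed `process_batch` with whatever its queue client returns.

```python
# Hypothetical handlers; the names and the message layout are assumptions
# made for illustration, not part of any Azure API.
def render(src, dst):
    return f"rendered {src} -> {dst}"

def thumbnail(src, dst):
    return f"thumbnailed {src} -> {dst}"

# Maps the message-type prefix to the function that processes it.
HANDLERS = {"RENDER": render, "THUMBNAIL": thumbnail}

def dispatch(message: str) -> str:
    """Split 'TYPE|arg1|arg2' and route it to the handler registered for TYPE."""
    msg_type, *args = message.split("|")
    handler = HANDLERS.get(msg_type)
    if handler is None:
        raise ValueError(f"unknown message type: {msg_type}")
    return handler(*args)

def process_batch(batch):
    """Process one block of messages (a storage queue read returns up to 32)."""
    return [dispatch(m) for m in batch]
```

Adding a new message type then means adding one entry to the table rather than one more queue to poll.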

An advantage of one queue over many: you remove (or greatly reduce) the problem of "many queues having no messages, resulting in empty reads."

If you need more throughput, you can always crank up the number of threads doing the queue-reading and dispatching.

Remember that each delete is atomic; no batching. And as far as queue-polling goes: you're right to think about backoff. You don't need to back off after successfully reading a message (or chunk of messages). Just back off when you don't get anything after an attempt to read.
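As a rough sketch of that back-off rule (Python, not tied to any queue SDK): `receive` and `handle` are hypothetical callables supplied by the caller, the delay values are arbitrary, and `sleep` and `max_polls` are parameters only so the loop can be exercised without waiting.

```python
import time

def poll_with_backoff(receive, handle, min_delay=1.0, max_delay=60.0,
                      sleep=time.sleep, max_polls=None):
    """Poll receive(); sleep only after empty reads, doubling the delay each time."""
    delay = min_delay
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        batch = receive()          # assumed to return a (possibly empty) list
        if batch:
            for message in batch:
                handle(message)
            delay = min_delay      # work was found: reset and read again at once
        else:
            sleep(delay)           # nothing there: wait, then back off further
            delay = min(delay * 2, max_delay)
```

The cap (`max_delay`) bounds how stale the worker can get on a quiet queue; the reset after a successful read keeps it responsive during bursts.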

One nice advantage over Service Bus queues: Windows Azure queues provide you with an approximate message count (which is really helpful when considering scale-out to multiple instances). Service Bus queues don't provide this.

David Makogon
  • But how do you know what to do with the message when you receive it? Let's assume we have a queue that stores 3 types of object: Order, Customer, Product. We receive a 'Product' object. How do you know whether the product should be added, updated or deleted? I see no clean way to handle this, except by creating a queue for each purpose. – Leon Cullens Jun 10 '12 at 12:59
  • Just format the messages with some unique prefix. Your queue-reading code then looks at the prefix and decides what to do with each message. For instance: `RENDER|\raw\image1.jpg|\rendered\image1.jpg` and `THUMBNAIL|\raw\image1.jpg|\thumbs\image1.jpg`. Parse by delimiter `'|'`, check which message type it is, and pass it to the appropriate thread. Note: Queue messages are binary or string. Come up with whatever format you want. Just giving a simple example. – David Makogon Jun 10 '12 at 13:12
  • Hmm, that feels very very very dirty. I'd hoped for a cleaner solution, strange that there (apparently) isn't one. – Leon Cullens Jun 10 '12 at 13:28
  • A BrokeredMessage has a ContentType property. Messages could be of types CreateOrder, UpdateOrder, DeleteOrder, CreateCustomer, UpdateCustomer etc... – Steven T. Cramer Jul 12 '14 at 14:15
2

An alternate strategy would be to use a single queue, or fewer queues, such that a queue can support more than one type of message. This approach is easier to manage and cheaper, if your system architecture can support it.

In the real world I have successfully used multiple queues (for scalability purposes), with each queue read on a separate thread triggered by a timer event. Depending on the load on the queue and the application's needs, the timer interval was adjusted dynamically to service the queue more or less often.

hocho
  • What kind of data did you put in the queue (i.e. how did your worker know what kind of type it would get out of the queue if you have all different types mixed together)? – Leon Cullens Jun 09 '12 at 22:53
  • If you use Service Bus Queues you're working with a BrokeredMessage. This BrokeredMessage object has a property called Properties to which you can add custom information which you could use when receiving the message. – Sandrino Di Mattia Jun 09 '12 at 22:59
  • Basically, using self-describing messages... I used an envelope class which wrapped the actual message object, something like class Envelope. The envelope contained the actual message object and its type name, and was serialized to XML and placed in the queue. On reading, the XML was parsed to get the type name, then deserialized to the actual message and dispatched to the message handler. Rather involved, but simpler approaches can be used instead. – hocho Jun 09 '12 at 23:02
  • Sandrino’s comment on using BrokeredMessage with Service Bus Queues is also a very viable strategy. – hocho Jun 09 '12 at 23:10
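For readers who want to see the self-describing message idea from these comments in concrete form, here is a minimal Python sketch. It swaps in JSON where the comment above describes XML, and the message classes (`CreateOrder`, `UpdateCustomer`), the `MESSAGE_TYPES` registry and the `wrap`/`unwrap` helpers are all hypothetical names chosen for illustration.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class CreateOrder:
    order_id: int

@dataclass
class UpdateCustomer:
    customer_id: int
    email: str

# Registry of known message types, keyed by type name (hypothetical types).
MESSAGE_TYPES = {cls.__name__: cls for cls in (CreateOrder, UpdateCustomer)}

def wrap(message) -> str:
    """Serialize a message inside an envelope that records its type name."""
    return json.dumps({"type": type(message).__name__, "body": asdict(message)})

def unwrap(raw: str):
    """Read the type name from the envelope and rebuild the original message."""
    envelope = json.loads(raw)
    cls = MESSAGE_TYPES[envelope["type"]]
    return cls(**envelope["body"])
```

The receiving side only ever calls `unwrap` and then dispatches on the type of the returned object, so mixed message types can share one queue.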
1

If a back-off mechanism on storage queues isn't sufficient for you I suggest you consider Service Bus Queues. With Service Bus Queues you won't have to do such aggressive polling.

You would still need to implement a loop for polling the queue, but the receive timeout makes it lighter than the constant polling you'd have when using storage queues.

In the following example I try to receive a message from the queue. If no message is found it will keep the connection open for 30 seconds to see if anything new comes in. If no message arrived after 30 sec, the Receive method will return null (and I would have a loop trying to call Receive again). Note that the maximum timeout is 24 days.

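    // Requires: using Microsoft.ServiceBus; using Microsoft.ServiceBus.Messaging;
    MessagingFactory factory = MessagingFactory.Create(
        ServiceBusEnvironment.CreateServiceUri("sb", ServiceNamespace, string.Empty), credentials);
    QueueClient myQueueClient = factory.CreateQueueClient("TestQueue");
    // Waits up to 30 seconds for a message; returns null if none arrives.
    BrokeredMessage message = myQueueClient.Receive(TimeSpan.FromSeconds(30));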

Spinning up a thread for each queue you want to read from is a good idea, but given the capacity limitations of the CLR thread pool you should also consider receiving messages asynchronously (using TaskFactory.FromAsync, for example): http://msdn.microsoft.com/en-us/library/windowsazure/hh851744.aspx
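To illustrate why asynchronous receives scale to many queues without a thread per queue, here is a loose analogue in Python's asyncio rather than the .NET TaskFactory.FromAsync API. In-memory `asyncio.Queue` objects stand in for Service Bus queues, and `receive_loop`, `run` and the timeout value are hypothetical names and numbers for this sketch only.

```python
import asyncio

async def receive_loop(name, queue, handle, idle_timeout):
    """Wait on one queue; stop after idle_timeout seconds without a message."""
    while True:
        try:
            message = await asyncio.wait_for(queue.get(), timeout=idle_timeout)
        except asyncio.TimeoutError:
            return  # a real worker would loop and call receive again instead
        handle(name, message)

async def run(queues, handle, idle_timeout=0.1):
    """Run one receive loop per queue concurrently on a single thread."""
    await asyncio.gather(*(receive_loop(name, queue, handle, idle_timeout)
                           for name, queue in queues.items()))
```

Each pending receive costs a small coroutine object instead of a blocked thread, so going from 20 queues to 200 changes the size of the dictionary passed to `run`, not the number of threads.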

Sandrino Di Mattia
  • My question wasn't really about the back-off mechanism, but about the "reading from multiple queues"-part. Do you have an example of how one would implement this with FromAsync? And what about when we have 200 queues instead of 20? How do you solve that without creating too many threads? – Leon Cullens Jun 09 '12 at 22:36
  • I see now that I didn't look at your example closely enough. I thought it was just about the TPL, but it's about Azure. Reading it now :) – Leon Cullens Jun 09 '12 at 22:39