
I'm running Service Bus on Azure, pumping about 10-100 messages per second.

Recently I switched to .NET 4.5 and, all excited, refactored all the code to have 'async' and 'await' at least twice on each line to make sure it's done 'properly' :)

Now I'm wondering whether it's actually for better or for worse. If you could have a look at the code snippets and let me know what your thoughts are. I'm especially worried that the thread context switching is giving me more grief than benefit from all the asynchrony... (looking at !dumpheap, it's definitely a factor)

Just a bit of description: I will be posting two methods - one that runs a while loop on a ConcurrentQueue, waiting for new messages, and another that sends one message at a time. I'm also using the Transient Fault Handling block exactly as Dr. Azure prescribed.

Sending loop (started at the beginning, waiting for new messages):

private async void SendingLoop()
    {
        try
        {
            await this.RecreateMessageFactory();

            this.loopSemaphore.Reset();
            Buffer<SendMessage> message = null;

            while (true)
            {
                if (this.cancel.Token.IsCancellationRequested)
                {
                    break;
                }
                this.semaphore.WaitOne();
                if (this.cancel.Token.IsCancellationRequested)
                {
                    break;
                }

                while (this.queue.TryDequeue(out message))
                {                       
                    try
                    {
                        using (message)
                        {
                            // only send the latest message
                            if (!this.queue.IsEmpty)
                            {
                                this.Log.Debug("Skipping queued message, Topic: " + message.Value.Topic);
                                continue;
                            }
                            else
                            {
                                if (this.Topic == null || this.Topic.Path != message.Value.Topic)
                                    await this.EnsureTopicExists(message.Value.Topic, this.cancel.Token);

                                if (this.cancel.Token.IsCancellationRequested)
                                    break;
                                await this.SendMessage(message, this.cancel.Token);
                            }
                        }
                    }
                    catch (OperationCanceledException)
                    {
                        break;
                    }
                    catch (Exception ex)
                    {
                        ex.LogError();
                    }
                }
            }
        }
        catch (OperationCanceledException)
        { }
        catch (Exception ex)
        {
            ex.LogError();
        }
        finally
        {
            if (this.loopSemaphore != null)
                this.loopSemaphore.Set();
        }
    }

Sending a message:

private async Task SendMessage(Buffer<SendMessage> message, CancellationToken cancellationToken)
    {
        //this.Log.Debug("MessageBroadcaster.SendMessage to " + this.GetTopic());
        bool entityNotFound = false;

        if (this.MessageSender.IsClosed)
        {
            //this.Log.Debug("MessageBroadcaster.SendMessage MessageSender closed, recreating " + this.GetTopic());
            await this.EnsureMessageSender(cancellationToken);
        }

        try
        {
            await this.sendMessageRetryPolicy.ExecuteAsync(async () =>
            {
                message.Value.Body.Seek(0, SeekOrigin.Begin);
                using (var msg = new BrokeredMessage(message.Value.Body, false))
                {
                    await Task.Factory.FromAsync(this.MessageSender.BeginSend, this.MessageSender.EndSend, msg, null);
                }
            }, cancellationToken);
        }
        catch (MessagingEntityNotFoundException)
        {
            entityNotFound = true;                
        }
        catch (OperationCanceledException)
        { }
        catch (ObjectDisposedException)
        { }
        catch (Exception ex)
        {
            ex.LogError();
        }

        if (entityNotFound)
        {
            if (!cancellationToken.IsCancellationRequested)
            {
                await this.EnsureTopicExists(message.Value.Topic, cancellationToken);
            }
        }
    }

The code above is from a 'Sender' class that sends 1 message/second. I have about 50-100 instances running at any given time, so it could be quite a number of threads.

Btw, don't worry about EnsureMessageSender, RecreateMessageFactory and EnsureTopicExists too much; they are not called that often.

Would I not be better off just having one background thread working through the message queue and sending messages synchronously, provided all I need is to send one message at a time? I could skip the async stuff altogether and avoid the overhead that comes with it.

Note that it's usually a matter of milliseconds to send one message to Azure Service Bus; it's not really expensive. (Except when it's slow, times out, or there is a problem with the Service Bus backend, in which case it can hang for a while trying to send.)

Thanks and sorry for the long post,

Stevo

Proposed Solution

Would this example be a solution to my situation?

static void Main(string[] args)
    {
        var broadcaster = new BufferBlock<int>(); //queue
        var cancel = new CancellationTokenSource();

        var run = Task.Run(async () =>
        {
            try
            {
                while (true)
                {
                    //check if we are not finished
                    if (cancel.IsCancellationRequested)
                        break;                       

                    //async wait until a value is available
                    var val = await broadcaster.ReceiveAsync(cancel.Token).ConfigureAwait(false);
                    int next = 0;

                    // greedy - consume all queued values, keeping only the last
                    while (broadcaster.TryReceive(out next))
                    {
                        Console.WriteLine("Skipping " + val);
                        val = next;
                    }

                    //check if we are not finished
                    if (cancel.IsCancellationRequested)
                        break;

                    Console.WriteLine("Sending " + val);

                    //simulate sending delay
                    await Task.Delay(1000).ConfigureAwait(false); 

                    Console.WriteLine("Value sent " + val);                        
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex);
            }

        }, cancel.Token);

        // simulate sending messages, one every 200 ms
        for (int i = 0; i < 20; i++)
        {
            Console.WriteLine("Broadcasting " + i);
            broadcaster.Post(i);
            Thread.Sleep(200);
        }

        cancel.Cancel();
        run.Wait();
    }
user1275154
  • If you have just a few threads running, sync will generally give you more throughput than async because async has additional overheads. Its power only becomes visible at high concurrency. – usr Mar 25 '13 at 10:38
  • The code above is from a 'Sender' class that sends 1 message/second. I have about 50-100 instances running at any given time, so it could be quite a number of threads. – user1275154 Mar 25 '13 at 10:49

2 Answers


You say:

The code above is from a 'Sender' class that sends 1 message/second. I have about 50-100 instances running at any given time, so it could be quite a number of threads.

This is a good case for async. You save lots of threads here. Async reduces context switching because it is not thread-based. When something requires a wait, it does not context-switch; instead, the next work item is processed on the same thread (if there is one).

For that reason your async solution will definitely scale better than a synchronous one. Whether it actually uses less CPU at 50-100 instances of your workflow needs to be measured. The more instances there are, the more likely async is to be faster.

Now, there is one problem with the implementation: you're using a ConcurrentQueue, which is not async-ready. So you actually do use 50-100 threads even in your async version. They will either block (which you wanted to avoid) or busy-wait, burning 100% CPU (which seems to be the case in your implementation!). You need to get rid of this problem and make the queuing async, too. Maybe a SemaphoreSlim is of help here, as it can be waited on asynchronously.
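As an illustration only (the class and member names are made up, not from the question's code), a minimal sketch of "making the queuing async": pair the existing `ConcurrentQueue` with a `SemaphoreSlim` whose count always equals the number of queued items, so a successful `WaitAsync` guarantees `TryDequeue` will find something:

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: an async-ready producer/consumer queue.
class AsyncQueue<T>
{
    private readonly ConcurrentQueue<T> queue = new ConcurrentQueue<T>();
    private readonly SemaphoreSlim available = new SemaphoreSlim(0);

    public void Enqueue(T item)
    {
        this.queue.Enqueue(item);
        this.available.Release(); // one permit per queued item
    }

    public async Task<T> DequeueAsync(CancellationToken token)
    {
        // Waits without blocking a thread; the continuation resumes
        // once a producer calls Release.
        await this.available.WaitAsync(token);
        T item;
        this.queue.TryDequeue(out item);
        return item;
    }
}
```

The sending loop could then `await DequeueAsync(...)` instead of `semaphore.WaitOne()`, so no thread is held up while the queue is empty.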

usr
  • Thanks for the tip! I've had a look at SemaphoreSlim and I guess I could use WaitAsync, but I'm nervous about a couple of things: 1) it's count-based, and I found some posts about people having problems with the right way of releasing the semaphore; 2) its main purpose is for cases where you wait a very short time, although I have not found anywhere what "very short" means. I assume it's much less than 1 second, which is my typical use case. Would be cool if ManualResetEventSlim had an async wait API... – user1275154 Mar 25 '13 at 22:38
  • 1) is an issue of discipline and applies to anything else, too. 2) It performs best with low contention and low waits but will perform fine even in the face of contention. Your #1 priority is to get rid of the blocking of 50-100 threads. They're wasted waiting on locks. – usr Mar 25 '13 at 22:48
  • 1) agree. The Dataflow BufferBlock<> seems to serve well as both queue and semaphore, so I'll go for the 'simpler' option for now. However, I don't fully understand what you mean by 'They will either block (which you wanted to avoid) or busy-wait burning 100% CPU (which seems to be the case in your implementation!)'. I suppose this refers to the line where I'm waiting for the semaphore. I did some testing, and waiting on a ManualResetEvent doesn't seem to take any processor time at all; can you elaborate on what could be taking 100% CPU? Thanks – user1275154 Mar 26 '13 at 09:15
  • I probably just misread the code. It looked like a fast-running `while(true)` without any waiting. But there's the semaphore so no problem there (except for the blocking which is holding up the thread because it can't get to the async point). – usr Mar 26 '13 at 09:57
  • After problems with BufferBlock I've decided to try SemaphoreSlim, which appears to work well, thanks for the tip. Hopefully ManualResetEventSlim will get an async API in the future, since that's a better fit for my situation. – user1275154 Mar 28 '13 at 12:02

First, keep in mind that Task != Thread. Tasks (and async method continuations) are scheduled to the thread pool, where Microsoft has put in tons of optimizations that work wonders as long as your tasks are fairly short.

Reviewing your code, one line raises a flag: semaphore.WaitOne. I assume you're using this as a kind of signal that there is data available in the queue. This is bad because it's a blocking wait inside an async method. By using a blocking wait, the code changes from a lightweight continuation into a much heavier thread pool thread.
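To make the difference concrete, here's a small self-contained sketch (not the question's code; `signal` is a stand-in for the "data available" signal) contrasting the two wait styles:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class WaitStyles
{
    static void Main()
    {
        var signal = new SemaphoreSlim(0);

        // Async wait: no thread is parked here; a continuation is queued
        // to run once the semaphore is released.
        Task consumer = Task.Run(async () =>
        {
            await signal.WaitAsync();
            Console.WriteLine("resumed without holding a thread");
        });

        // The blocking style flagged above would instead be signal.Wait(),
        // pinning a thread-pool thread for the entire duration of the wait.

        signal.Release();
        consumer.Wait();
    }
}
```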

So, I would follow @usr's recommendation and replace the queue (and the semaphore) with an async-ready queue. TPL Dataflow's BufferBlock<T> is an async-ready producer/consumer queue available via NuGet. I recommend this one first because it sounds like your project could benefit from using dataflow more extensively than just as a queue (but the queue is a fine place to start).

Other async-ready data structures exist; my AsyncEx library has a couple of them. It's also not hard to build a simple one yourself; I have a blog post on the subject. But I recommend TPL Dataflow in your situation.

Stephen Cleary
  • Thanks, I'll give it a try. Before I implemented this, I did a simple POC of pure async vs TPL Dataflow, and when it came to lots of small tasks, Dataflow was quite a bit slower (many times slower). Hence I went for the 'pure low-level' do-it-myself async solution. – user1275154 Mar 25 '13 at 18:01
  • Dataflow does have much more overhead. But if you're going to do your own `async` solution, at least take out the (synchronous) blocking. – Stephen Cleary Mar 25 '13 at 18:27
  • Thanks. Please have a look at the proposed solution using BufferBlock<> – user1275154 Mar 25 '13 at 22:37
  • That solution looks pretty good. I also encourage you to take a look at the other blocks to see if you can have more of your solution use dataflow. – Stephen Cleary Mar 26 '13 at 14:01
  • Problem - the await BufferBlock.ReceiveAsync(this.cancel.Token); gets stuck (deadlocked?)! Even when posting more messages to the block (from a different thread), the await on ReceiveAsync never comes back. The BufferBlock.Count just grows, but nothing gets processed. Any ideas how this could be happening? So far I have not been able to reproduce it in a small example, but it happens quite a lot in the production code. – user1275154 Mar 27 '13 at 09:41
  • No, that's not expected. If you have multiple consumers, only one of them would get it, but one of them *should* get it. Unless you're blocking somewhere (`Wait` or `Result`), there shouldn't be a deadlock. – Stephen Cleary Mar 27 '13 at 11:55
  • I have opened an issue on http://social.msdn.microsoft.com/Forums/en-US/tpldataflow/thread/33c2ef9e-fce6-45f3-b2c7-8a82878115ec and at this point my best guess is that there is a deadlock, so I've refactored my code to use ConcurrentStack<> & SemaphoreSlim, which seems to work well. – user1275154 Mar 28 '13 at 12:04