I have got a WebJob with the following ServiceBus handler using the WebJobs SDK:
    [Singleton("{MessageId}")]
    public static async Task HandleMessagesAsync(
        [ServiceBusTrigger("%QueueName%")] BrokeredMessage message,
        [ServiceBus("%QueueName%")] ICollector<BrokeredMessage> queue,
        TextWriter logger)
    {
        using (var scope = Program.Container.BeginLifetimeScope())
        {
            var handler = scope.Resolve<MessageHandlers>();
            logger.WriteLine(AsInvariant($"Handling message with label {message.Label}"));

            // To avoid coupling Microsoft.Azure.WebJobs the return type is IEnumerable<T>
            var outputMessages = await handler.OnMessageAsync(message).ConfigureAwait(false);
            foreach (var outputMessage in outputMessages)
            {
                queue.Add(outputMessage);
            }
        }
    }
If the prerequisites for the handler aren't fulfilled, outputMessages contains a BrokeredMessage with the same MessageId, Label and payload as the one we are currently handling, but with a ScheduledEnqueueTimeUtc in the future.
The idea is that we complete the handling of the current message quickly and wait for a retry by scheduling the new message in the future.
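For illustration, a retry message of this kind could be built roughly as follows. This is only a sketch: RetryMessages, CreateRetry and the delay parameter are hypothetical names, and it assumes the copy is made with BrokeredMessage.Clone(), which may differ from what handler.OnMessageAsync actually does.

```csharp
using System;
using Microsoft.ServiceBus.Messaging;

public static class RetryMessages
{
    // Hypothetical helper, not part of the WebJobs SDK.
    public static BrokeredMessage CreateRetry(BrokeredMessage current, TimeSpan delay)
    {
        // Clone() produces a new, sendable message that keeps the body,
        // MessageId, Label and custom properties of the original.
        var retry = current.Clone();

        // Make the copy visible to receivers no earlier than `delay` from now.
        retry.ScheduledEnqueueTimeUtc = DateTime.UtcNow.Add(delay);
        return retry;
    }
}
```

Because the copy keeps the MessageId, every handler invocation that takes this path adds one more message with the same MessageId but a new SequenceNumber, unless duplicate detection is enabled on the queue.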
Sometimes, especially when there are more messages in the queue than the SDK peek-locks, I see messages duplicating in the ServiceBus queue. They have the same MessageId, Label and payload, but a different SequenceNumber, EnqueuedTimeUtc and ScheduledEnqueueTimeUtc. They all have a delivery count of 1.
Looking at my handler code, the only way this can happen is if I receive the same message multiple times and each time figure out that I need to wait and create a new message for handling in the future. The handler finishes successfully, so the original message gets completed.
The initial messages are unique. I also put the SingletonAttribute on the message handler, so that messages with the same MessageId cannot be consumed by different handlers.
Why are multiple handlers triggered with the same message and how can I prevent that from happening?
I am using Microsoft.Azure.WebJobs v2.1.0.
My handlers take at most 17 s and 1 s on average. The lock duration is 1 minute. Still, my best theory is that something with the message (re)locking doesn't work, so while the handler is processing, the lock gets lost, the message goes back to the queue and gets consumed a second time. If both handler invocations saw that the critical resource was still occupied, they would both enqueue a new message.
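One way to check this theory would be to log the delivery state of each incoming message at the start and end of the handler. The sketch below assumes nothing beyond the standard BrokeredMessage properties; DeliveryDiagnostics and Describe are hypothetical names. If the same SequenceNumber shows up in two invocations, or DeliveryCount is greater than 1 on an incoming message, the original was redelivered, which would point at a lost lock.

```csharp
using System;
using System.Globalization;
using Microsoft.ServiceBus.Messaging;

public static class DeliveryDiagnostics
{
    // Hypothetical helper: formats the values that identify one delivery.
    public static string Describe(string stage, string messageId, long sequenceNumber,
        int deliveryCount, DateTime lockedUntilUtc, DateTime nowUtc)
    {
        // Time left before the peek-lock expires; negative means it already has.
        var remaining = lockedUntilUtc - nowUtc;
        return string.Format(CultureInfo.InvariantCulture,
            "{0}: id={1} seq={2} deliveryCount={3} lockRemaining={4:F0}s",
            stage, messageId, sequenceNumber, deliveryCount, remaining.TotalSeconds);
    }

    // Convenience overload for a message received in peek-lock mode;
    // LockedUntilUtc is only meaningful for such messages.
    public static string Describe(string stage, BrokeredMessage message)
    {
        return Describe(stage, message.MessageId, message.SequenceNumber,
            message.DeliveryCount, message.LockedUntilUtc, DateTime.UtcNow);
    }
}
```

In the handler this could be called as logger.WriteLine(DeliveryDiagnostics.Describe("start", message)) before handler.OnMessageAsync and again with "end" after the loop.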