Handling the Dead Letter Queue when message processing sequence is crucial?

Question

We have a subscription on Azure Service Bus which will receive a high throughput of messages.

Each message contains a CustomerId, and it is crucial that messages for a customer are consumed in the order they were published. We achieve this by setting CustomerId as the SessionId, which prevents competing consumers from processing them out of sequence.

The problem arises when we can't process a message, for example because a dependency is down or the message contents are corrupt. In that case, the message is moved to the DLQ after a few attempts.

Now things are complicated, because no new messages for that customer can be processed until the DLQ is dealt with. If we processed them, we'd be doing so out of sequence, which would corrupt the state of our system. But how can the consumer know that some messages for this customer have been 'missed' (and are sitting unprocessed on the DLQ)?

To achieve this I've proposed we give each message a CustomerSequenceId, which the consumer checks to confirm it's processing the next expected message for that customer (by checking that the message's CustomerSequenceId equals LastProcessedCustomerSequenceId + 1). If this is not the case, it moves the message onto the DLQ.
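
A minimal sketch of the check proposed above, written as plain Python with no Service Bus calls so the logic can be read in isolation. The names `Message`, `SequenceGate`, `last_processed`, and `dead_letters` are hypothetical, the in-memory dict stands in for wherever LastProcessedCustomerSequenceId would really be persisted, and the sketch assumes sequence numbers start at 1 for each customer.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Message:
    customer_id: str
    customer_sequence_id: int
    body: str


@dataclass
class SequenceGate:
    # Hypothetical in-memory stand-ins; a real consumer would persist this state
    # and dead-letter through the broker instead of appending to a list.
    last_processed: Dict[str, int] = field(default_factory=dict)
    dead_letters: List[Message] = field(default_factory=list)
    processed: List[Message] = field(default_factory=list)

    def handle(self, msg: Message) -> bool:
        """Process msg only if it is the next expected one for its customer."""
        # Assumption: the first message for a customer has sequence id 1.
        expected = self.last_processed.get(msg.customer_id, 0) + 1
        if msg.customer_sequence_id != expected:
            # An earlier message is missing (presumably on the DLQ), so this
            # one is set aside too rather than processed out of order.
            self.dead_letters.append(msg)
            return False
        self.processed.append(msg)
        self.last_processed[msg.customer_id] = msg.customer_sequence_id
        return True
```

For example, if message 2 for a customer failed and was dead-lettered elsewhere, `handle` accepts message 1, then rejects message 3 and sets it aside, while other customers are unaffected.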

The messages on the DLQ are then manually checked and republished when appropriate. As long as they are published in order (and they aren't poison messages), they'll be consumed in the expected sequence and the system should recover.
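
The republish step could be sketched roughly as follows, again in plain Python with no Service Bus calls. `DeadLetter` and `plan_republish` are hypothetical names, and the sketch assumes the operator has already removed or repaired poison messages and knows the last processed sequence number per customer. It orders each customer's dead-lettered messages by CustomerSequenceId and stops at the first gap, so nothing is republished ahead of a message that is still missing.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class DeadLetter(NamedTuple):
    customer_id: str
    customer_sequence_id: int
    body: str


def plan_republish(
    dead_letters: Iterable[DeadLetter],
    last_processed: Dict[str, int],
) -> List[DeadLetter]:
    """Return the dead-lettered messages that are safe to republish, in order."""
    by_customer = defaultdict(list)
    for msg in dead_letters:
        by_customer[msg.customer_id].append(msg)

    plan: List[DeadLetter] = []
    for customer_id in sorted(by_customer):
        # Assumption: sequence ids start at 1 for each customer.
        expected = last_processed.get(customer_id, 0) + 1
        ordered = sorted(by_customer[customer_id], key=lambda m: m.customer_sequence_id)
        for msg in ordered:
            if msg.customer_sequence_id != expected:
                break  # gap or duplicate: leave the rest on the DLQ for now
            plan.append(msg)
            expected += 1
    return plan
```

Messages left out of the plan stay where they are until the missing predecessor turns up.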

This should work, but it all feels a little complicated. Have I missed any core features which would help?
