7

I want to implement a very simple behavior in my Azure Function: if there is an exception during handling, I want to postpone the next retry for some time. As far as I know, Service Bus itself has no direct way to do that (short of scheduling a new message), but the Service Bus trigger offers ExponentialBackoffRetry.
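For reference, the kind of setup I mean looks roughly like this with the in-process C# model (a sketch only; queue name, connection setting, and intervals are placeholders):

```csharp
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class OrderProcessor
{
    // Retry the whole function up to 5 times, waiting an exponentially growing
    // interval between 4 seconds and 15 minutes before each attempt.
    [FunctionName("ProcessOrder")]
    [ExponentialBackoffRetry(5, "00:00:04", "00:15:00")]
    public static void Run(
        [ServiceBusTrigger("orders", Connection = "ServiceBusConnection")] string message,
        ILogger log)
    {
        log.LogInformation("Processing {Message}", message);
        // Any unhandled exception here is supposed to trigger the function-level retry policy.
    }
}
```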

I have not found any documentation on how that works with regard to the Service Bus connection, i.e. what happens to the message after the function execution fails.

I suppose one possible way is to keep the message in the Functions infrastructure and keep renewing the lock for the duration. Some more practical questions I am wondering about:

  1. How long can the backoff retry run? E.g. if I want retries spread over up to 7 days, will that work?
  2. What happens when the host is reset/restarted/scaled? Do I lose this backoff due to implementation details, or is it still maintained somehow?
Ilya Chernomordik
  • Are you using Azure Functions 5.x? If yes, you can define options for the retry settings. Check the official documentation [here](https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-service-bus#retry-settings). The retry setting options include a way to configure exponential retry. – user1672994 Jun 07 '21 at 12:24

3 Answers

7

From the documentation:

Using retry support on top of trigger resilience

The function app retry policy is independent of any retries or resiliency that the trigger provides. The function retry policy will only layer on top of a trigger resilient retry. For example, if using Azure Service Bus, by default queues have a message delivery count of 10. The default delivery count means after 10 attempted deliveries of a queue message, Service Bus will dead-letter the message. You can define a retry policy for a function that has a Service Bus trigger, but the retries will layer on top of the Service Bus delivery attempts.

For instance, if you used the default Service Bus delivery count of 10 and defined a function retry policy of 5, the message would first dequeue, incrementing the Service Bus delivery count to 1. If every execution failed, after five attempts to trigger the same message, that message would be marked as abandoned. Service Bus would immediately requeue the message, it would trigger the function, and increment the delivery count to 2. Finally, after 50 eventual attempts (10 Service Bus deliveries * five function retries per delivery), the message would be abandoned and trigger a dead-letter on Service Bus.

For the exponential retries, you likely need to keep the total backoff time plus processing time below how long the function can hold on to the message, or else the lock will expire and even successful processing will result in an exception and a retry.

Given the way Service Bus locks messages today, exponential backoff on top of Azure Service Bus is not a great idea. Once durable terminus support is available (unlimited lock time without the need to renew), this will make much more sense.
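For context, outside of Functions the plain Azure.Messaging.ServiceBus SDK exposes the bound on how long a message can be held with automatic lock renewal as `MaxAutoLockRenewalDuration`; the Functions Service Bus extension has a similar host.json setting. A minimal sketch, with connection string and queue name as placeholders:

```csharp
using System;
using Azure.Messaging.ServiceBus;

var client = new ServiceBusClient("<connection-string>");

// The processor renews the lock automatically, but only for as long as
// MaxAutoLockRenewalDuration allows; any backoff plus processing that runs
// past this window loses the lock, and settling the message will then fail.
ServiceBusProcessor processor = client.CreateProcessor("orders", new ServiceBusProcessorOptions
{
    MaxAutoLockRenewalDuration = TimeSpan.FromMinutes(10)
});
```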

Update: Functions retry feature is being deprecated.

Sean Feldman
  • Thanks a lot for the details, so I guessed right that the function host does automatic renewal of the lock for as long as possible? (By the way, I could not find how long that would be.) – Ilya Chernomordik Jun 08 '21 at 07:27
  • Yes. Though keep in mind that it is limited to however long the Function can run. On the Premium plan it's likely different. – Sean Feldman Jun 08 '21 at 08:20
  • Are there any limitations in Service Bus on how long one can renew the lock, or is it infinite? – Ilya Chernomordik Jun 08 '21 at 11:01
  • The hard limit on the lock duration is 5 minutes; with renewals it is effectively infinite. But I'd not count on that, as renewal is a client-initiated request that can fail. – Sean Feldman Jun 09 '21 at 08:24
  • Thanks, I already figured (from your answer) that we cannot fully count on this exponential backoff attribute, so I guess one way to solve this problem is to add a new message with a schedule, which is quite cumbersome. It would be nice if Service Bus had another solution for postponing message handling, as right now it can burn through all 10 max attempts in a matter of seconds if they all fail rapidly. – Ilya Chernomordik Jun 09 '21 at 11:19
  • There is. It's an abstraction on top of Functions and ASB called NServiceBus. That's what it does, plus more. https://docs.particular.net/nservicebus/hosting/azure-functions/service-bus – Sean Feldman Jun 09 '21 at 14:36
  • Note that MS is going to remove this feature in a few months. – Martin Wickman Jun 20 '22 at 07:54
  • Yeah, I've added a note. Thank you. I wish they'd keep it for HTTP triggers. While one can roll their own, this should be part of the SDK. – Sean Feldman Jun 20 '22 at 15:07
3

It doesn't, because that feature is soon to be removed from pretty much all triggers. At least that's how I read the newly updated documentation:

IMPORTANT: The retry policy support in the runtime for triggers other than Timer and Event Hubs is being removed after this feature becomes generally available (GA). Preview retry policy support for all triggers other than Timer and Event Hubs will be removed in October 2022.

As it stands now, your only option is to implement the retry logic yourself. It's fairly easy to do a basic retry + sleep loop in your code, and you can leverage something like Polly to make it more robust. Just be wary about timeout issues in your function.
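For example, a minimal sketch of such a loop using Polly (retry counts, delays, and the downstream call are placeholders; keep the total delay well under the function timeout and the message lock window):

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;
using Polly;

public static class ResilientProcessor
{
    private static readonly HttpClient Http = new HttpClient();

    [FunctionName("ResilientProcessor")]
    public static async Task Run(
        [ServiceBusTrigger("orders", Connection = "ServiceBusConnection")] string message,
        ILogger log)
    {
        // Retry the downstream call up to 4 times with exponential backoff
        // (2, 4, 8, 16 seconds) before letting the exception bubble up.
        var policy = Policy
            .Handle<HttpRequestException>()
            .WaitAndRetryAsync(4,
                attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)),
                (ex, delay) => log.LogWarning(ex, "Retrying in {Delay}", delay));

        await policy.ExecuteAsync(() =>
            Http.PostAsync("https://example.invalid/api/orders", new StringContent(message)));
    }
}
```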

Another approach is to use scheduled messages: publish the failing message to the queue again, giving it a datetime when it should appear, add some kind of custom "retry count" property that you increase each time you publish it, and manually dead-letter it when it has failed a number of times.
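A rough sketch of that pattern with the `Azure.Messaging.ServiceBus` types (the `retry-count` property name, the delays, and the output binding wiring are assumptions, not an established convention):

```csharp
using System;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.ServiceBus;
using Microsoft.Extensions.Logging;

public static class ScheduledRetryProcessor
{
    private const int MaxAttempts = 5;

    [FunctionName("ScheduledRetryProcessor")]
    public static async Task Run(
        [ServiceBusTrigger("orders", Connection = "ServiceBusConnection")] ServiceBusReceivedMessage message,
        ServiceBusMessageActions actions,
        [ServiceBus("orders", Connection = "ServiceBusConnection")] IAsyncCollector<ServiceBusMessage> output,
        ILogger log)
    {
        try
        {
            await HandleAsync(message); // your actual processing
        }
        catch (Exception ex)
        {
            var attempt = message.ApplicationProperties.TryGetValue("retry-count", out var value)
                ? (int)value
                : 0;

            if (attempt >= MaxAttempts)
            {
                // Give up and move the message to the dead-letter queue.
                await actions.DeadLetterMessageAsync(message, "Exceeded custom retry count");
                return;
            }

            // Clone the message, bump the counter, and schedule the copy for later;
            // returning normally lets the original be completed as usual.
            var clone = new ServiceBusMessage(message);
            clone.ApplicationProperties["retry-count"] = attempt + 1;
            clone.ScheduledEnqueueTime = DateTimeOffset.UtcNow.AddMinutes(Math.Pow(2, attempt));
            await output.AddAsync(clone);

            log.LogWarning(ex, "Attempt {Attempt} failed, message rescheduled", attempt + 1);
        }
    }

    private static Task HandleAsync(ServiceBusReceivedMessage message) => Task.CompletedTask; // placeholder
}
```

Note that the clone is scheduled before the original is settled, so a crash in between can produce a duplicate; the handler should be idempotent.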

Martin Wickman
  • I just read this as well and did not really understand what is coming to replace it. From what I see it is just going to be removed? – Ilya Chernomordik Jun 20 '22 at 07:32
  • That's my impression as well. – Martin Wickman Jun 20 '22 at 07:53
  • Service Bus lacks the ability to delay a message out of the box, so the only alternative to these attributes is that, if there is an error, the message gets redelivered from Service Bus all but instantly up to 10 times in a row, not allowing for any backoff. I still cannot understand why this is not an implemented feature in Service Bus. – Ilya Chernomordik Jun 20 '22 at 08:05
  • Yes, true, I agree and it sucks. But the "old" retry trigger worked by using sleep retries and renewing the lock. I ranted about it [here](https://github.com/MicrosoftDocs/azure-docs/issues/90225#issuecomment-1158961056). – Martin Wickman Jun 20 '22 at 08:41
  • For the ASB trigger, immediate retries were pretty much the only option. Any back-off would "eat" into the lock time. There's a [feature request](https://github.com/Azure/azure-service-bus/issues/464) to abandon a message, sending it to the end of the queue, that would help. What will really help, and where a delay between retries would be ok, is the **new feature** mentioned [here](https://github.com/Azure/azure-service-bus/issues/470#issuecomment-1160354497). But there's another wrinkle - the Functions Isolated Worker SDK, which is going to replace the older In-Process SDK, is extremely limited when it come – Sean Feldman Jun 20 '22 at 14:43
2

The retry options apply to a single service operation performed by the Service Bus SDK and are intended to allow the SDK to work around short-term transient issues, like the occasional network interruption. Other than configuring the SDK clients, the Functions infrastructure is unaware of these retries and simply sees the SDK taking longer to perform the requested read/publish operation.
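To illustrate, those settings map to the retry options on the SDK client itself; a sketch with the plain SDK (values are examples only):

```csharp
using System;
using Azure.Messaging.ServiceBus;

// These retries cover individual SDK operations (send, receive, lock renewal, ...)
// against transient failures; they do not rerun your function code.
var client = new ServiceBusClient("<connection-string>", new ServiceBusClientOptions
{
    RetryOptions = new ServiceBusRetryOptions
    {
        Mode = ServiceBusRetryMode.Exponential,
        MaxRetries = 3,
        Delay = TimeSpan.FromSeconds(0.8),
        MaxDelay = TimeSpan.FromSeconds(60),
        TryTimeout = TimeSpan.FromSeconds(60)
    }
});
```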

The Functions infrastructure will apply any execution time limits imposed by the runtime or may decide to take action to guard against an unresponsive service operation. (disclaimer: I can speak to the Service Bus SDK, but don't have deep insight into the Functions runtime)

The retries from the Service Bus extensions aren't applied to your Function code; on an error in your code you'll end up in an exception scenario and, depending on configuration and trigger/binding use, will either see your message abandoned or the lock held until timeout.

I'm not sure of your exact scenario, but it seems like you may want to consider deferring the message to be read explicitly at a later time or re-enqueuing the message with a schedule so that the Function can read again at a specific point in the future.
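If deferral fits, a rough sketch with the plain SDK looks like this (where the sequence number is stored, and when it is revisited, is up to you; the names here are placeholders):

```csharp
using System;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

public static class DeferralExample
{
    // Step 1: on failure, defer the message and remember its sequence number
    // (for example in a table) along with the time you want to revisit it.
    public static async Task DeferAsync(ServiceBusReceiver receiver, ServiceBusReceivedMessage message)
    {
        await receiver.DeferMessageAsync(message);
        await SaveForLaterAsync(message.SequenceNumber, DateTimeOffset.UtcNow.AddHours(1));
    }

    // Step 2: later (for example from a timer-triggered function), pull the
    // deferred message back explicitly by its sequence number and process it.
    public static async Task RetryDeferredAsync(ServiceBusReceiver receiver, long sequenceNumber)
    {
        ServiceBusReceivedMessage deferred = await receiver.ReceiveDeferredMessageAsync(sequenceNumber);
        // ... process, then settle:
        await receiver.CompleteMessageAsync(deferred);
    }

    // Placeholder for whatever store you use for sequence numbers.
    private static Task SaveForLaterAsync(long sequenceNumber, DateTimeOffset dueAt) => Task.CompletedTask;
}
```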

Jesse Squire
  • Thanks a lot for the tips, I had no idea we could defer, I'll check more on that. My scenario is simple: I want a message to be reliably processed even in the case of downtime of the services the function talks to, i.e. I want to delay the execution, if it did not succeed, for up to a few days. Are you sure the exponential backoff is not applied? According to the answer from Sean Feldman it does renew the lock, if I got it right, and the docs he posted seem to suggest the same. – Ilya Chernomordik Jun 08 '21 at 07:35
  • Is there any way to read deferred messages via Azure Function by the way? Seems they are not coming in "the ordinary" way – Ilya Chernomordik Jun 08 '21 at 07:37
  • I'm sure within the context of the [Retry Settings](https://learn.microsoft.com/azure/azure-functions/functions-bindings-service-bus#retry-settings) configuration; that translates directly to the set of `ServiceBusOptions` used to create the clients, which is what my assumption was, based on the comment on your post. That said, if we're talking about the ExponentialBackoffRetry attribute (which, on a re-read of your post, seems more likely), then Sean's response is 100% correct and the more applicable one. Apologies for any confusion. – Jesse Squire Jun 08 '21 at 13:20
  • I don't know of a way to read deferred messages directly sourced by a trigger _(though one may exist and I'm just not aware)_. In the past, when I've needed to do so, I've saved my deferred sequence numbers to a data store along with the time that I wanted to defer to. I ran my function on a timer trigger and used that data with the Service Bus client within the body of the function to process. – Jesse Squire Jun 08 '21 at 13:24
  • Thanks for the help, it seems the exponential backoff attribute is not an ideal way to handle prolonged downtime. Then either your approach, or perhaps sending a new message to Service Bus with a schedule, can do the trick. It's a pity though that Service Bus does not support setting a new schedule on an existing message. That seems like a very valuable feature, as right now all 10 retries can be exhausted within a very short period of time. – Ilya Chernomordik Jun 09 '21 at 05:43