
I am currently evaluating the use of a Service Bus and an Azure Function to trigger some work that needs to be done via a downstream API call. It is all pretty standard except that I don't have a good handle on what happens when the downstream system is overloaded and/or is returning throttling headers (i.e., a max number of calls per minute, etc.). We don't seem to have any dynamic control over forcibly throttling the queue trigger.

I know that we can manually set the max concurrency, but that doesn't necessarily solve the issue: we have no control over the downstream system and need to assume it could be offline or slow at any time.

Also, we can create the messages so they are scheduled to flow in at a certain rate, but the downstream system can still saturate or return rate limits.

Option 1:

Assuming the consumption plan, from a solution standpoint this is one approach I could think of:

  • Queue1 - Full speed or max speed. If we start getting rate limited, set a cache value. If the cache value is set, don't process the message; clone it and put it in queue2.
  • Queue2 - Lower max concurrency/prefetch count. Same process as above, but push into queue3.
  • Queue3 - Lowest max concurrency/prefetch count. We just slowly process the messages.

Basically, queue1 becomes the controller for queue2 and queue3 when the downstream system is saturated. A rough sketch of the queue1 handler is below.
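This is roughly what I have in mind (the endpoint URL, queue names, and 429 handling are my assumptions; the static flag here is per-instance, so in practice the cache value would live in something shared like Redis so every scaled-out instance sees it):

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;

public static class Queue1Handler
{
    private static readonly HttpClient Http = new HttpClient();

    // Per-instance stand-in for the cache value described above; a real
    // implementation would use a distributed cache (e.g. Redis).
    private static DateTimeOffset _throttledUntil = DateTimeOffset.MinValue;

    [FunctionName("Queue1Handler")]
    public static async Task Run(
        [ServiceBusTrigger("queue1")] string message,
        [ServiceBus("queue2")] IAsyncCollector<string> queue2)
    {
        // Cache value set? Don't process; clone the message into queue2.
        if (DateTimeOffset.UtcNow < _throttledUntil)
        {
            await queue2.AddAsync(message);
            return;
        }

        var response = await Http.PostAsync(
            "https://downstream.example/api",        // placeholder endpoint
            new StringContent(message));

        if ((int)response.StatusCode == 429)         // rate limited
        {
            _throttledUntil = DateTimeOffset.UtcNow.AddMinutes(1);
            await queue2.AddAsync(message);
        }
    }
}
```

The queue2 handler would look the same, just running with a lower maxConcurrentCalls/prefetchCount and forwarding to queue3.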

Option 2:

We can clone the message and requeue it scheduled for the future, and keep doing this until they are all processed. This keeps one queue and just requeues until we get everything processed.
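Something along these lines, using ScheduledEnqueueTimeUtc to push the clone into the future (the endpoint and fallback delay are placeholders; the Retry-After header is only used when the downstream sends one):

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;
using Microsoft.Azure.WebJobs;

public static class RequeueHandler
{
    private static readonly HttpClient Http = new HttpClient();

    [FunctionName("RequeueHandler")]
    public static async Task Run(
        [ServiceBusTrigger("work")] string payload,
        [ServiceBus("work")] IAsyncCollector<Message> requeue) // same queue
    {
        var response = await Http.PostAsync(
            "https://downstream.example/api",        // placeholder endpoint
            new StringContent(payload));

        if ((int)response.StatusCode == 429)         // rate limited
        {
            // Clone the message and schedule it for later; honor the
            // Retry-After header when the downstream provides one.
            var delay = response.Headers.RetryAfter?.Delta
                        ?? TimeSpan.FromSeconds(30);
            var clone = new Message(Encoding.UTF8.GetBytes(payload))
            {
                ScheduledEnqueueTimeUtc = DateTime.UtcNow.Add(delay)
            };
            await requeue.AddAsync(clone);
        }
    }
}
```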

Option 3:

Assuming we have our own dedicated App Service plan instead of consumption, I guess we can Thread.Sleep the functions if they are getting close to being rate limited (or the downstream is down) and keep retrying. The sleep duration could probably be calculated from max concurrency, instance count, and the rate limits. I wouldn't consider this on the consumption plan, as the sleeps could increase costs dramatically.
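A minimal sketch of that retry-in-place idea (again, the endpoint and back-off values are placeholders; note the Service Bus lock duration would have to be long enough to cover the waits, or the lock would need renewing):

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;

public static class DedicatedPlanHandler
{
    private static readonly HttpClient Http = new HttpClient();

    [FunctionName("DedicatedPlanHandler")]
    public static async Task Run([ServiceBusTrigger("work")] string payload)
    {
        while (true)
        {
            var response = await Http.PostAsync(
                "https://downstream.example/api",    // placeholder endpoint
                new StringContent(payload));

            if ((int)response.StatusCode != 429)     // not rate limited: done
                return;

            // Wait out the limit before retrying; only reasonable on a
            // dedicated plan where idle time isn't billed per execution.
            var wait = response.Headers.RetryAfter?.Delta
                       ?? TimeSpan.FromSeconds(10);
            await Task.Delay(wait);
        }
    }
}
```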

I wonder if I'm missing something simple in the best way to handle downstream saturation or to throttle the queue trigger (could be Service Bus or a storage queue).

Edit: I would like to add that I pumped 1 million messages into a Service Bus queue, scheduled at the send time. I watched the Azure Function (consumption plan) scale to hit about 1,500 messages/second, which gives a good baseline metric. I'm not sure how the dedicated plans will perform yet.

It looks like the host.json file can be modified on the fly and the settings take effect immediately. Although this applies to all the functions in the app, it may work in my particular case (update the settings and check again every minute or so, depending on the rate limit).
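For reference, these are the settings I mean, under the serviceBus section of host.json (the values here are arbitrary):

```json
{
  "serviceBus": {
    "maxConcurrentCalls": 4,
    "prefetchCount": 8
  }
}
```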

This looks like it might be a step in the right direction when it gets implemented by the Functions team, but even so, we still need a way to manage downstream errors.

lucuma
  • Would you like to avoid the function overload and guarantee Service Bus queue message delivery? – Fei Han May 30 '17 at 09:49
  • Yes on guaranteed delivery; the messages need to be processed. On function overload, I'm open to the best options. – lucuma May 30 '17 at 12:51
  • How many requests per minute do you expect your downstream to be able to handle? One per minute: timer trigger. Hundreds per second: maybe async parallelism and some sort of batching? Do you need to max out the downstream system, or can you afford some low rate, like one a single function can handle? – Volker Mar 16 '18 at 18:27
  • Let's say we have incoming webhooks from Jira, Bitbucket, AppVeyor, and say Asana, receiving a lot of data that is posted to Slack. These webhooks could saturate the internal queue, and Azure Functions could easily scale up, but Slack has rate limits. – lucuma Mar 16 '18 at 22:54

2 Answers


Unfortunately, backpressure features like the ones you need are not currently available with the Service Bus trigger (or, more generally, Azure Functions), so you would need to handle that yourself.

The approach you've described would work; you would just need to handle it in different Function Apps, as the Service Bus settings apply to the entire app and not just to a single function.

Might not be a huge help in your scenario, but for the apps handling queue2 or queue3, you can also try using a preview flag that limits the number of scaled-out instances of your app: WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT. Note that this is in preview and there are no guarantees yet.
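It is just an app setting on the Function App; for example, capping the app at two instances (the value here is arbitrary) would look like:

```
WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT = 2
```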

I would encourage you to open an issue on the functions repo documenting your scenario. These kinds of issues are definitely something we want to address and the more feedback we have, the better.

Fabio Cavalcante
  • Thanks, we will probably have our own dedicated App Service plan after all and will manage it on various queues for now. I will open an issue. I did notice that I can update the host file, but I understand it applies to all queues. What I didn't see in the docs was the "max" throughput for Service Bus processing (assuming a small message size) and exactly which parameters drive how scale-out is calculated based on queue size. – lucuma May 30 '17 at 18:29

Just a thought: can you ask your downstream guys to offer a queue endpoint in addition to the normal API endpoint? Then you can flood their queue and they can process it at their leisure.

Volker
  • That works if the downstream guys work for you, but not if it is a third-party service that throttles requests. – lucuma Mar 12 '18 at 21:29