I am currently evaluating the use of a Service Bus and an Azure Function to trigger some work that needs to be done via a downstream API call. It is all pretty standard, except that I don't have a good handle on what happens when the downstream system is overloaded and/or returning throttling headers (i.e., max number of calls per minute, etc.). We don't seem to have any dynamic control over forced throttling of the queue trigger.
I know we can manually set the max concurrency, but that doesn't necessarily solve the issue: we have no control over the downstream system and need to consider that it could be offline or slow at any time.
We can also create the messages so they are scheduled to flow in at a certain rate, but again, the downstream system can still saturate or return rate limits.
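To make the scheduled-flow idea concrete, here is a minimal sketch of computing staggered enqueue times (the kind of timestamps you would assign to each message's scheduled enqueue time). The function name and rate parameter are my own illustration, not anything from the SDK, and as noted, this only shapes the send rate; it can't react to the downstream system:

```python
from datetime import datetime, timedelta, timezone

def schedule_times(n_messages, per_second, start=None):
    # Space n_messages out at a target rate of per_second messages/sec
    # by assigning each one a future scheduled-enqueue timestamp.
    # Illustrative only: the downstream system can still saturate.
    start = start or datetime.now(timezone.utc)
    gap = timedelta(seconds=1.0 / per_second)
    return [start + i * gap for i in range(n_messages)]
```

Each returned timestamp would be set on the corresponding message before sending, so the queue releases them at roughly the chosen rate.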
Option 1:
Assuming the consumption plan, from a solution standpoint this was one way I could think of:
- Queue1 - Full speed. If we start getting rate limited, set a cache value. If the cache value is set, don't process the message; clone it and put it in Queue2.
- Queue2 - Lower max concurrency/prefetch count. Same process as above, but push into Queue3.
- Queue3 - Lowest max concurrency/prefetch count. We just slowly process these.
Basically, Queue1 becomes the controller for Queue2 and Queue3 when the downstream system is saturated.
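The gating logic in Option 1 could look something like this sketch. The class and function names (`ThrottleFlag`, `route`, and the callbacks) are hypothetical stand-ins; in practice the flag would live in a shared cache such as Redis rather than in memory:

```python
import time

class ThrottleFlag:
    # In-memory stand-in for the shared cache value in Option 1.
    # When the downstream API returns 429 or throttling headers,
    # set() is called with a cooldown; while it is active, workers
    # on the fast queue forward messages instead of calling the API.
    def __init__(self):
        self._until = 0.0

    def set(self, cooldown_seconds):
        self._until = time.monotonic() + cooldown_seconds

    def active(self):
        return time.monotonic() < self._until

def route(message, flag, call_api, forward_to_slow_queue):
    # Process on the fast queue unless the throttle flag is set;
    # otherwise clone the message into the slower queue.
    if flag.active():
        forward_to_slow_queue(message)
        return "forwarded"
    call_api(message)
    return "processed"
```

Queue2's handler would use the same pattern with its own flag, forwarding into Queue3 when it too gets throttled.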
Option 2:
We can clone the message and requeue it to run at some point in the future, and keep doing this until every message is processed. This keeps a single queue, at the cost of repeated requeueing until everything gets through.
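One way to pick the future enqueue time in Option 2 is exponential backoff with jitter, so a burst of throttled messages doesn't all come back at once. This is a generic sketch (the parameter names and defaults are mine, not from any Azure SDK):

```python
import random

def requeue_delay(attempt, base=5.0, cap=300.0, jitter=True):
    # Delay in seconds before re-scheduling a cloned message.
    # Exponential backoff with a cap, plus optional full jitter
    # so retries from many messages spread out over the window.
    delay = min(cap, base * (2 ** attempt))
    return random.uniform(0, delay) if jitter else delay
```

The attempt count would travel with the message (e.g., as a user property on the clone) so each retry backs off further.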
Option 3:
Assuming we have our own dedicated App Service plan instead of consumption, I guess we can Thread.Sleep in the functions if they are getting close to being rate limited (or the downstream system is down) and keep retrying. The sleep duration could probably be calculated from the max concurrency, instance count, and rate limits. I wouldn't consider this on the consumption plan, as the sleeps could increase costs dramatically.
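As a rough sketch of that calculation (all names are illustrative; a real version would also react to `Retry-After` headers from the downstream API), the per-call pacing delay for Option 3 might be derived like this:

```python
def per_call_sleep(rate_limit_per_minute, max_concurrency, instance_count):
    # Rough per-call sleep in seconds so that
    # instance_count * max_concurrency concurrent workers together
    # stay under the downstream rate limit.
    workers = max(1, max_concurrency * instance_count)
    allowed_per_worker_per_minute = rate_limit_per_minute / workers
    if allowed_per_worker_per_minute <= 0:
        return 60.0  # back off hard if there is no budget at all
    return 60.0 / allowed_per_worker_per_minute
```

For example, with a limit of 600 calls/minute, max concurrency 10, and a single instance, each worker would sleep about one second between calls. The weak point is knowing `instance_count` accurately, since the platform controls scale-out.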
I wonder if I'm missing something simple, either in the best way to handle downstream saturation or in how to throttle the queue trigger (this could be Service Bus or a storage queue).
Edit: I would like to add that I pumped 1 million messages into a Service Bus queue, scheduled for the send time. I watched the Azure Function (consumption plan) scale to hit about 1,500 messages/second, which gives a good baseline metric. I'm not sure how the dedicated plans will perform yet.
It looks like the host file can be modified on the fly and the settings take effect immediately. Although this applies to all the functions in the app, it may work in my particular case (update the settings and re-check every minute or so, depending on the rate limit).
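For reference, these are the knobs I mean in host.json (property names as in the v1 schema; later runtime versions nest them differently, so treat this as an assumption to verify against the docs):

```json
{
  "serviceBus": {
    "maxConcurrentCalls": 4,
    "prefetchCount": 0
  }
}
```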
This looks like it might be a step in the right direction once it gets implemented by the Functions team, but even so, we still need a way to manage downstream errors.