
I have a durable functions app, running on a premium elastic service plan in Azure, wherein I

  • (a) perform a one-off task that returns a potentially large number of results
  • (b) run some independent processing on each result from part (a)

Part (a) relies on an external database, which starts rejecting requests when I hit a certain number of concurrent requests.

Part (b) doesn't have such a third party dependency, and should theoretically be able to scale indefinitely.

I'm aware of the ability to place limits on:

  • The maximum number of instances my service plan will scale out to
  • The number of concurrent requests per instance
  • The number of concurrent activity functions

However, using any of these options to limit (a) would also limit (b), which I'd like to leave as concurrent as possible.

Is there a way I can limit the number of concurrent invocations of activity function (a), without placing restrictions on the number of invocations of (b)?

(If all else fails I can track the number of current executions myself in storage as part of running activity (a), but I'd much prefer to either configure this, or drive it from the durable functions framework if possible, since the framework is already tracking the number of queued activity functions of each type.)
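
One commonly suggested workaround (not a built-in setting; everything here is an illustrative sketch, with hypothetical function names and limits) is to have the orchestrator fan out activity (a) in fixed-size batches, awaiting each batch before starting the next, while (b) fans out with no cap. A minimal local model of that pattern in plain Python asyncio:

```python
import asyncio

CONCURRENCY_LIMIT_A = 3  # hypothetical cap matching the external DB's tolerance


async def activity_a(item):
    # Stand-in for the rate-limited activity (a); produces results for (b).
    await asyncio.sleep(0.01)
    return item * 2


async def activity_b(result):
    # Stand-in for the freely scalable activity (b).
    await asyncio.sleep(0.01)
    return result + 1


async def orchestrator(items):
    # Fan out (a) in fixed-size batches, so at most
    # CONCURRENCY_LIMIT_A calls to (a) are in flight at once.
    results_a = []
    for start in range(0, len(items), CONCURRENCY_LIMIT_A):
        batch = items[start:start + CONCURRENCY_LIMIT_A]
        results_a += await asyncio.gather(*(activity_a(i) for i in batch))
    # Fan out (b) with no cap at all.
    return await asyncio.gather(*(activity_b(r) for r in results_a))


print(asyncio.run(orchestrator(list(range(10)))))
```

The trade-off is that each batch waits for its slowest member, so the limit on (a) is a ceiling rather than a steady-state concurrency level.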

Scoobyben
  • There's a utility method to do this at https://github.com/Azure/azure-functions-durable-extension/issues/596#issuecomment-459906400 . – Arithmomaniac Nov 25 '20 at 09:17

1 Answer


Is there a way I can limit the number of concurrent invocations of activity function (a), without placing restrictions on the number of invocations of (b)?

Yes. Azure offers plenty of tools for building publish/subscribe segregation between (a) and (b). The mistake may be assuming that the results from (a) need to be processed in-process, synchronously with the consumer that sinks and processes them.

i.e. If there is a good chance that (b) cannot keep up with the messages retrieved from (a), then I would consider separating the task of obtaining data from (a) from the task of processing the data in (b) via a queue or log technology.

Concentrating on (b):

  • If (b) requires command or transaction semantics (i.e. exactly once, guaranteed), then Azure Service Bus can be used to queue commands until they can be processed, and consumers of messages can be scaled independently of the production of messages in (a), using subscriptions. Think RabbitMQ.
  • If (b) can handle less reliable guarantees, e.g. at-least-once semantics, then Azure Event Hubs will allow you to partition messages across multiple concurrent consumers. Think Kafka.
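
The segregation described above can be modelled locally with a plain in-memory queue standing in for Service Bus or Event Hubs (a sketch; the multipliers and counts are arbitrary). The producer, playing the role of (a), pushes results at its own pace, and any number of consumer workers, playing the role of (b), drain the queue independently:

```python
import queue
import threading


def producer(q, n_items):
    # Plays the role of (a): publishes results at its own pace.
    for i in range(n_items):
        q.put(i)
    q.put(None)  # sentinel: no more work


def consumer(q, out, lock):
    # Plays the role of (b): drains the queue; run as many copies as needed.
    while True:
        item = q.get()
        if item is None:
            q.put(None)  # re-queue the sentinel for sibling consumers
            break
        with lock:
            out.append(item * 10)  # stand-in processing


def run(n_items=20, n_consumers=4):
    q = queue.Queue()
    out, lock = [], threading.Lock()
    workers = [threading.Thread(target=consumer, args=(q, out, lock))
               for _ in range(n_consumers)]
    for w in workers:
        w.start()
    producer(q, n_items)
    for w in workers:
        w.join()
    return sorted(out)


print(run())
```

Note that the number of consumers is chosen entirely independently of the producer, which is the property the queue buys you.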

Other alternatives exist too, e.g. Azure Storage queues (low cost) and Azure Event Grid (a wide range of subscriber protocols).

So, TL;DR: decouple the accumulation of data from its processing via a buffer if you expect a throughput disparity between your ability to acquire and to process the data.

Interestingly enough, if the process delivering to (a) is itself a queue, then you need only concern yourself with the performance of (b). The golden rule of queues is to leave data on the queue if you do not have the capacity to process it (otherwise you will just end up buffering it again yourself).
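
That golden rule, only dequeue what you currently have capacity to process, can be sketched with a small in-flight cap. This is an illustrative asyncio model, not a real queue client; the limit and names are hypothetical:

```python
import asyncio

MAX_IN_FLIGHT = 2  # hypothetical processing capacity


async def bounded_consumer(q, results):
    slots = asyncio.Semaphore(MAX_IN_FLIGHT)

    async def process(item):
        try:
            await asyncio.sleep(0.01)  # stand-in processing
            results.append(item)
        finally:
            slots.release()

    tasks = []
    while True:
        await slots.acquire()          # only dequeue when a slot is free
        item = await q.get()
        if item is None:               # sentinel: queue drained
            slots.release()
            break
        tasks.append(asyncio.create_task(process(item)))
    await asyncio.gather(*tasks)


async def main():
    q = asyncio.Queue()
    for i in range(10):
        q.put_nowait(i)
    q.put_nowait(None)
    results = []
    await bounded_consumer(q, results)
    return sorted(results)


print(asyncio.run(main()))
```

Anything not yet dequeued simply stays on the queue, where the broker (not your code) is responsible for buffering it.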

StuartLC
  • Thanks for the quick response! Interestingly, the application was originally service-bus triggered, and we migrated to durable functions due to its promise to handle a lot of the overhead work for us - queuing, handling of storing large results in temporary storage, and most importantly the ease of implementing the fan-out/fan-in pattern - which I'm particularly keen to keep around for (b), as the results all need to be stored together. – Scoobyben Aug 13 '20 at 08:14
  • I can certainly migrate part (a) back to a service bus triggered function and decouple it from (b) - I was just checking first whether there was something within the durable functions framework I'd missed that already handles this case. Migrating would mean more infrastructure to manage, another functions app as it needs independent scaling rules, and a slightly less helpful testing process, as the custom status/output of the durable functions framework would not be as easily accessible. – Scoobyben Aug 13 '20 at 08:18
  • For reference, here is the documentation for the fan-out/fan-in pattern in Azure, which alludes to the fact that you can manage such a workflow yourself with other Azure resources, but claims that durable functions provides a low overhead way of achieving the same https://learn.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-cloud-backup?tabs=csharp – Scoobyben Aug 13 '20 at 08:21
  • I'm marking this answer as accepted, but just wanted to clarify how I interpret it for posterity - while the answer starts with "yes you can do this in Azure" - I think the answer to my original question is "no, while durable functions provide a convenient wrapper around a lot of functions orchestration, they can't control concurrency on independent functions". The solution to my underlying problem, as the answer says, is to stop relying on the durable functions framework, separate the two components explicitly, and manually implement the plumbing in Azure. – Scoobyben Aug 25 '20 at 08:45
  • +1 If you have the time, why not provide a detailed answer to your own question, and then can mark your answer as correct? – StuartLC Aug 25 '20 at 09:07