
I'm working on a demo for Azure Functions using queue triggers. I created a recursive Sudoku solver to show how to take depth-first search and convert it to queued recursion. The code is on GitHub.
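To picture the conversion: each recursive call becomes a queue message describing a child state, so one dequeued message corresponds to one function invocation. Here's a rough, illustrative sketch of the pattern (not the repo's code), with an in-memory queue standing in for the storage queue; note that a FIFO queue turns the depth-first search into a roughly breadth-first one.

from collections import deque

def solve_queued(initial_state, is_solution, expand):
    # The deque stands in for the Azure storage queue.
    queue = deque([initial_state])
    while queue:
        state = queue.popleft()          # one message = one invocation
        if is_solution(state):
            return state
        queue.extend(expand(state))      # "recursive calls" become messages
    return None

# Toy usage: find a 3-character string over "ab" that contains "ab".
print(solve_queued(
    "",
    lambda s: len(s) == 3 and "ab" in s,
    lambda s: [s + c for c in "ab"] if len(s) < 3 else [],
))  # -> "aab"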

I was expecting it to scale out and process an insane number of messages per second, but it is barely processing 30/s. The queue is filling up and the utilization seems minimal.

[Screenshot: minimal utilization]

How can I get better performance from this? I tried increasing the batch size in the host.json, but it didn't seem to help. I have over 200k messages in the queue and it's growing.

Update 1

I tried setting the host.json file to

{
  "queues": {
    "visibilityTimeout": "00:00:10",
    "batchSize": 32,
    "maxDequeueCount": 5,
    "newBatchThreshold": 100
  }
}

but requests per second remained the same.

I deployed the same function to another instance, this time tied to an S4 service plan. That instance is able to process about 64 requests per second, which still seems slow.

I can process the messages serially on my local machine far faster than this.

Update 2

I scaled the S4 plan out to 10 instances, and each instance is handling about 60-70 requests per second, so roughly 600-700 requests per second in total. But that's insanely expensive for throughput that still can't match what I can do with a single core locally. The queue used by the service-plan functions has 500k messages piled up.

  • Is it running on consumption plan? Have you tried setting `newBatchThreshold` to a higher value (e.g. 100)? – Mikhail Shilkov Jul 16 '17 at 18:40
  • Yes, consumption. From the comments in the docs, it indicated a value of batchSize / 2, so that's what I used. It says the batchSize max is 32, so I used batchSize = 32 and newBatchThreshold = 16. Can you elaborate on what these values do? – MPavlak Jul 16 '17 at 18:48
  • **newBatchThreshold** This is the threshold number at which we'll fetch another batch of messages. It defaults to half the batch size, meaning if we fetch a batch of 16 messages, we'll process all those in parallel and we won't fetch another batch until the outstanding invocation count drops below 8. So if you increase this to say 100 along with setting batchSize to 32, you'll be allowing 100 + 32 messages to be processed in parallel. From [here](https://github.com/Azure/azure-webjobs-sdk-script/issues/311). – Mikhail Shilkov Jul 16 '17 at 19:09
  • @Mikhail updated question to move out of comments – MPavlak Jul 16 '17 at 19:27
  • `I can process the messages serially on my local machine far faster than this.` Do you process **real** messages stored in an Azure storage queue, or test messages locally? Besides, you can try to optimize your code and minimize the message content size to reduce the time it takes to process each message. – Fei Han Jul 19 '17 at 09:06
  • Not using Azure queues locally, but instead just processing the messages serially, one at a time. The message size is already super small at ~200B. I was planning to try increasing the work description in the messages so that each message describes multiple work items. However, I feel this goes against the point of breaking the work into small pieces and scaling out. I was also considering containers instead of AzFunctions, since I do not understand why it is not scaling out further. – MPavlak Jul 19 '17 at 17:11
  • Have you tried a servicebus queue? – snowCrabs Aug 23 '17 at 15:47
  • No, but I do not believe this to be the queue's fault. The function is not scaling out even though the queue is filling way faster than it is being drained. – MPavlak Aug 23 '17 at 17:54
  • @MPavlak - Where are you collating the results of the solver algorithm? I wonder if you are experiencing contention at the point where results are saved. – camelCase Sep 03 '17 at 11:45
  • @MPavlak - How long does your problem solving test run for? I ask because I assume the scale-out logic of the consumption plan for Azure Functions needs time to identify the need for extra machine resource and then +10 seconds to establish an extra Azure Function process hosting your logic. – camelCase Sep 03 '17 at 11:48
  • Each job runs for ~1 second, but the workload ran for over 30 minutes on the consumption plan. I would hope that would be long enough to figure out the queue was filling up and scale. :) – MPavlak Sep 03 '17 at 16:40
  • Quoting: "in some cases, reducing concurrency has been shown to dramatically increase throughput". https://stackoverflow.com/questions/37709255/what-is-the-scaling-algorithm-for-azure-functions-never-been-able-to-get-over-7. – Shahid Syed Mar 16 '18 at 17:01

1 Answer


Azure Functions do not listen for an item to be added to a queue; they actually poll the queue using a polling algorithm, which you can override with the `maxPollingInterval` property. Adding `"maxPollingInterval": "00:00:01"` to the options you have already mentioned above should solve your problem.

maxPollingInterval Azure documentation
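For example, merged into the host.json from the question it would look like the sketch below (note: on the v1 runtime, maxPollingInterval is documented as an integer number of milliseconds, e.g. 1000, rather than the timespan format shown here):

{
  "queues": {
    "maxPollingInterval": "00:00:01",
    "visibilityTimeout": "00:00:10",
    "batchSize": 32,
    "maxDequeueCount": 5,
    "newBatchThreshold": 100
  }
}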
