Strategy for managing database connections from Azure Webjobs

Question

I'm using Azure webjobs with queue-triggered functions (which rely on the Azure webjobs sdk) to perform some background processing work. Within the webjobs I make various connections to a SQL Azure database (using PetaPoco, which uses System.Data.SqlClient).

I want to be purposeful in my database connection strategy - specifically because there are some concurrency issues inherent to the environment.

One concurrency scenario is with the SDK's BatchSize property that you can set for queue-triggered webjobs. It's my understanding that setting BatchSize > 1 results in multiple instances of the queue-triggered function running within the same webjob process.

The second concurrency scenario is the website scale-out scenario where you're running multiple instances of the webjob itself. These of course are in different processes.

In my website I have a database connection per request (the machine handles connection pooling by default). No problems there.

How should I treat connections in the webjob scenario, accounting for the concurrency scenarios described above? Webjobs are of course just long-lived console processes (these are continuous webjobs). Should I create a database connection when my webjob starts and simply re-use that connection through the webjob's lifetime? Should I instantiate and close connections per function when I need them?

These are the types of things I'm trying to understand.

Tom Sun - MSFT · Answer 1 · 2017-02-15T09:47:07.777

1

Webjobs are of course just long-lived console processes (these are continuous webjobs).

The main process is long-lived, but each triggered sub-process is released after the triggered function has executed. This means that a connection opened in the sub-process is also released automatically. As a best practice, though, it is better to close it manually before the function exits.

The second concurrency scenario is the website scale-out scenario where you're running multiple instances of the webjob itself. These of course are in different processes.

The WebJobs SDK queue trigger automatically prevents a queue message from being processed by multiple instances.

If your web app runs on multiple instances, a continuous WebJob runs on each machine, and each machine will wait for triggers and attempt to run functions. The WebJobs SDK queue trigger automatically prevents a function from processing a queue message multiple times; functions do not have to be written to be idempotent. However, if you want to ensure that only one instance of a function runs even when there are multiple instances of the host web app, you can use the Singleton attribute.

It's my understanding that setting BatchSize > 1 results in multiple instances of the queue-triggered function running within the same webjob process

BatchSize is the number of queue messages that can be picked up simultaneously and executed in parallel within a WebJob.

For more information, see the documentation on how to use Azure queue storage with the WebJobs SDK, which covers parallel execution and multiple instances.
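As a minimal sketch of opening and closing a connection inside the triggered function, assuming a queue named `work-items`, a table named `ProcessedMessages`, and a connection string named `AppDb` (all hypothetical), the function could create its own `SqlConnection` in a `using` block. ADO.NET pools connections per process and per connection string, so disposing the connection returns it to the pool instead of tearing it down. Each parallel invocation under `BatchSize` > 1 also gets its own connection object rather than sharing one, which matters because `SqlConnection` instances are not safe to use from several threads at once.

```csharp
using System;
using System.Data.SqlClient;
using System.IO;
using Microsoft.Azure.WebJobs;

public class Functions
{
    // Assumption: a SQL Azure connection string named "AppDb" is configured
    // on the web app, which exposes it as this environment variable.
    private static readonly string ConnectionString =
        Environment.GetEnvironmentVariable("SQLAZURECONNSTR_AppDb");

    // Runs once per queue message; up to BatchSize invocations run in parallel.
    public static void ProcessQueueMessage(
        [QueueTrigger("work-items")] string message, TextWriter log)
    {
        // A new connection object per invocation; the pool makes this cheap.
        using (var connection = new SqlConnection(ConnectionString))
        using (var command = new SqlCommand(
            "INSERT INTO ProcessedMessages (Body) VALUES (@body)", connection))
        {
            command.Parameters.AddWithValue("@body", message);
            connection.Open();
            command.ExecuteNonQuery();
        } // Dispose closes the connection and returns it to the pool.

        log.WriteLine("Processed: {0}", message);
    }
}

public class Program
{
    public static void Main()
    {
        var config = new JobHostConfiguration();
        config.Queues.BatchSize = 8; // illustrative value, not a recommendation

        var host = new JobHost(config);
        host.RunAndBlock(); // continuous WebJob: blocks and waits for triggers
    }
}
```

The same shape should carry over to PetaPoco by creating and disposing its `Database` object inside the function instead of holding one for the lifetime of the process.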

edited Feb 15 '17 at 09:47

answered Feb 15 '17 at 09:17

Tom Sun - MSFT


Can you please explain this more? "for trigged sub- process will be released after the triggered function is executed." What subprocess? About this: "WebJob SDK queue trigger will automatically prevents a queue triggered by multiple instances." I think you mean that the SDK won't allow multiple instances to pull the same queue item. I agree, but that is not relevant to this question. I'm focused on the database connection. Regarding: "BatchSize it means that how many queue messages that can be picked up simutaneouly to be executed in Parallel" OK, then we are saying the same thing. – Howiecamp Feb 15 '17 at 15:57
You mentioned that `I'm using Azure webjobs with queue-triggered functions (which rely on the Azure webjobs sdk) to perform some background processing work.` The WebJobs SDK uses the JobHost object to monitor the functions, watch for events that trigger them, and execute the functions. If we initialize a variable inside the function, it is released after the function has executed. – Tom Sun - MSFT Feb 15 '17 at 16:42
Thanks for your comments. Can you dumb it down for me? I'm still not sure I see a recommendation about how to handle database connections. I understand your point that the variables on the stack will be released when the function exits - that's just normal language behavior. But are you making a recommendation on how to handle database connections as a result of this? – Howiecamp Feb 15 '17 at 20:55
So in my opinion, we could open and close the connections within the triggered function. – Tom Sun - MSFT Feb 16 '17 at 00:57
