Imagine I have a storage account with a blob container to which files are uploaded from time to time. I want to process each file that arrives in blob storage: open it, extract the data, and store the information. Definitely an expensive operation that could fit a Durable Functions scenario.
Here's the trigger:
[FunctionName("PayrollFileTrigger")]
public static async Task Start(
    [BlobTrigger("files/{name}", Connection = "AzureWebJobsStorage")] Stream myBlob, string name,
    [DurableClient] IDurableOrchestrationClient starter,
    ILogger log)
{
    // Starts the orchestration with a fixed instance id ("payroll_file"), passing the blob name as input
    string instanceId = await starter.StartNewAsync("PayrollFile_StartFunction", "payroll_file", name);
}
...which calls the orchestration:
[FunctionName("PayrollFile_StartFunction")]
public static async Task<IActionResult> Run(
    [OrchestrationTrigger] IDurableOrchestrationContext context, ILogger log)
{
    // The input passed by the client (the blob name) is read from the context
    string blobName = context.GetInput<string>();
    var options = new RetryOptions(TimeSpan.FromSeconds(5), maxNumberOfAttempts: 3);

    // Downloads the blob
    string filePath =
        await context.CallActivityWithRetryAsync<string>("DownloadPayrollBlob", options, blobName);
    if (filePath == null) return ErrorResult(ERROR_MSG_1, log);

    // Extracts the data
    var payroll =
        await context.CallActivityWithRetryAsync<Payroll>("ExtractBlobData", options, filePath);

    // ... and so on (just a sample here) ...
}
But there is a problem. While testing, this error occurs, which I think means I can't start another orchestration with the same instance id:

    An Orchestration instance with the status Pending already exists.
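For what it's worth, the instance's state can be inspected from the client side with `GetStatusAsync` (a sketch reusing the `starter` client from the trigger above; the log message is just illustrative):

```csharp
// Query the current state of the single fixed-id instance
DurableOrchestrationStatus status = await starter.GetStatusAsync("payroll_file");
log.LogInformation($"Instance 'payroll_file' is {status?.RuntimeStatus}"); // e.g. Pending, Running, Completed
```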
1 - So if I push many files, in a short period of time, to the container the trigger is "listening" to, will the orchestration get busy with one of them and ignore the other events?
2 - When does the orchestration get rid of the Pending status? Does that happen automatically?
3 - Should I create a new orchestration instance for each file to be processed? I know I can omit the instanceId parameter, so it gets generated randomly and never conflicts with an already started one. But is that safe to do? How do I manage the instances and ensure they finish at some point?
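For reference, this is what I mean in question 3. Passing null (or omitting) the instanceId makes the runtime generate a random GUID per start, so each file would get its own instance (a sketch; note that calling `StartNewAsync("...", name)` with a string second argument resolves to the instanceId overload, so the input has to go in the third parameter):

```csharp
[FunctionName("PayrollFileTrigger")]
public static async Task Start(
    [BlobTrigger("files/{name}", Connection = "AzureWebJobsStorage")] Stream myBlob, string name,
    [DurableClient] IDurableOrchestrationClient starter,
    ILogger log)
{
    // Null instanceId: the runtime generates a unique one per call,
    // so every uploaded file starts its own orchestration instance.
    string instanceId = await starter.StartNewAsync("PayrollFile_StartFunction", instanceId: null, input: name);
    log.LogInformation($"Started orchestration {instanceId} for blob {name}.");
}
```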