
I have a function that is triggered by Cosmos DB inserts/updates, and it copies each document to a storage blob. When debugging, the function fires over and over again for the same handful of documents.

I've tried limiting the number of documents processed, but that just makes it process the same N documents over and over. I've also tried raising the RUs on the trigger collection (and the lease collection), but that had no effect.

[FunctionName("Function1")]
        public async static Task Run([CosmosDBTrigger(
            databaseName: "Events",
            collectionName: "DomainEvents",
            ConnectionStringSetting = "cosmosConnectionString",
            CreateLeaseCollectionIfNotExists = true,
            LeaseCollectionName = "DomainEventLeases")]IReadOnlyList<Document> input, ILogger log, ExecutionContext context)
        {
            if (input != null && input.Count > 0)
            {
                var config = new ConfigurationBuilder()
                 .SetBasePath(context.FunctionAppDirectory)
                 .AddJsonFile("local.settings.json", optional: true, reloadOnChange: true)
                 .AddEnvironmentVariables()
                 .Build();

                CloudStorageAccount cloudStorageAccount;

                if (CloudStorageAccount.TryParse(config["StorageConnectionAppSetting"], out cloudStorageAccount))
                {
                    var client = cloudStorageAccount.CreateCloudBlobClient();
                    var container = client.GetContainerReference("wormauditlog");

                    foreach(var thisDocument in input)
                    {
                        var blob = container.GetBlockBlobReference(thisDocument.Id);

                        try
                        {
                            await blob.UploadFromByteArrayAsync(thisDocument.ToByteArray(), 0, thisDocument.ToByteArray().Length);
                        }
                        catch(Exception e)
                        {
                            throw;
                        }
                    }
                }
                else
                {
                    throw new FunctionInvocationException("Bad storage connection string.");
                }

            }
        }
Ryan T4S

1 Answer


The Trigger does not retry document batches; you are probably receiving updates for the same documents.

If you check `thisDocument.GetPropertyValue<int>("_ts")`, which is the timestamp of the operation, you will see that these are different values.

The Change Feed contains insert and update operations, so if your architecture updates the same document multiple times, then it is expected that there will be multiple entries in the Change Feed for the same document id.
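To confirm, you could log the `_ts` of each change and compare runs. A minimal sketch, reusing the `input` and `log` parameters from your Function (the log message template is illustrative):

foreach (var thisDocument in input)
{
    // _ts is the server-side epoch timestamp (in seconds) of the last write to the document.
    var ts = thisDocument.GetPropertyValue<int>("_ts");
    log.LogInformation("Document {Id} last written at _ts {Ts}", thisDocument.Id, ts);
}

If the same document id shows up with different `_ts` values, those are genuinely different operations on that document.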

Additionally, and unrelated but worth fixing: you are creating an instance of `CloudStorageAccount` on every execution. A good pattern is to maintain a single instance and share it, either through Dependency Injection or with lazy initialization (see https://learn.microsoft.com/azure/azure-functions/manage-connections).
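A minimal sketch of the lazy-initialization variant, assuming the connection string is exposed as the `StorageConnectionAppSetting` environment variable (the field names are illustrative):

// Created once per host instance and reused across executions instead of per invocation.
private static readonly Lazy<CloudBlobClient> lazyBlobClient = new Lazy<CloudBlobClient>(() =>
{
    var account = CloudStorageAccount.Parse(
        Environment.GetEnvironmentVariable("StorageConnectionAppSetting"));
    return account.CreateCloudBlobClient();
});

// Inside the Function:
// var container = lazyBlobClient.Value.GetContainerReference("wormauditlog");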

Matias Quaranta
  • Re: Cosmos Trigger - I am getting triggered for the same documents over and over, yes, but it's not saving different data. It's continuously trying to update these same 8 documents with the same data that's in cosmos and isn't changing. I'm using a dedicated test collection and no inserts or updates are happening there. Also, this is an event sourcing log. Documents are written there once and never updated or deleted. I also checked out the timestamps on the documents by setting a break point, allowing the function to run, checking the data, let it run again, check again. Same _ts data. – Ryan T4S Jun 13 '19 at 23:37
  • Re: cloud storage account - Thanks for the input. This is spike code. We use Core DI in all of our production-level functions. – Ryan T4S Jun 13 '19 at 23:37
  • The Trigger code does not retry, nor resend the same batches. If you go and inspect the leases collection, you should find documents there with the `ContinuationToken` property, what is the state? Do they change the value after Function executions? – Matias Quaranta Jun 14 '19 at 00:04
  • I checked the lease collection and while the function is running, those documents never change. They always have this: "PartitionId": "0", "Owner": "33f1d3f3-114c-474c-8831-7cf345d3f62e", "ContinuationToken": "\"11\"", "properties": {}, – Ryan T4S Jun 14 '19 at 14:38
  • As an experiment, I went and started with a complete blank collection. The function starts up and creates the lease collection. It then waits for an update message from cosmos. Then I put in a single simplified document. The function, again, starts to fire on that same update over and over and over again. The Continuation token for that collection is "\"2\"". So, it really does seem to be firing it over and over again. – Ryan T4S Jun 14 '19 at 16:55
  • Can you change it to `async Task` instead of `async void`? It might be related to the fact that your async Function does not return an awaitable Task to the Functions runtime, there really is no code on the Cosmos DB Trigger that would send a batch over and over. The sample you linked is not an `async` Function, and thus, `void` is fine. – Matias Quaranta Jun 14 '19 at 18:02
  • Thanks for the suggestion. Another commenter had suggested the same thing, but it doesn't help. My next plan is to get this into Azure itself, as maybe this is a problem with func.exe or something. – Ryan T4S Jul 01 '19 at 22:31