I was investigating the behavior of a processor built with ChangeFeedProcessorBuilder when it throws an exception or goes down while processing a particular change. Upon recovery, the same change is not picked up again. Is there any way to checkpoint only after the notification has been processed successfully?

The delegate is as follows:

    var builder = container.GetChangeFeedProcessorBuilder("migrationProcessor",
        (IReadOnlyCollection<object> input, CancellationToken cancellationToken) =>
        {
            Console.WriteLine(input.Count + " Changes Received by " + a);
            // `a` is a static counter, so only the first invocation throws
            if (a++ == 0)
            {
                throw new Exception();
            }
            return Task.CompletedTask;
        });

Thank you!

SummerCode

1 Answer

The default behavior of the Change Feed Processor is to checkpoint after a successful delegate execution: https://learn.microsoft.com/azure/cosmos-db/change-feed-processor#processing-life-cycle

The normal life cycle of a host instance is:

  1. Read the change feed.
  2. If there are no changes, sleep for a predefined amount of time (customizable with WithPollInterval in the Builder) and go to #1.
  3. If there are changes, send them to the delegate.
  4. When the delegate finishes processing the changes successfully, update the lease store with the latest processed point in time and go to #1.

If your delegate handler throws an unhandled exception, there is no checkpoint.
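
The life cycle above can be sketched as a small in-memory model. This is not the SDK's implementation: `LeaseModel`, `LifeCycleModel`, and `PollOnce` are hypothetical names, the feed is a plain list, and the model assumes reading starts at position 0 when the lease has no continuation yet. It only illustrates that the checkpoint is written after the handler returns without throwing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a lease document: it stores the last checkpoint.
public sealed class LeaseModel
{
    // Null means no checkpoint has been written for this lease yet.
    public int? Continuation { get; set; }
}

public static class LifeCycleModel
{
    // Runs one iteration of the loop. Returns true only if a batch was
    // delivered to the handler and the checkpoint was advanced.
    public static bool PollOnce(
        IReadOnlyList<string> feed,
        LeaseModel lease,
        Action<IReadOnlyList<string>> handler)
    {
        // Step 1: read the feed from the last checkpoint.
        // Assumption of this model: start at 0 when there is no continuation.
        int start = lease.Continuation ?? 0;
        var batch = new List<string>();
        for (int i = start; i < feed.Count; i++)
        {
            batch.Add(feed[i]);
        }

        // Step 2: no changes, so a real host would sleep for the poll interval.
        if (batch.Count == 0)
        {
            return false;
        }

        try
        {
            // Step 3: send the changes to the delegate.
            handler(batch);
        }
        catch (Exception)
        {
            // The handler threw: the lease is left untouched, so the next
            // iteration reads the same batch again.
            return false;
        }

        // Step 4: the handler succeeded, so record the new position.
        lease.Continuation = feed.Count;
        return true;
    }
}
```

With a handler that throws only on its first call, as in the question, the first `PollOnce` leaves `Continuation` unset and the second one receives the same batch and then checkpoints.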

Adding from comments: The only scenario where the batch might not be retried is if the batch that throws is the first ever (the lease has no Continuation), because when the host picks up the lease again to reprocess, it has no point in time to retry from. Based on the official documentation, one lease is owned by a single instance at any given time, so no other instance could have picked up the same lease and been processing it in parallel (within the same Deployment Unit context).
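
A comment from the answer's author on this post notes that the first-batch case can potentially recover when the processor is started from the beginning or from a specific time, because it falls back to that time to read the first batch again. A minimal sketch of that configuration with the .NET v3 SDK follows. The instance name `host-1`, the `ProcessorSetup` class, and the two container parameters are placeholders, not part of the original question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class ProcessorSetup
{
    // `monitored` and `leases` are assumed to be existing containers.
    public static ChangeFeedProcessor Build(Container monitored, Container leases)
    {
        return monitored
            .GetChangeFeedProcessorBuilder<object>("migrationProcessor", HandleChangesAsync)
            .WithInstanceName("host-1")   // placeholder instance name
            .WithLeaseContainer(leases)
            // Read from the beginning of the container's change feed, so a
            // lease without a continuation still has a defined starting point.
            .WithStartTime(DateTime.MinValue.ToUniversalTime())
            .Build();
    }

    private static Task HandleChangesAsync(
        IReadOnlyCollection<object> changes,
        CancellationToken cancellationToken)
    {
        Console.WriteLine($"{changes.Count} changes received");
        return Task.CompletedTask;
    }
}
```

The start time only applies when the lease store has no state for the processor yet. Once a checkpoint exists, the processor resumes from it.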

Matias Quaranta
  • Thank you for the fast reply! I have updated the description with the delegate. In this particular case, the exception is thrown on the first try and the change is not re-processed; not sure why. – SummerCode Oct 27 '20 at 16:16
  • Never mind, found your reply here github.com/Azure/azure-cosmos-dotnet-v3/issues/405 to a similar issue. This is the case I was hitting: "The only scenario where the batch might not be retried is if the batch that throws is the first ever (lease has no Continuation). Because when the host picks up the lease again to reprocess, it has no point in time to retry from." I am assuming the change should be processed by another processor if more were up at the time the first one failed or its node went down. – SummerCode Oct 27 '20 at 18:25
  • Awesome! Thanks for confirming, Matias! As for the last statement, can you please confirm or refute it? "I am assuming the change should be processed in another processor if more were up at the time the 1st one failed/its node went down." (Any supporting documentation or code link would be welcome too.) – SummerCode Oct 27 '20 at 23:18
  • I added a note to clarify. Sadly, within the same deployment unit, since a lease is owned by a single instance at any given time, when that instance threw the unhandled exception, there was no other instance processing the same lease that could have re-processed the same batch. Potentially the only scenario where it would recover is if you are starting the processor from the Beginning or from some specific time. Because it falls back to that time to read the first batch again. – Matias Quaranta Oct 28 '20 at 12:31