I'm processing relatively large images (Sentinel-2, https://registry.opendata.aws/sentinel-2/) using AWS Lambda.
To process these images, I split each one into ~1500 smaller images ("chips") that can be processed independently; the number of chips varies unpredictably depending on the content of the source image. Chips are processed in parallel using multiple invocations of a Lambda that takes in a "page" of a couple of hundred chips.
Here's where I'm stuck: when all pages have been processed, I need to combine the results into a single output image, but how can I know when all pages (the "variable batch of invocations") are complete?
I've considered, for example, writing progress information to S3 or DynamoDB and invoking the combining function after every page, so that only the last invocation of that function goes ahead (when a progress check reports that everything is complete). I've also seen options like futures/promises, but processing a page of chips takes on the order of 10-15 minutes, so I don't want to keep a "controller" function waiting for the futures/promises to complete; at that point it's cheaper to go with multiple invocations.
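For concreteness, the progress-check idea could be sketched roughly as below. This is only an illustration under assumptions: the item key `image_id`, the attribute `remaining_pages`, and the helper name `page_finished` are hypothetical, and the counter is assumed to have been set to the number of pages by the splitting step before any page Lambda runs. The `table` argument is expected to behave like a boto3 DynamoDB `Table` resource; `FakeTable` is just an in-memory stand-in so the logic can be tried locally.

```python
from decimal import Decimal


def page_finished(table, image_id):
    """Atomically decrement the remaining-pages counter for one image.

    Returns True only for the call that brings the counter to zero,
    i.e. the invocation that should go on to trigger the combine step.
    """
    response = table.update_item(
        Key={"image_id": image_id},                # hypothetical key name
        UpdateExpression="ADD remaining_pages :dec",  # hypothetical attribute
        ExpressionAttributeValues={":dec": -1},
        ReturnValues="UPDATED_NEW",                # return the value after the update
    )
    return response["Attributes"]["remaining_pages"] == 0


class FakeTable:
    """In-memory stand-in for a DynamoDB table (local experimentation only)."""

    def __init__(self, remaining_pages):
        self.remaining_pages = Decimal(remaining_pages)

    def update_item(self, Key, UpdateExpression, ExpressionAttributeValues, ReturnValues):
        # Mimics only the single ADD update used by page_finished.
        self.remaining_pages += ExpressionAttributeValues[":dec"]
        return {"Attributes": {"remaining_pages": self.remaining_pages}}


if __name__ == "__main__":
    table = FakeTable(remaining_pages=3)
    for page in range(3):
        print(page, page_finished(table, "example-image"))
```

Because the decrement and the read happen in one update call, only one page invocation should ever see zero, which avoids a separate "check progress" read that could race with other pages.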
Is there a better solution than writing out progress information and checking it multiple times?
(NB I've seen this question: Fork and Join with Amazon Lambda)