
I think this is more of an 'architecture design' question.

I have a Lambda producer that puts ~600 messages on an SQS queue as individual messages (so not 1 message with a body of ~600 messages), and there are multiple producers. A consumer Lambda takes individual messages and deals with them (at scale). What I want to do is run another Lambda when each batch is complete.

My initial idea was to create a 'unique batch number', a 'total batch size' and a 'batch position number', add them as message attributes on every message, and then in the consumer Lambda check these to decide whether the batch is complete.

But does that mean I would need to use a FIFO queue, partition on the batch number, and have only one Lambda consumer per batch? Or do I run some sort of state management in DynamoDB (is there a pattern out there for this? Please guide me on this).
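The attribute scheme described above can be sketched as plain message-entry construction. This is a hedged sketch: the attribute names (`BatchId`, `BatchTotal`, `BatchPosition`) and the helper are illustrative, and the actual `send_message_batch` calls via boto3 are omitted so the example stays self-contained.

```python
import uuid

def build_batch_entries(bodies):
    """Build SQS SendMessageBatch-style entries that carry batch metadata
    as message attributes (attribute names are illustrative)."""
    batch_id = str(uuid.uuid4())   # the 'unique batch number'
    total = len(bodies)            # the 'total batch size'
    entries = []
    for position, body in enumerate(bodies):
        entries.append({
            "Id": str(position),
            "MessageBody": body,
            "MessageAttributes": {
                "BatchId": {"DataType": "String", "StringValue": batch_id},
                "BatchTotal": {"DataType": "Number", "StringValue": str(total)},
                "BatchPosition": {"DataType": "Number", "StringValue": str(position)},
            },
        })
    return entries
```

Note that `SendMessageBatch` accepts at most 10 entries per call, so ~600 messages would be sent as chunks of 10 entries each.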

Regards, J

J IOI
  • If you have multiple (concurrent?) producers, how will you know when you have sent the last message of the batch? – jarmod Nov 06 '20 at 13:58

1 Answer


It seems like the goal is to achieve fork-join capabilities in a distributed system. One way to handle this in AWS is with Step Functions. If a queue service must be used, the state of the overall operation will need to be tracked. Some ways to do this are:

  1. Store state of the overall operation in a DB.
  2. Put a 'termination' message in the queue after all the others and process the queue FIFO.
  3. Create a metadata service which receives 'start' and 'stop' messages for each service and handles them accordingly.
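Option 1 is commonly done with a DynamoDB atomic counter: the producer writes an item with `remaining` set to the batch size, each consumer decrements it, and whichever decrement reaches zero triggers the completion Lambda. The sketch below is an in-memory stand-in for that logic (the real version would use a boto3 `UpdateItem` with an `ADD remaining -1` update and `ReturnValues="UPDATED_NEW"`; the class and method names are illustrative):

```python
import threading

class BatchTracker:
    """In-memory stand-in for a DynamoDB table with one counter item per
    batch; the lock plays the role of DynamoDB's atomic ADD update."""

    def __init__(self):
        self._lock = threading.Lock()
        self._remaining = {}

    def start_batch(self, batch_id, total):
        # Producer side: PutItem {batch_id, remaining: total}.
        with self._lock:
            self._remaining[batch_id] = total

    def complete_one(self, batch_id):
        # Consumer side: atomically decrement and read the new value.
        # Returns True exactly once: for the decrement that reaches zero.
        with self._lock:
            self._remaining[batch_id] -= 1
            return self._remaining[batch_id] == 0
```

Exactly one consumer sees `True`, so that invocation can fire the batch-complete Lambda. One caveat: SQS standard queues are at-least-once, so a duplicated message would decrement twice; in production the consumer would need to deduplicate (e.g. by recording seen batch positions) before decrementing.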

Reference: Fork and Join with Amazon Lambda
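For the Step Functions route, a `Map` state gives you the join for free: the state machine fans out over the batch items and only moves to the next state after every iteration has finished. A minimal Amazon States Language sketch (state names, the `$.messages` input path, and the function ARNs are illustrative placeholders):

```json
{
  "StartAt": "ProcessBatch",
  "States": {
    "ProcessBatch": {
      "Type": "Map",
      "ItemsPath": "$.messages",
      "MaxConcurrency": 50,
      "Iterator": {
        "StartAt": "HandleMessage",
        "States": {
          "HandleMessage": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:region:account:function:consumer",
            "End": true
          }
        }
      },
      "Next": "BatchComplete"
    },
    "BatchComplete": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:region:account:function:on-batch-complete",
      "End": true
    }
  }
}
```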

mike1234569
  • Thanks for pointing me in the right direction. I will look into using Step Functions with dynamic parallelism using a Map state. – J IOI Nov 08 '20 at 08:54