I am using a Lambda function to read data from DynamoDB Streams. Lambda reads items from the stream and invokes the function once for each batch. Lambda invokes the function synchronously through an event source mapping.

From what I understand from the AWS docs, Lambda invokes the function for each batch in the stream. Suppose 1000 items arrive in the stream at once and I configure my Lambda function to read 100 items per batch.

So will it invoke 10 instances of the Lambda function concurrently, each processing a batch of 100 items?

I am learning AWS. Is my understanding correct? If yes, what does "synchronously invoked" mean?

1 Answer

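DynamoDB uses shards* to partition the data inside a table. Which shard an item lands in is determined by the table's hash key. DynamoDB Streams triggers AWS Lambda for each shard that was updated, and the records of that shard are aggregated into a batch. So the number of records in each batch depends on how many records were updated in each shard, up to the batch size you configured, and batches can differ in size. In your example, the 1000 items are therefore not necessarily processed as 10 concurrent batches of 100: by default, concurrency follows the number of shards involved, and batches from the same shard are processed one after another.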
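To make the batching arithmetic concrete, here is a rough sketch in plain Python. It is a simplified model, not how the Lambda service is implemented: the shard names, the record counts, and the `plan_batches` helper are all made up for illustration, and it assumes one in-flight batch per shard.

```python
from typing import Dict, List


def plan_batches(records_per_shard: Dict[str, int], batch_size: int) -> Dict[str, List[int]]:
    """Split each shard's pending records into sequential batches.

    Hypothetical helper: models only the batch size limit and ignores
    other limits such as batching windows or payload size.
    """
    plan: Dict[str, List[int]] = {}
    for shard_id, count in records_per_shard.items():
        full, rest = divmod(count, batch_size)
        plan[shard_id] = [batch_size] * full + ([rest] if rest else [])
    return plan


# Assumption: the 1000 pending records are spread unevenly over 2 shards.
plan = plan_batches({"shard-A": 700, "shard-B": 300}, batch_size=100)

total_invocations = sum(len(batches) for batches in plan.values())
max_concurrent = len(plan)  # one in-flight batch per shard in this model

print(total_invocations, max_concurrent)
```

Under these assumptions there are still 10 invocations in total, but at most 2 run at the same time, because the batches of one shard are handled in sequence.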

Synchronously invoked means that the caller waits until the function finishes before it continues its own work. When you invoke asynchronously, you send a request and forget about it: whether the downstream function processes the event successfully is outside the control of the upstream service. For DynamoDB Streams, the Lambda function is invoked synchronously and the caller waits while it works. If the function ends successfully, the batch is marked as processed. If it ends with a failure, the same batch is retried (by default until it succeeds or the records expire, unless you limit the number of retries). This is important to allow at-least-once processing of DynamoDB streams.
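The wait-then-retry behaviour can be sketched as a small simulation in plain Python. This is only an illustration of the idea, not the actual Lambda poller: `process_shard`, `flaky_handler`, and the `max_attempts` parameter are hypothetical names chosen for this example.

```python
from typing import Callable, List


def process_shard(batches: List[list], handler: Callable[[list], None], max_attempts: int = 3) -> int:
    """Call handler synchronously for each batch; move on only after success.

    Returns the number of batches that were processed successfully.
    """
    checkpoint = 0
    while checkpoint < len(batches):
        for attempt in range(1, max_attempts + 1):
            try:
                handler(batches[checkpoint])  # blocks until the handler returns
                break
            except Exception:
                if attempt == max_attempts:
                    return checkpoint  # give up; this batch stays unprocessed
        checkpoint += 1
    return checkpoint


seen = []
failures = {"remaining": 1}


def flaky_handler(batch: list) -> None:
    """Stand-in for a Lambda handler that fails once on the batch containing "b"."""
    seen.append(list(batch))
    if "b" in batch and failures["remaining"] > 0:
        failures["remaining"] -= 1
        raise RuntimeError("simulated failure")


done = process_shard([["a"], ["b"], ["c"]], flaky_handler)
print(done, seen)
```

The handler sees the failing batch twice, which is what "at least once" means in practice: the function should be written so that processing the same records again does no harm.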

*Shards are different partitions of the database. They allow DynamoDB to process queries and updates in parallel. Normally they reside on different storage nodes/availability zones.

Gustavo Tavares