9

I have a DynamoDB Stream that triggers a Lambda function. I'm noticing that bursts of a thousand writes to the DynamoDB table can take many minutes (the longest I have seen is 30 minutes) to all be processed by Lambda. The average duration of each Lambda invocation with a batch size of 3 is around 2 seconds. These Lambdas perform I/O-heavy tasks, so a small batch size and a higher number of parallel invocations is advantageous. However, the parallelism of these Lambdas is pegged to the number of DynamoDB Stream shards, and I cannot find a way to scale the number of shards.

Is there any way to increase the throughput of these Lambdas beyond using a bigger batch size and more optimized code?
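For reference, the trigger is configured roughly like the following boto3 sketch (the stream ARN and function name are placeholders); note there is no parameter for the number of shards:

```python
import boto3

lambda_client = boto3.client("lambda")

# Placeholder ARNs/names for illustration only.
STREAM_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/MyTable/stream/2018-01-01T00:00:00.000"
FUNCTION_NAME = "process-stream-records"

# Event source mapping as described above: small batch size, one concurrent
# invocation per stream shard. The shard count itself is not configurable here.
response = lambda_client.create_event_source_mapping(
    EventSourceArn=STREAM_ARN,
    FunctionName=FUNCTION_NAME,
    BatchSize=3,
    StartingPosition="TRIM_HORIZON",
)
print(response["UUID"])
```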

user1569339

3 Answers

6

I don't see many configuration options either.

You could decouple your processing. If your change records aren't too large, your incoming Lambda could split them up into several smaller SNS messages, each of which could then trigger a Lambda that does the actual processing. If the changes are larger, you could push them to SQS or S3 instead and trigger the processing Lambda via SNS, or directly on new files.
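For example, a minimal sketch of the incoming (fan-out) Lambda, assuming an existing SNS topic (the topic ARN below is a placeholder):

```python
import json
import boto3

sns = boto3.client("sns")

# Placeholder topic ARN; the real topic would be created separately and
# subscribed to by the processing Lambda.
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:stream-record-fanout"

def handler(event, context):
    # Fan out: publish each change record as its own SNS message so the
    # downstream Lambda is invoked once per record rather than once per
    # stream batch.
    records = event.get("Records", [])
    for record in records:
        sns.publish(
            TopicArn=TOPIC_ARN,
            Message=json.dumps(record["dynamodb"], default=str),
        )
    return {"published": len(records)}
```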

Udo Held
4

Each stream shard is associated with a partition in DynamoDB. If you increase the throughput on your table so much that your partitions split, you will get more shards. With more shards, the number of Lambda functions that run in parallel will increase.
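For example, a rough boto3 sketch of raising provisioned throughput (table name and capacity values are placeholders, and the numbers needed to force a partition split depend on your table):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Raising provisioned throughput high enough can cause DynamoDB to split
# partitions; each partition has a corresponding stream shard, and each
# shard gets its own concurrent Lambda invocation.
dynamodb.update_table(
    TableName="MyTable",
    ProvisionedThroughput={
        "ReadCapacityUnits": 1000,
        "WriteCapacityUnits": 2000,
    },
)
```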

Alexander Patrikalakis
  • Can you add a link to a doc that contains this info: "Each stream shard is associated with a partition in DynamoDB"? I am struggling to find it. – Tofig Hasanov Jan 09 '18 at 14:53
  • [This](https://docs.aws.amazon.com/streams/latest/dev/key-concepts.html) is the Kinesis stream but I am not sure if they are the same or similar. Someone said lambda will use the Kinesis stream client on your behalf. – HenryLok Mar 02 '18 at 10:49
  • I am late to the party but I found the documentation section saying that: https://aws.amazon.com/pt/blogs/database/dynamodb-streams-use-cases-and-design-patterns/ _How it works - For every DynamoDB partition, there is a corresponding shard and a Lambda function poll for events in the stream (shard)._ – Bruno Jan 26 '20 at 17:55
0

For the DynamoDB trigger, you could try increasing the batch size so that each invocation processes as many records as possible, reducing the per-invocation overhead.
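For example, something like this (the mapping UUID is a placeholder; an existing trigger's UUID can be found with `list_event_source_mappings`):

```python
import boto3

lambda_client = boto3.client("lambda")

# Placeholder UUID of the existing DynamoDB-stream event source mapping.
MAPPING_UUID = "00000000-0000-0000-0000-000000000000"

# Raise the batch size so each invocation handles more records per poll.
lambda_client.update_event_source_mapping(
    UUID=MAPPING_UUID,
    BatchSize=100,
)
```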
