I'm interested in seeing whether I can invoke an AWS Lambda when one of my DynamoDB tables grows to a certain size. Nothing in the DynamoDB Events/Triggers docs nor the Lambda Developer Guide suggests this is possible, but I find that hard to believe. Anyone ever deal with anything like this before?

hotmeatballsoup

1 Answer

You will have to do it manually.

I see two out-of-the-box ways to achieve this, though:

1) You can create a CloudWatch Event that runs every X minutes (replace X with whatever your business case requires) to trigger your Lambda function. The function then calls the describeTable API and compares the reported table size (for example `TableSizeBytes` or `ItemCount`) against your limit. Once the limit has been reached, you can disable the event, since you have already been notified. This is the easiest and most cost-effective option, because most of the time your table's size will be below your predefined limit.

2) You could also use DynamoDB Streams and invoke the describeTable API, but then your function would be triggered on every new event in your table. This is not cost-effective and, in my opinion, overkill.
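
A minimal sketch of option 1 in Python with boto3, assuming the scheduled rule invokes `handler`. The environment variable names (`TABLE_NAME`, `SIZE_LIMIT_BYTES`, `RULE_NAME`) and the helper names are made up for illustration and are not part of any AWS API. Note that the size reported by describeTable is only refreshed about every six hours, so the check is approximate.

```python
import os


def table_size_bytes(dynamodb_client, table_name):
    """Return the table size reported by DescribeTable (not a real-time value)."""
    response = dynamodb_client.describe_table(TableName=table_name)
    return response["Table"]["TableSizeBytes"]


def exceeds_limit(dynamodb_client, table_name, limit_bytes):
    """True once the reported table size has reached the limit."""
    return table_size_bytes(dynamodb_client, table_name) >= limit_bytes


def handler(event, context):
    # Imported here so the helpers above can be exercised without boto3 installed.
    import boto3

    # Hypothetical configuration, supplied through Lambda environment variables.
    table_name = os.environ["TABLE_NAME"]
    limit_bytes = int(os.environ["SIZE_LIMIT_BYTES"])
    rule_name = os.environ["RULE_NAME"]

    if exceeds_limit(boto3.client("dynamodb"), table_name, limit_bytes):
        # React to the threshold here (send a notification, archive data, ...),
        # then disable the schedule so the function stops being invoked.
        boto3.client("events").disable_rule(Name=rule_name)
        return {"limit_reached": True}
    return {"limit_reached": False}
```

Passing the client into the helpers keeps the size check easy to test with a stub, and the function's role would need permission for `dynamodb:DescribeTable` and `events:DisableRule`.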

Thales Minussi
  • This seems like a good approach but I wouldn't use describeTable, because it returns [ItemCount](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_TableDescription.html#DDB-Type-TableDescription-ItemCount), which is only updated once every 6 hours. The only way to get an updated count would be to use [Query Select Count](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Query.html#DDB-Query-request-Select), as noted in [this answer](https://stackoverflow.com/a/27327486/7207514), or [scan, as noted in this answer](https://stackoverflow.com/a/38874596/7207514) – Deiv Feb 25 '19 at 15:18
  • Also keep in mind both of these operations (query and scan) will have to go through your whole table, which can be costly – Deiv Feb 25 '19 at 15:21
  • Deiv, thanks for your input. I did not mention scan or query precisely because of the cost / performance issues associated with them. – Thales Minussi Feb 25 '19 at 15:26
  • But yes, you're right, if the OP needs REAL TIME data, I guess there's no other way than using scan. – Thales Minussi Feb 25 '19 at 15:26
  • ...which, to me, casts something of an antipattern shadow over the entire question. There is a *reason* why values like `ItemCount` and `TableSizeBytes` are only updated sporadically -- DynamoDB is a distributed system, with each individual table internally partitioned and replicated across different hardware, and as such DynamoDB almost certainly has no internal real-time "sense" of table size. Maintaining such information in a massive distributed system represents a significant amount of unnecessary overhead. – Michael - sqlbot Feb 25 '19 at 16:36
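
For reference, the scan-based count discussed in the comments could look roughly like the sketch below, assuming a boto3 DynamoDB client is passed in. The function name `count_items` is made up for illustration. It pages through the whole table, so it consumes read capacity in proportion to the table's size, which is the cost the comments warn about.

```python
def count_items(dynamodb_client, table_name):
    """Count items with Scan and Select=COUNT, following pagination."""
    total = 0
    kwargs = {"TableName": table_name, "Select": "COUNT"}
    while True:
        page = dynamodb_client.scan(**kwargs)
        total += page["Count"]
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            # No LastEvaluatedKey means the scan reached the end of the table.
            return total
        # Resume the scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

With real credentials this would be called as `count_items(boto3.client("dynamodb"), "my-table")`, where the table name is a placeholder.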