
Our frontend application sends user actions to a Lambda function behind an API Gateway, which then stores these actions in DynamoDB.

We then use DynamoDB Streams to trigger a separate Lambda function that parses these actions and decides whether they should result in any notifications being sent (we call these notification events).

For example, when a user posts a comment in our app, we store a "CREATED_COMMENT" action in DynamoDB, which triggers a new Lambda via a DynamoDB stream. That Lambda may then create an "email notification event", which we send to an email provider such as customer.io.
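Conceptually, our stream handler looks something like this (simplified; the attribute names are illustrative):

    def handler(event, context):
        # Triggered by the DynamoDB stream; each record describes a change
        # to the actions table.
        for record in event["Records"]:
            if record["eventName"] != "INSERT":
                continue
            image = record["dynamodb"]["NewImage"]
            action = image["actionType"]["S"]  # e.g. "CREATED_COMMENT"
            email = image["email"]["S"]
            if action == "CREATED_COMMENT":
                send_email_notification_event(email, action)

    def send_email_notification_event(email, action):
        print(f"notify {email}: {action}")  # stand-in for the customer.io call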

However, our users have told us that they receive emails too frequently, so we'd like to start sending email digests that aggregate multiple actions over time into a single email, rather than sending an email for each action.

Our idea was to forward the DynamoDB stream actions to something like AWS EventBridge, Kinesis, Step Functions, or even another DynamoDB stream, and then configure the new stream to group events by email address and debounce them by e.g. 10 minutes. If the user performs a new action, that user's stream would keep gathering actions for another 10 minutes, until there have been no new actions from that user for 10 minutes. Once that happens, the stream would "release" all gathered actions and invoke a Lambda function, which would generate the email notification event and send it to e.g. customer.io.
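In plain Python, the per-address behaviour we're after looks roughly like the sketch below; the problem is that this in-process state and timer are exactly what we can't keep across Lambda invocations:

    import threading
    from collections import defaultdict

    DEBOUNCE_WINDOW = 10 * 60  # seconds

    buffer = defaultdict(list)
    timers = {}

    def on_action(email, action):
        # Every new action resets that user's timer.
        buffer[email].append(action)
        if email in timers:
            timers[email].cancel()
        timers[email] = threading.Timer(DEBOUNCE_WINDOW, flush, args=[email])
        timers[email].start()

    def flush(email):
        # No activity for DEBOUNCE_WINDOW seconds: release the gathered actions.
        actions = buffer.pop(email, [])
        timers.pop(email, None)
        print(f"digest for {email}: {actions}")  # would create the notification event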

However, we've been unable to find such grouping-and-debounced-flushing configuration in any of the aforementioned AWS streaming services. For something as common as digesting (or rolling up) events, shouldn't there be a serverless approach that doesn't require writing our own queueing service?

Tom
  • There's no native AWS feature that supports debounce like this that I know of. – jarmod Jun 24 '20 at 00:30
  • What if we said it'd be ok not to debounce but to simply "gather" or "cache" and then "release" the events for each email/partition every 10 mins or so? – Tom Jun 24 '20 at 01:36

3 Answers


The answer to me seems to be a tool such as SQS. SQS allows you to accumulate messages in a queue, and every x minutes you can read the queue with a Lambda function triggered by a scheduled event. You don't need the Lambda to be triggered by SQS itself; you can instead read the queue "manually" from within the Lambda.
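A rough sketch of that scheduled consumer (the queue URL and message fields are placeholders):

    import json
    from collections import defaultdict

    import boto3

    sqs = boto3.client("sqs")
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/user-actions"  # placeholder

    def handler(event, context):
        # Runs on a schedule (e.g. every 10 minutes), drains the queue,
        # and groups the accumulated actions by email address.
        by_email = defaultdict(list)
        while True:
            resp = sqs.receive_message(
                QueueUrl=QUEUE_URL,
                MaxNumberOfMessages=10,  # the SQS maximum per call
                WaitTimeSeconds=1,
            )
            messages = resp.get("Messages", [])
            if not messages:
                break  # queue drained
            for msg in messages:
                body = json.loads(msg["Body"])
                by_email[body["email"]].append(body["action"])
            sqs.delete_message_batch(
                QueueUrl=QUEUE_URL,
                Entries=[
                    {"Id": m["MessageId"], "ReceiptHandle": m["ReceiptHandle"]}
                    for m in messages
                ],
            )
        for email, actions in by_email.items():
            send_digest(email, actions)

    def send_digest(email, actions):
        print(f"digest for {email}: {actions}")  # stand-in for the customer.io call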

Gareth McCumskey
  • But how would we only "pull" the SQS events with a given key or email address? We wouldn't want to pull all events, right? Or would you then put them back in the queue? We'd want one Lambda to be called for each key, rather than one Lambda doing the work of all keys/emails. – Tom Jun 24 '20 at 19:24
  • It actually then sounds like Kinesis would work better. When adding items to a Kinesis stream, you can use a partition key to split items into separate shards. You can then also configure a batch window with a specific number of items and/or a time frame to trigger the Lambda, so you are always guaranteed a certain number of items or a specific time frame (see the sketch below). – Gareth McCumskey Jun 25 '20 at 13:37
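For reference, producing into a Kinesis stream keyed by the email address might look roughly like this (stream name and fields are illustrative); the batch size and batch window themselves are configured on the Lambda's event source mapping (MaximumBatchingWindowInSeconds, up to 300 seconds):

    import json
    import boto3

    kinesis = boto3.client("kinesis")

    def publish_action(email, action):
        # Using the email address as the partition key keeps each user's
        # actions together on the same shard, in order.
        kinesis.put_record(
            StreamName="user-actions",  # placeholder stream name
            PartitionKey=email,
            Data=json.dumps({"email": email, "action": action}),
        )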

Gareth McCumskey is on the right track.

Use a normal SQS queue strictly for debouncing.

Set a batch window, e.g. 5 seconds, and use a really large batch size when you read from the queue.

In code, use a hash map to group messages with the same messageId together, then use the deduplicated messageIds to do your work.
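A minimal sketch of that handler, assuming each message body carries an email and an action field:

    import json
    from collections import defaultdict

    def handler(event, context):
        # Lambda receives a large SQS batch (thanks to the batch window).
        # Dedupe redelivered messages by messageId, then group by email.
        seen = set()
        by_email = defaultdict(list)
        for record in event["Records"]:
            if record["messageId"] in seen:
                continue
            seen.add(record["messageId"])
            body = json.loads(record["body"])
            by_email[body["email"]].append(body["action"])
        for email, actions in by_email.items():
            send_digest(email, actions)

    def send_digest(email, actions):
        print(f"digest for {email}: {actions}")  # stand-in for the real work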

X Huang

I wrote a blog post on something just like this. The short version is that it uses a scheduled Lambda function to identify the records that need to be processed.

The problem with using the delay in SQS is that you can only receive 10 messages per call, so to get all the messages you'd have to call SQS repeatedly until the queue is empty, and only then aggregate them. This doesn't scale very well, as every message has to be read for it to work. With DynamoDB you can instead have just one record that represents the whole collection of records, and query that single record, which can then produce one queue message for that specific group. Consider the following data:

user   | comment   | time
user 1 | comment 1 | 11:43am
user 1 | comment 2 | 11:50am
user 2 | comment 1 | 11:51am

You can add another record that signals the need to send a message for each user (in this example, 15 minutes after the user's first message).

user   | scheduled
user 1 | 11:58 
user 2 | 12:06

When you insert this second set of records you are recording the time at which you want to send the batch. You only do the insert if a record doesn't already exist, so you don't keep pushing the send time further out. Your scheduled process reads these records to know which users it needs to send messages to, then collects all the data for each user. Sending the messages can be done in parallel (you could send a message to SQS for each user, or use a Map state in a Step Function, for example).
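A sketch of the conditional insert and the lookup the scheduled process might do (table and attribute names are illustrative):

    import time
    import boto3
    from boto3.dynamodb.conditions import Attr

    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("notification-schedule")  # placeholder table name

    def schedule_digest(user_id, delay_seconds=15 * 60):
        # Insert the "send at" record only if none exists yet, so repeated
        # actions don't keep pushing the digest further into the future.
        try:
            table.put_item(
                Item={
                    "pk": f"SCHEDULE#{user_id}",
                    "sendAt": int(time.time()) + delay_seconds,
                },
                ConditionExpression="attribute_not_exists(pk)",
            )
        except dynamodb.meta.client.exceptions.ConditionalCheckFailedException:
            pass  # a digest is already scheduled for this user

    def due_schedules():
        # The scheduled Lambda uses this to find users whose digest is due.
        # (A scan is fine for a sketch; a GSI on sendAt scales better.)
        resp = table.scan(FilterExpression=Attr("sendAt").lte(int(time.time())))
        return resp["Items"]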

Jason Wadsworth