
I have attached a CloudWatch Logs trigger to my Lambda (the Lambda has concurrency=1). The Lambda runs an Athena query, which costs us money.

The problem is that if 10 CloudWatch log files are dumped within 2 seconds, the Lambda is invoked 10 times. This is costly for me because each invocation runs a costly Athena query.

What I want is to trigger the Lambda once every 5 minutes (like a DynamoDB trigger, exactly as @MrOverflow said in the comments), but only if a CloudWatch log was generated in the last 5 minutes. How do I do this (preferably without writing code)?

Edit 1

I can't use a fixed 5-minute trigger, as this would run the Athena query even when there is no activity.

Edit 2

These are the solutions I think will work, but I am not sure how to implement them.

  1. Have a trigger on CloudWatch > Timestamp column with an expression like: Timestamp in seconds % (5*60) == 0. The advantage here is that all night and on holidays, when there is no traffic, my Athena query will not run. Also, my current Lambda will get triggered every 5 minutes.

    However, the downside of this solution is that if the upstream does not generate a log exactly on the 5-minute second, the Lambda is not triggered. Also, if there are 10 logs in the same second, the Lambda gets triggered 10 times.

  2. The other approach I have in mind is to trigger the Lambda every 5 minutes from CloudWatch, but maintain state in DynamoDB. If the Lambda was not triggered in the last 5 minutes, ignore the call from CloudWatch.

    This involves coding, which I hate to do.

halfer
chendu
  • I'm confused. Can you specify: do you want your lambda to be triggered by your cloudwatch trigger like it is now, or do you want it to trigger based on your cloudwatch logs? – MyStackRunnethOver Jan 21 '22 at 20:56
  • Triggering a Lambda from a CloudWatch event works as a stream of data; there is no provision to control the number of events, the stream data, or the duration like there is for DynamoDB streams. I have another thought: instead of triggering the Lambda from the CloudWatch event, why don't you trigger your Lambda at a 5-minute frequency and read your log events? – MrOverflow Jan 21 '22 at 21:15

1 Answer


Another option: Scheduled Lambda with a CloudWatch FilterLogEvents API call:

  1. A Scheduled Event triggers a Lambda every 5 minutes, a la @MrOverflow.
  2. The Lambda calls FilterLogEvents, setting the startTime param to 5 minutes ago (as epoch milliseconds), limit to 1, and optionally setting a filter pattern.
  3. If the response events array is not empty, at least 1 file was received in the past 5 minutes. Run the Athena query.
  4. If the response events array is empty, exit the lambda.*

The Athena job will run 0 or 1 times every 5 minutes.
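
The steps above could be sketched roughly as follows. This is a minimal sketch, not a tested deployment: it assumes a Python Lambda with boto3 available, and the environment variable names (`LOG_GROUP_NAME`, `ATHENA_QUERY`, `ATHENA_DATABASE`, `ATHENA_OUTPUT_S3`) are hypothetical placeholders. Because FilterLogEvents can return an empty page together with a `nextToken`, the helper keeps paging until it finds an event or runs out of tokens.

```python
import os
import time

LOOKBACK_MS = 5 * 60 * 1000  # assumed to match the 5-minute schedule


def has_recent_log_events(logs_client, log_group, now_ms,
                          lookback_ms=LOOKBACK_MS, filter_pattern=None):
    """Return True if at least one log event falls inside the lookback window."""
    kwargs = {
        "logGroupName": log_group,
        "startTime": now_ms - lookback_ms,  # epoch milliseconds
        "endTime": now_ms,
        "limit": 1,  # one event is enough to decide
    }
    if filter_pattern:
        kwargs["filterPattern"] = filter_pattern
    while True:
        response = logs_client.filter_log_events(**kwargs)
        if response.get("events"):
            return True
        token = response.get("nextToken")
        if not token:
            return False
        # An empty page with a token means there is more to search.
        kwargs["nextToken"] = token


def handler(event, context):
    # Imported here so the helper above can be exercised without AWS access.
    import boto3

    now_ms = int(time.time() * 1000)
    # All environment variable names below are hypothetical.
    log_group = os.environ["LOG_GROUP_NAME"]
    if not has_recent_log_events(boto3.client("logs"), log_group, now_ms):
        return {"ran_query": False}  # nothing logged: skip Athena

    boto3.client("athena").start_query_execution(
        QueryString=os.environ["ATHENA_QUERY"],
        QueryExecutionContext={"Database": os.environ["ATHENA_DATABASE"]},
        ResultConfiguration={"OutputLocation": os.environ["ATHENA_OUTPUT_S3"]},
    )
    return {"ran_query": True}
```

The Lambda's role would need permission for `logs:FilterLogEvents` on the log group, plus whatever Athena and S3 permissions the existing query already uses.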


* You can imagine edge cases where latencies in event triggering and logging might cause this approach to miss an eligible log event. If false negatives are a concern, consider having the lambda trigger the Athena run periodically during peak periods even if the events array is empty.
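
One way to express that fallback is a small decision function, sketched below. The peak window (`PEAK_HOURS_UTC`) and the forced-run interval (`FORCED_RUN_EVERY_MIN`) are made-up values for illustration, and the sketch assumes the schedule fires every 5 minutes.

```python
from datetime import datetime, timezone

PEAK_HOURS_UTC = range(8, 18)   # hypothetical peak window, 08:00-17:59 UTC
FORCED_RUN_EVERY_MIN = 30       # hypothetical: force a run about every 30 minutes
SCHEDULE_MIN = 5                # assumed schedule interval


def should_run_query(events_found, now,
                     peak_hours=PEAK_HOURS_UTC,
                     forced_every_min=FORCED_RUN_EVERY_MIN):
    """Run when logs were seen, or periodically during peak hours as a safety net."""
    if events_found:
        return True
    in_peak = now.hour in peak_hours
    # With a 5-minute schedule, one invocation per interval lands in this slot.
    in_forced_slot = now.minute % forced_every_min < SCHEDULE_MIN
    return in_peak and in_forced_slot


# Example: no events seen, but it is 09:02 UTC, so the safety-net run fires.
print(should_run_query(False, datetime(2022, 1, 21, 9, 2, tzinfo=timezone.utc)))
```

Off-peak, the function falls back to the plain rule: run only if the events array was non-empty.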

fedonev