I have attached a CloudWatch Logs trigger to my Lambda (the Lambda has concurrency=1). The Lambda runs an Athena query, which costs us money.
The problem is that if 10 CloudWatch log files are dumped within 2 seconds, the Lambda is invoked 10 times. This is costly for me because every invocation runs the Athena query.
What I want is to trigger the Lambda at most once every 5 minutes (like a DynamoDB trigger, exactly as @MrOverflow said in the comments), and only if a CloudWatch log was generated in the last 5 minutes. How do I do this (preferably without writing code)?
Edit 1
I can't use a fixed 5-minute trigger, as that would run the Athena query even when there is no activity.
Edit 2
This is the solution I think will work, but I am not sure how to implement it.
Have a trigger on the CloudWatch > Timestamp column with an expression like:
(Timestamp in seconds) % (5*60) == 0
--> The advantage here is that all night and on holidays, when there is no traffic, my Athena query will not run. Also, my current Lambda will get triggered every 5 minutes. However, the downside of this solution is that if the upstream does not generate a log exactly on a 5-minute boundary (to the second), the Lambda is not triggered. Also, if there are 10 logs in the same second, the Lambda gets triggered 10 times.
The other approach I have in mind is to trigger the Lambda every 5 minutes from CloudWatch, but maintain state in DynamoDB. If the Lambda was not triggered in the last 5 minutes, ignore the call from CloudWatch.
This involves coding, which I hate to do.
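For reference, here is roughly how I picture the second approach. This is only a minimal sketch under assumptions: a plain dict stands in for the DynamoDB item, the function names `record_log_event` and `scheduled_tick` and the state keys are hypothetical, and `run_query` is a placeholder for the Athena call. The log trigger only records the newest log timestamp (cheap), and the 5-minute schedule runs the query only if a log has arrived since the last run.

```python
from typing import Callable, Dict


def record_log_event(state: Dict[str, float], event_ts: float) -> None:
    """Handler for each CloudWatch log event: remember only the newest timestamp.

    `state` is a stand-in for a single DynamoDB item (assumption).
    """
    previous = state.get("last_log_ts", float("-inf"))
    state["last_log_ts"] = max(previous, event_ts)


def scheduled_tick(state: Dict[str, float], run_query: Callable[[], None]) -> bool:
    """Handler for the fixed 5-minute schedule.

    Runs the (expensive) query only if a log arrived after the one that was
    last processed. Returns True if the query ran, False if the tick was ignored.
    """
    last_log = state.get("last_log_ts")
    last_processed = state.get("last_processed_ts", float("-inf"))
    if last_log is None or last_log <= last_processed:
        return False  # no new logs since the last run: skip Athena
    run_query()
    state["last_processed_ts"] = last_log
    return True


if __name__ == "__main__":
    state: Dict[str, float] = {}
    runs = []

    # Ten logs arriving within two seconds only update the stored timestamp.
    for i in range(10):
        record_log_event(state, 1000.0 + i * 0.2)

    # The next scheduled tick runs the query once; the tick after that is ignored.
    print(scheduled_tick(state, lambda: runs.append("athena")))
    print(scheduled_tick(state, lambda: runs.append("athena")))
    print(len(runs))
```

With this shape, a burst of 10 logs costs 10 cheap state writes but only one query, and quiet periods (nights, holidays) cost no queries at all.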