I have a Lambda function that does some work. I want to create a CloudWatch alarm on its duration, i.e. how long the function takes to run.

I tried the following values for the alarm, but I am running into an issue with it, probably because of cold starts. These are the values I am setting:

Statistic: Average
ComparisonOperator: GreaterThanThreshold
Threshold: 1000
EvaluationPeriods: 5
Period: 60
Unit: Milliseconds

The issue is that the alarm keeps firing, probably because of cold starts, since the function is not called that often.

What are the best values to set for a Lambda alarm? How do other people set alarms on Lambda?

Also, how long does a Lambda function have to go without being called before it is shut down and a cold start can occur?

John Rotenstein
hatellla
  • What are you actually wanting to accomplish? That is, _why_ are you creating this alarm? – John Rotenstein Jul 06 '19 at 11:15
  • I agree with @JohnRotenstein. Knowing the "why" can lead to better answers. You may want to consider increasing your memory as well. I've had lambdas that were 2x faster with 2x the memory (which works out to the same cost, just better performance). – Jarred Olson Jul 06 '19 at 16:19
  • My lambda makes calls to external services, and I want to be notified if it gets slower because of an external call, so I wanted to add a metric around that. My main case is when the number of calls made by this lambda is huge, but I don't want the alarm to fire when a call is made at cold start time. – hatellla Jul 06 '19 at 19:16

1 Answer

Use Blue Matador. Its thresholds are dynamic, account for daily variation and cold starts, and use machine learning to detect real anomalies. It does the same for the services that Lambda interacts with (DynamoDB, SQS, API Gateway, RDS, Kinesis, S3, etc.).

Disclaimer: I'm the founder of Blue Matador.

If you want to do it yourself with CloudWatch, I would recommend having the function time out after a certain period and return an error. Then you can use the Errors metric to see how many invocations failed over a given time period. It's not a perfect solution, but it could correctly ignore cold starts. We wrote a blog post, "How to Monitor AWS Lambda with CloudWatch", that covers errors, throttles, and other metrics to watch out for.
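
As a rough illustration of the do-it-yourself approach, here is a minimal Python sketch. It is an assumption about how you might wire this up, not a definitive implementation: `timed_call`, `ExternalCallTooSlow`, `errors_alarm_params`, the 1-second budget, and the alarm name are all hypothetical. Note that `timed_call` checks the elapsed time after the call returns rather than aborting it, so you would normally also set a timeout on your HTTP client. The alarm parameters reuse the Period and EvaluationPeriods from the question but watch the `Errors` metric instead of `Duration`.

```python
import time

# Hypothetical budget for the external call (the question used 1000 ms).
EXTERNAL_CALL_BUDGET_S = 1.0


class ExternalCallTooSlow(Exception):
    """Raised so the invocation fails and is counted by the Errors metric."""


def timed_call(call, budget_s=EXTERNAL_CALL_BUDGET_S, clock=time.monotonic):
    """Run call() and raise if it took longer than budget_s seconds.

    Only the wrapped call is measured, not the whole invocation.
    """
    start = clock()
    result = call()
    elapsed = clock() - start
    if elapsed > budget_s:
        raise ExternalCallTooSlow(f"external call took {elapsed:.3f}s")
    return result


def errors_alarm_params(function_name, max_errors=0, period_s=60,
                        evaluation_periods=5):
    """Build keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": f"{function_name}-errors",  # hypothetical naming scheme
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",  # count failed invocations per period
        "Period": period_s,
        "EvaluationPeriods": evaluation_periods,
        "Threshold": max_errors,
        "ComparisonOperator": "GreaterThanThreshold",
        # A rarely invoked function produces gaps in the metric;
        # treat those gaps as healthy instead of alarming on them.
        "TreatMissingData": "notBreaching",
    }
```

Inside the handler you would wrap the external request, e.g. `timed_call(lambda: call_external_service(event))`, where `call_external_service` stands in for your own code. With boto3 installed, the alarm could then be created with `boto3.client("cloudwatch").put_metric_alarm(**errors_alarm_params("my-function"))`.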

mbarlocker