
I'm creating the backend of my application using the API Gateway with Lambda functions and I'm having problems with the response time of the requests.

It's already known that Lambda functions have the infamous "cold start" and ok, we've accepted that. But the problem I'm having seems to be a new kind of cold start, this time from the API Gateway. And it's not a few ms of standby time, it's seconds (around 12-15 seconds). Oh god, this is a big problem...

This 12-15 second delay occurs on the first request and again after some period of inactivity (approximately 1 hour).

My question is: what could be causing this delay and how to fix it?

More information:
My Lambda function is configured to run inside a VPC.

CloudWatch logs from API Gateway:

(01) Extended Request Id: XXXXX=
(02) Verifying Usage Plan for request: XXXXX. API Key: API Stage: XXXXX
(03) API Key authorized because method 'GET /XXXXX' does not require API Key. Request will not contribute to throttle or quota limits
(04) Usage Plan check succeeded for API Key and API Stage XXXXX/v1
(05) Starting execution for request:
(06) HTTP Method: GET, Resource Path:
(07) Method request path:
(08) Method request query string:
(09) Method request headers:
(10) Method request body before transformations:
(11) Endpoint request URI:
(12) Endpoint request headers:
(13) Endpoint request body after transformations:
(14) Sending request to XXXXX
(15) Received response. Integration latency: 14497 ms
(16) Endpoint response body before transformations:
(17) Endpoint response headers:
(18) Method response body after transformations:
(19) Method response headers:
(20) Successfully completed execution
(21) Method completed with status: 200
(22) AWS Integration Endpoint RequestId :
(23) X-ray Tracing ID : 
Lucas Kauer
  • Hey Lucas, can we see your Lambda code please, looks like it's bricking out without returning a proper http response – Mrk Fldig Dec 25 '18 at 17:14
  • Unfortunately I can't share the code here. But everything is fine with the HTTP response (except for the delay)... what I really wanted was to understand a little better how API Gateway communicates with a Lambda inside a VPC... – Lucas Kauer Dec 26 '18 at 17:53
  • The code is basically a Lambda that fetches data from an Aurora database and returns it... – Lucas Kauer Dec 26 '18 at 17:55

2 Answers


Update 14/12/19:

AWS has introduced Provisioned Concurrency for Lambda: https://aws.amazon.com/blogs/aws/new-provisioned-concurrency-for-lambda-functions/

So, a few things to bear in mind here. A cold start occurs when the container your Lambda runs in is effectively "decommissioned": AWS infrastructure has dropped it from being "ready" to "nobody's really using this, let's shelve it".

Lambdas outside a VPC can have a cold start time of up to 6 seconds; INSIDE a VPC you can be looking at anywhere up to 12 seconds PER CONTAINER. So just because you have one Lambda instance warm, if two people hit that endpoint at the same time, the 2nd person is still going to get a cold start.

So, as Mr Dashmug rightly suggests, having a scheduled function to warm up your Lambdas is the easy way. One thing to remember is that such a function will probably warm only 1 container; if you're expecting hundreds of requests a second, you need to keep X number of containers warm.

For an example of how to make this easy, you can look at this - it's a plugin for the Serverless Framework that does exactly what you're looking for.

Essentially you need a function that's going to make X number of concurrent requests per endpoint. Be aware this has a cost, although you can keep a pretty decent microservice warmed up like this for less than $30 a month.
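A minimal sketch of that idea (the function and payload names here are hypothetical, not from the question): a warmer that fans out N simultaneous invocations, each carrying a `{"warmup": true}` marker, so each one lands on and warms a separate container; the target function short-circuits those pings before doing any real work.

```python
from concurrent.futures import ThreadPoolExecutor

WARMUP_PAYLOAD = {"warmup": True}

def fan_out(invoke, function_name, concurrency):
    # Fire `concurrency` simultaneous invocations so each one lands on
    # (and warms) a separate container. In AWS, `invoke` would wrap
    # boto3's Lambda client invoke call for the target function.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(invoke, function_name, WARMUP_PAYLOAD)
                   for _ in range(concurrency)]
        return [f.result() for f in futures]

def handler(event, context):
    # On the warmed function's side: bail out on warmup pings before
    # doing any real work (e.g. before touching the Aurora database).
    if isinstance(event, dict) and event.get("warmup"):
        return {"statusCode": 200, "body": "warmed"}
    # ... real request handling goes here ...
    return {"statusCode": 200, "body": "real response"}
```

The concurrency matters: sequential pings would all be served by the same already-warm container, which is exactly what you don't want.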

Personally, I think cold starts are overegged: sure, customers can occasionally suffer a slow response, but if your API has relatively stable traffic then I really wouldn't worry; your customers will keep the right number of Lambdas warm. If it's prone to spikes, then it's worth warming them up.

Think about this: my average request time for the APIs I work on is < 400ms, so I'd need 2 requests a second (120 a minute, 7,200 an hour) to even start needing two containers all the time. If you have something like an app where people log in and then call an API endpoint for the home screen, you could do something as simple as Login -> SNS fires a warmup event to the next endpoint.

Basically, if you know the flow in which your consumer is going to call the API, you can proactively warm up endpoints depending on the previous one called.
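That flow-based approach can be sketched as a simple lookup (the endpoint names here are hypothetical, just to illustrate the Login -> home screen example above): when one endpoint is hit, you publish warmup events (e.g. via SNS) for whatever the consumer is likely to call next.

```python
# Hypothetical call flow: after /login, users almost always hit
# /homescreen; from there they fan out to /feed or /profile.
NEXT_IN_FLOW = {
    "/login": ["/homescreen"],
    "/homescreen": ["/feed", "/profile"],
}

def endpoints_to_warm(just_called):
    # Given the endpoint a consumer just hit, return the endpoints worth
    # warming next - the caller would publish a warmup event for each.
    return NEXT_IN_FLOW.get(just_called, [])
```

The map is cheap to maintain and keeps the warming cost proportional to real traffic, since you only warm paths users actually follow.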

Mrk Fldig

API Gateway has no cold starts, AFAIK.

That delay after one hour of inactivity is still Lambda's cold start.

To prevent that, you can create a CloudWatch Scheduled Event to keep calling your Lambda (e.g. every 5 minutes) to avoid inactivity and to reduce cold starts.
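A sketch of the handler side of that setup, assuming the scheduled rule invokes the function directly: CloudWatch scheduled events arrive with `"source": "aws.events"`, so the handler can recognize the ping and return immediately instead of running the real logic.

```python
def handler(event, context):
    # CloudWatch Scheduled Events carry source "aws.events"; return
    # right away so the keep-warm ping stays fast and cheap.
    if event.get("source") == "aws.events":
        return {"statusCode": 200, "body": "warm"}
    # ... normal API Gateway request handling below ...
    return {"statusCode": 200, "body": "real response"}
```

Returning early also keeps the warmup invocations out of your business metrics and avoids unnecessary database calls every 5 minutes.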

This is less of a problem once you are in production and your traffic is already high so there is less inactivity.

Noel Llevares
  • I know the Lambda cold start problem, but this problem is in the API Gateway. I don't know if it is a cold start, but there is a delay... – Lucas Kauer Dec 20 '18 at 19:14
  • 1
    How did you conclude that it was from API Gateway? – Noel Llevares Dec 20 '18 at 19:16
  • Because I saw the CloudWatch logs and the execution of my Lambda takes less than 1s... and the timeout of my Lambda is set to 3s... – Lucas Kauer Dec 20 '18 at 19:19
  • I haven't experienced this. Try enabling logging on your API Gateway and you'll see the logs in CloudWatch for the Gateway (it's a different log group from your Lambda). – Noel Llevares Dec 20 '18 at 19:23
  • So after using his advice to keep it warm, you are still seeing the lag? – Mikes3ds Dec 20 '18 at 23:38
  • That integration latency is caused by your Lambda cold start. – Noel Llevares Dec 20 '18 at 23:40
  • Time taken to provision the ENI for the lambda that runs inside a VPC doesn't count against execution time. The ENI is provisioned before the lambda execution starts. This is 100% a lambda cold start issue. – cementblocks Dec 21 '18 at 02:08