AWS CloudFront returns 503 for regions other than us-east-1

Question

I am trying to configure a CloudFront distribution with a Lambda@Edge function linked to the origin request event. The function returns a very basic HTML page (the code is based on this example: Serving Static Content (Generated Response)). Once deployed, the distribution works as expected in locations close to the North Virginia region, but fails in other locations, returning the following error:

503: The Lambda function associated with the CloudFront distribution was throttled. We can't connect to the server for this app or website at this time. There might be too much traffic or a configuration error. Try again later, or contact the app or website owner. If you provide content to customers through CloudFront, you can find steps to troubleshoot and help prevent this error by reviewing the CloudFront documentation.

I already tried looking at the logs, but nothing is logged in CloudWatch when the 503 error is thrown, and the logs from the CloudFront distribution show the LambdaLimitExceeded error.
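For reference, here is a minimal sketch of how such entries could be pulled out of a CloudFront standard log file once it has been downloaded from S3 and decompressed. It assumes the usual standard log layout (a `#Fields:` header line with space-separated field names, followed by tab-separated rows). The function names are made up for illustration, and the set of result-type fields checked is an assumption, since which of them are present depends on the log version.

```python
from typing import Dict, Iterator, List

# Fields that may carry the result type; which ones exist depends on the log version (assumption).
RESULT_FIELDS = (
    "x-edge-result-type",
    "x-edge-response-result-type",
    "x-edge-detailed-result-type",
)


def parse_cloudfront_log(text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per log row, keyed by the names in the '#Fields:' header."""
    fields: List[str] = []
    for line in text.splitlines():
        if line.startswith("#Fields:"):
            # Field names in the header are separated by spaces.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            # Data rows are tab-separated.
            yield dict(zip(fields, line.split("\t")))


def lambda_limit_errors(text: str) -> List[Dict[str, str]]:
    """Return the rows whose result type is LambdaLimitExceeded (case-insensitive)."""
    wanted = "lambdalimitexceeded"
    return [
        row
        for row in parse_cloudfront_log(text)
        if any(row.get(name, "").lower() == wanted for name in RESULT_FIELDS)
    ]
```

Grouping the returned rows by their `x-edge-location` value would show which edge locations are rejecting the requests.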

I have been jumping between different locations using a VPN, and I find it strange that it only works for places close to the us-east-1 region. I am creating all the resources using a federated account, and I don't know whether that could be related to IAM permissions.

Another thing to point out is that everything works as expected if I reproduce the same scenario using another AWS account and a regular user.

score 0 · Answer 1 · answered Jul 07 '20 at 06:49


If you're seeing the LambdaLimitExceeded error, then you need to check whether either of the following applies to your Lambda@Edge function:

The number of function executions exceeded one of the quotas (formerly known as limits) that Lambda sets to throttle executions in an AWS Region (concurrent executions or invocation frequency).
The function exceeded the Lambda function timeout quota.

Remember that Lambda@Edge is executed closer to the user. If you try to retrieve resources from outside that region, the function may time out due to geographical latency. Can you increase the timeout to account for this?

Do you have other Lambdas running in the regions where it is running? If you view the CloudWatch logs for one of the regions closer to the user's edge location, you will see these Lambda logs and hopefully be able to identify the root cause. If not, add more debug logging.
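As a rough sketch of that log hunt: Lambda@Edge writes its logs to CloudWatch Logs in the region nearest to where the function ran, in a log group named `/aws/lambda/us-east-1.<function-name>`. The helper below builds that name and checks a list of regions for it with boto3. The function names and the region list are placeholders, and the AWS call assumes boto3 is installed and credentials are configured.

```python
from typing import List


def edge_log_group_name(function_name: str) -> str:
    """Log group used by Lambda@Edge replicas: the name is prefixed with 'us-east-1.'."""
    return f"/aws/lambda/us-east-1.{function_name}"


def regions_with_edge_logs(function_name: str, regions: List[str]) -> List[str]:
    """Return the regions (from the given list) where the edge log group exists."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    group = edge_log_group_name(function_name)
    found = []
    for region in regions:
        logs = boto3.client("logs", region_name=region)
        response = logs.describe_log_groups(logGroupNamePrefix=group)
        if any(g["logGroupName"] == group for g in response.get("logGroups", [])):
            found.append(region)
    return found


# Example call (hypothetical function name and region list):
# regions_with_edge_logs("my-edge-fn", ["us-east-1", "us-east-2", "eu-west-1"])
```

If a region near the failing edge location has no such log group at all, the function was probably never invoked there, which is consistent with throttling or a permissions problem rather than an error inside the function.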

answered Jul 07 '20 at 06:49

Chris Williams


Thanks for your reply. The function only returns static text for now, to make sure the problem is not related to timeouts or other errors. When the page is rendered as expected, the logs show that it took less than 150ms to complete and used around 60MB, so I don't think that is the reason. I have also looked for the logs in other regions and was not able to find them. I am using the code returned in the x-amz-cf-pop header as a reference to find out which edge location tried to serve the content. – Oscar Jul 07 '20 at 16:01
Are you able to see any metrics in the regions that it is having these issues? – Chris Williams Jul 07 '20 at 16:02
I am able to see the logs in CloudWatch when everything works fine, but I cannot find anything (in any region) when the 503 error is returned. However, I enabled logging in CloudFront and I am able to see the error in the logs stored in S3. – Oscar Jul 07 '20 at 16:17
From the CloudFront interface, if you access monitoring, then find your Lambda@Edge function and click "View function metrics", does it provide any insight into specific regions? – Chris Williams Jul 07 '20 at 16:20
It provides the number of invocations but they are only registered on successful calls. The error graph is always empty. Also, if I look at the distribution metrics, the total5xxErrors shows some values, but the 5xxErrorsByLambdaEdge remains at 0. – Oscar Jul 07 '20 at 16:40
That is really strange. Do you have an example of the log with `LambdaLimitExceeded` in it? – Chris Williams Jul 07 '20 at 16:43
I don't have them at this time, but I will try to gather them. However, I found out that the problem happens only with that specific account. The account belongs to an organization; I tested the same code in another account of the same org and everything worked as expected. Could it be related to a policy at the account level? – Oscar Jul 07 '20 at 19:19
That's possible if something was perhaps in an SCP. Sounds like you're getting there, very peculiar though. Sorry I couldn't be more help in this scenario :) – Chris Williams Jul 07 '20 at 19:21
SCPs can be used to disable actions in certain regions. Normally you'd expect exceptions for global services like CloudFront and CloudWatch, but it's possible there isn't one in this case. For example, this SCP shows how to block regions but allow CloudFront services to continue running: https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_policies_example-scps.html#example-scp-deny-region – mhbrooks Jul 07 '20 at 22:44
An SCP would also explain why CloudWatch metrics are not being posted. – mhbrooks Jul 07 '20 at 22:47
Indeed it could :) – Chris Williams Jul 08 '20 at 06:11
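To make the SCP idea from the comments above concrete, here is a simplified sketch of how a region-deny statement with a `NotAction` exemption list behaves. It only models the one statement shape used in that documentation example (a `Deny` with `NotAction` and a `StringNotEquals` condition on `aws:RequestedRegion`) and is not a general policy evaluator. The function name and the statement contents are illustrative assumptions, not a policy taken from the account in question.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict, List, Union


def _as_list(value: Union[str, List[str]]) -> List[str]:
    """Policy fields may hold a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def region_deny_applies(statement: Dict[str, Any], action: str, region: str) -> bool:
    """Return True if this simplified region-deny statement would deny the request."""
    if statement.get("Effect") != "Deny":
        return False
    # Actions listed under NotAction are exempt from the deny.
    exempt = _as_list(statement.get("NotAction", []))
    if any(fnmatchcase(action.lower(), pattern.lower()) for pattern in exempt):
        return False
    # Everything else is denied unless the requested region is in the allowed list.
    condition = statement.get("Condition", {}).get("StringNotEquals", {})
    allowed = _as_list(condition.get("aws:RequestedRegion", []))
    return region not in allowed


# Hypothetical statement: only us-east-1 is allowed, CloudFront and IAM are exempt.
example_statement = {
    "Effect": "Deny",
    "NotAction": ["cloudfront:*", "iam:*"],
    "Resource": "*",
    "Condition": {"StringNotEquals": {"aws:RequestedRegion": ["us-east-1"]}},
}
```

With this hypothetical statement, `region_deny_applies(example_statement, "lambda:InvokeFunction", "us-east-2")` is denied while the same action in `us-east-1` is not, which matches the "works only near us-east-1" pattern that such a policy would be expected to produce.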
Thanks for the help. I followed your suggestions and the account does not have any SCP rules set. I also recreated the AWSServiceRoleForLambdaReplicator and AWSServiceRoleForCloudFrontLogger roles, but it did not work. I will try to contact support. Thanks! – Oscar Jul 08 '20 at 16:52
Any luck with this issue? I'm experiencing a similar problem. Users close to us-east-1 access the resources fine, but in places like Texas for example they get 503. – Jose Quijada Dec 06 '22 at 14:18
