I'm using Artillery to run a small load test against a REST API (Edge endpoint) deployed to AWS API Gateway with the Serverless Framework.

The API has a custom domain and ACM certificate configured, and since it uses the Edge endpoint type it also has a CloudFront distribution in front of it.

This is the request flow: CloudFront -> API Gateway -> Lambda Authorizer -> Lambda -> Other services

When I run around 100 requests per second for 60 seconds (6,000 requests in total), the results are fine (only HTTP 202). When I go up to 200 requests per second (12,000 requests in total), I start getting errors that Artillery reports as "ETIMEDOUT". I couldn't find any related error in the CloudWatch logs; only the successful requests show up there. I also went through the metrics of both Lambdas in the flow: they show only the successful invocations and no execution errors, e.g. no Lambda timeouts.

For example, the Artillery report shows 9666 successful responses, and this matches the number of Lambda invocations.

Artillery report (example):

errors.ETIMEDOUT: .............................................................. 2334
http.codes.202: ................................................................ 9666
http.request_rate: ............................................................. 179/sec
http.requests: ................................................................. 12000
http.response_time:
  min: ......................................................................... 143
  max: ......................................................................... 601
  median: ...................................................................... 179.5
  p95: ......................................................................... 407.5
  p99: ......................................................................... 432.7
http.responses: ................................................................ 9666
vusers.completed: .............................................................. 9666
vusers.created: ................................................................ 12000
vusers.created_by_name.0: ...................................................... 12000
vusers.failed: ................................................................. 2334
vusers.session_length:
  min: ......................................................................... 190
  max: ......................................................................... 7530.3
  median: ...................................................................... 237.5
  p95: ......................................................................... 459.5
  p99: ......................................................................... 507.8

Note: There is no pattern in these "error" results. Each run produces a different number of "ETIMEDOUT" errors.

Artillery yml test definition

config:
  target: 'https://testing.mydomain.com'
  phases:
    - duration: 60
      arrivalRate: 200
  defaults:
    headers:
      Authorization: 'Bearer XXXXXX'
scenarios:
  - flow:
    - post:
        url: "/create"
        json:
          clt: "{{ $randomString() }}"
          value: "10"
          prd: "abcdefg"
    - log: "Sending info to {{ $randomString() }}"

Looking at the CloudWatch metrics for API Gateway, it seems that only the successful requests (9666 in the example above) are reaching the API. I'm checking the "Count" metric:

[Two screenshots of the API Gateway "Count" metric in CloudWatch]

I'm wondering if there is any API limit that I couldn't find.

1 Answer

I believe you may be hitting this limit:

https://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html

"10,000 requests per second (RPS) with an additional burst capacity provided by the token bucket algorithm, using a maximum bucket capacity of 5,000 requests. * Note The burst quota is determined by the API Gateway service team based on the overall RPS quota for the account in the Region. It is not a quota that a customer can control or request changes to."

I could be wrong, but it's worth checking these limits.

JamesKn
  • Hi, thanks for answering. Yes, I saw those limits, but as I understand it I'm not sending 10K RPS. My load test is generating 200 RPS. – Thiago Scodeler Oct 04 '22 at 11:59
  • I am just wondering if you are getting a high number of concurrent users, i.e. you might be sending 200 requests per second, but if the service takes five seconds to respond the concurrency level is going to get much higher. There is a maxVusers setting which may be worth a try. That doc also lists other limits, for example around 300 on routes. – JamesKn Oct 05 '22 at 09:24
  • I found the issue on my side. The problem was the machine I was using to run the load tests: my local development environment did not have enough resources to sustain this number of requests going to API Gateway. Once I switched to a proper test environment, all the requests sent to API Gateway were visible in the API Gateway metrics. – Thiago Scodeler Oct 06 '22 at 17:45
  • Great that you got to the bottom of it ;) nice one. – JamesKn Oct 06 '22 at 18:53
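
For anyone who wants to try the maxVusers suggestion from the comments, here is a minimal sketch of the same test definition with a concurrency cap on the phase. The cap of 400 is an assumed, illustrative value and not something taken from the thread. It only limits how many virtual users are in flight at once; it would not make up for a load generator that is short on resources, which is what the problem turned out to be here.

```yaml
config:
  target: 'https://testing.mydomain.com'
  phases:
    - duration: 60
      arrivalRate: 200
      # Assumed value for illustration: allow at most 400 virtual users
      # in flight at once. Once the cap is reached, new arrivals are
      # skipped instead of started.
      #
      # Rough sizing (concurrency ~= arrival rate x time per user):
      #   200/sec x 0.24 s (median session length in the report) ~= 48
      #   200/sec x 5 s    (the slow case from the comments)      = 1000
      maxVusers: 400
  defaults:
    headers:
      Authorization: 'Bearer XXXXXX'
scenarios:
  - flow:
    - post:
        url: "/create"
        json:
          clt: "{{ $randomString() }}"
          value: "10"
          prd: "abcdefg"
```

If a run with a cap like this shows skipped virtual users while response times stay near the median above, that would suggest the pressure is building on the load generator side and not in API Gateway.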