  • On AWS, I created an Auto Scaling group (ASG) with a scaling policy that adds an instance when the Application Load Balancer metric Average Request Count Per Target rises above 5.

  • The metric measures the number of HTTP requests sent through the load balancer to the target group.

  • The ASG is set to min 1, max 10 and desired 1.

  • I sent 200 requests to the ELB and recorded, in a database, the IP of the instance that received each request. Most of the requests went to the same instance, some received a 504 Gateway Timeout, and a few received no response at all.

  • The ASG launches new instances, but only after the requests have already been sent, so the new instances receive nothing from the load balancer.

  • I think the reason is that CloudWatch reports the average number of requests per instance at intervals of a minute or more, and launching a new instance probably takes longer than the request timeout.
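For reference, the setup described above could be expressed roughly as follows. This is a minimal sketch that assumes a target tracking policy on the predefined `ALBRequestCountPerTarget` metric; the original policy may have been a step or simple scaling policy instead. The function name, the policy name, and the placeholder ASG name and resource label are hypothetical.

```python
def build_request_count_policy(asg_name, resource_label, target_value=5.0):
    """Build keyword arguments for the Auto Scaling put_scaling_policy call.

    resource_label identifies the ALB and target group, in the form
    app/<lb-name>/<lb-id>/targetgroup/<tg-name>/<tg-id>.
    """
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": "alb-request-count-per-target",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ALBRequestCountPerTarget",
                "ResourceLabel": resource_label,
            },
            # Average requests per target that the ASG tries to maintain.
            "TargetValue": float(target_value),
        },
    }


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# kwargs = build_request_count_policy("my-asg", "app/my-alb/123/targetgroup/my-tg/456")
# boto3.client("autoscaling").put_scaling_policy(**kwargs)
```

Whichever policy type is used, it reacts to a CloudWatch metric, so a short burst of requests can be over before any new instance is in service.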

Q: Is there a way to hold the requests in a queue, or to increase their timeout, until the new instances are running, and then distribute the requests across all instances instead of losing them?

Q: If a user sends many requests at the same time, I want the ASG to start scaling immediately and the requests to be distributed evenly across the instances, keeping a specific average number of requests per instance.

mmaher

1 Answer


The solution was to use Amazon Simple Queue Service (SQS). We forwarded the messages from API Gateway to the queue. A CloudWatch alarm then started ECS Fargate tasks whenever the queue size was greater than 1, and those tasks read the messages from the queue and processed them. When the queue was empty, a second alarm set the number of tasks in the ECS service to 0.
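A rough sketch of the two pieces is shown below; it is not the code that was actually deployed. `scale_action` mirrors the two alarm thresholds, and `drain_queue` is what a task might run, assuming a boto3 SQS client is passed in as `sqs`. The function names, the `handle` callback, and the 20-second long-poll default are assumptions made for illustration.

```python
def scale_action(queue_depth):
    """Mirror the two alarms: start tasks above 1 message, stop at 0."""
    if queue_depth > 1:
        return "start"
    if queue_depth == 0:
        return "stop"
    return "hold"


def drain_queue(sqs, queue_url, handle, wait_seconds=20):
    """Read messages until the queue looks empty; return how many were processed.

    sqs is expected to behave like boto3.client("sqs").
    """
    processed = 0
    while True:
        # Long polling: wait up to wait_seconds for messages to arrive.
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=wait_seconds,
        )
        messages = response.get("Messages", [])
        if not messages:
            return processed
        for message in messages:
            handle(message["Body"])
            # Delete only after successful handling so failures are retried.
            sqs.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            processed += 1
```

Because requests wait in the queue rather than on an open HTTP connection, nothing is lost while new tasks are starting.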

mmaher