
We auto-scale our Elastic Beanstalk Java application when the average response time exceeds 3 seconds. When this happens we add 2 instances to our environment. Once the average response time is back within 1.5 seconds we remove 1 instance, with a 300-second cooldown policy.
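For reference, a policy like the one above is normally expressed through the `aws:autoscaling:trigger` namespace in an `.ebextensions` config; the sketch below mirrors the thresholds described, with illustrative values (the cooldown lives in the separate `aws:autoscaling:asg` namespace):

```yaml
# .ebextensions/autoscaling.config -- sketch of the trigger described above
option_settings:
  aws:autoscaling:trigger:
    MeasureName: Latency          # average response time from the load balancer
    Statistic: Average
    Unit: Seconds
    UpperThreshold: 3             # scale out above 3s average
    UpperBreachScaleIncrement: 2  # add 2 instances
    LowerThreshold: 1.5           # scale in below 1.5s average
    LowerBreachScaleIncrement: -1 # remove 1 instance
    BreachDuration: 5             # minutes the breach must persist
  aws:autoscaling:asg:
    Cooldown: 300                 # seconds between scaling actions
```

Note that a Beanstalk environment supports only this one trigger, which is exactly why a single 60-second endpoint skews the whole thing.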

Our new endpoint is expected to take circa 60 seconds to respond, which rather breaks our auto-scaling model because the averages will now be heavily skewed by these long requests.

Our original objective was to detect when endpoints encountered latency (we call through to third-party APIs and proxy their results, so any delays are because third parties are timing out or taking longer than planned). To date, the auto-scaling has worked a treat.

What options are available to us when we introduce long-running requests?

Should we look at programmatically increasing and decreasing the number of instances based on the latency of a subset of requests, e.g. an average of 3 seconds for endpoint-a and endpoint-b, but an average of 70 seconds for endpoint-c?
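The per-endpoint idea can be sketched in a few lines. This is a hypothetical illustration, not our actual code: endpoint names and thresholds are made up, and the scale-in rule generalises the question's 3s/1.5s pair by using half of each endpoint's threshold:

```python
# Sketch: per-endpoint latency thresholds instead of one global average.
# Endpoint names and threshold values are illustrative assumptions.
from statistics import mean

THRESHOLDS = {           # seconds
    "endpoint-a": 3.0,
    "endpoint-b": 3.0,
    "endpoint-c": 70.0,  # the long-running proxy endpoint
}

def scale_decision(samples):
    """samples: dict mapping endpoint -> list of response times (seconds).
    Returns +2 to scale out, -1 to scale in, 0 to hold, mirroring the
    +2 / -1 policy from the question. Scale in only when every endpoint
    is comfortably under half its own threshold (cf. 1.5s vs 3s)."""
    averages = {ep: mean(ts) for ep, ts in samples.items() if ts}
    if any(avg > THRESHOLDS[ep] for ep, avg in averages.items()):
        return +2
    if all(avg <= THRESHOLDS[ep] / 2 for ep, avg in averages.items()):
        return -1
    return 0
```

The decision could then drive the Auto Scaling API (e.g. `set_desired_capacity`) from a scheduled job, subject to the same 300-second cooldown.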

We could assume that if 10% of users are hitting the 60-second endpoint and the other 90% are hitting the 1-2 second endpoints, then we could set the average threshold higher as a compromise. However, I fear this means we won't scale up early enough for some endpoints.
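A quick back-of-envelope check shows why that compromise is risky. With the hypothetical 90/10 split, a perfectly healthy fleet already averages 7.35 seconds, so the global threshold would have to sit above that, and the fast endpoints could degrade several-fold before ever tripping it:

```python
# Back-of-envelope check of the blended-average compromise.
# Traffic mix and latencies are the hypothetical 90/10 split from the question.
def blended_average(mix):
    """mix: list of (fraction_of_traffic, avg_latency_seconds) pairs."""
    return sum(frac * latency for frac, latency in mix)

healthy = blended_average([(0.9, 1.5), (0.1, 60.0)])   # 7.35s with everything fine
degraded = blended_average([(0.9, 6.0), (0.1, 60.0)])  # 11.4s: fast endpoints 4x slower
```

So a threshold of, say, 10 seconds would ignore the fast endpoints quadrupling in latency, which is exactly the scale-up-too-late failure mode feared above.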

Thanks,

Rob.

1 Answer


Have you considered setting up a new Auto Scaling group?

Your statistical approach is fine, but it will affect a certain percentage of requests in the original group. Better to keep the new group separate.

AYA
  • Hi @AYA, welcome to Stack Exchange! I'm not sure a new auto-scaling group would help. Can you create a group for a specific URL pattern? If not, I don't see how I can set a different average-response trigger. – RobbiewOnline Aug 23 '18 at 21:24
  • Thanks, certainly a newbie here :-) You could use the same launch config to create a new Auto Scaling group, but you'd need fancier CloudWatch alarms, right? One possible route could be to set up an alarm that filters for the endpoint and then averages the R/T. The following link has some nice examples: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/quickref-cloudwatchlogs.html#quickref-cloudwatchlogs-example1. Or you could get creative with what the new ASG tracks, e.g. if there's an ELB, tracking the inbound requests queued, or a similar metric for the Elastic Beanstalk env. – AYA Aug 24 '18 at 21:59
  • Thanks again @AYA. RE: "You could use the same launch config to create a new Autoscaling group" — is the launch config the Elastic Beanstalk environment config? If I were to add another launch config, wouldn't that mean launching a whole new environment (new load balancer, EC2 instances, etc.)? Can you set up a CloudWatch alarm to trigger when a specific request URL pattern has a high average response? Maybe I can use CloudWatch to trigger a Lambda function so that I can determine which endpoints have latency, then scale up/down using their API? I definitely need it automated like it is now. – RobbiewOnline Aug 27 '18 at 18:32
  • You're right about the ASG config, and there seems to be no easy way out of this. The AWS docs explicitly state that only one ASG and one trigger are possible for an Elastic Beanstalk env — https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/environments-cfg-autoscaling-triggers.html — so a custom CloudWatch metric seems to be the route to go. If you've got sample log data to parse, maybe I could help with the custom alarm... [https://stackoverflow.com/questions/14955583/aws-cloud-watch-alarm-triggering-autoscaling-using-multiple-metrics]. Good luck! – AYA Aug 27 '18 at 21:39
  • Thanks @AYA - I'll read up about CloudWatch a bit more, but I suspect I'll shy away from it and rely on my own stats that I collect each time I perform an HTTP request; I can use that as a hook and go from there. If I'm stuck, or I decide to be brave with CloudWatch, I'll ping you :-) Enjoy Stack Overflow. – RobbiewOnline Aug 29 '18 at 15:29
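The custom-metric route discussed in the comments could be sketched as follows: each time the app records a response time, publish it to CloudWatch as a per-endpoint datapoint (one metric per `Endpoint` dimension), then attach a separate alarm to each. The namespace, metric name, and dimension name below are illustrative assumptions, not anything from the real app:

```python
# Sketch of the "publish your own per-endpoint latency metric" route.
# Namespace, metric name, and dimension name are illustrative assumptions.
def latency_datapoint(endpoint, seconds):
    """Build one CloudWatch PutMetricData entry for a response-time sample."""
    return {
        "MetricName": "EndpointLatency",
        "Dimensions": [{"Name": "Endpoint", "Value": endpoint}],
        "Value": seconds,
        "Unit": "Seconds",
    }

# Publishing would then look like this (requires boto3 and AWS credentials):
# import boto3
# boto3.client("cloudwatch").put_metric_data(
#     Namespace="MyApp/Proxy",
#     MetricData=[latency_datapoint("endpoint-c", 58.2)],
# )
```

With one metric per endpoint, each alarm can carry its own threshold (3s for the fast endpoints, 70s for the slow one) and invoke a scaling action or a Lambda, which sidesteps the single-trigger limit on the Beanstalk environment.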