I have a question about load balancers that relates to one of our systems. We have a system that sits behind an ALB with a fleet of EC2 instances that handle requests. Depending on their type, these requests are forwarded downstream to other components and eventually end up in DynamoDB. This system is essentially the entry point into our platform, and if we know an event is coming we can scale up our instances in advance to deal with the spike. The problem arises when we have an unexpected spike in traffic, normally within 60 seconds, which taxes the instances. The load balancer is unable to scale up in time, and what we find is that new instances come into play well after the incident is over.

Currently we scale based on a CPU threshold. My question is: is there any other metric or method we could use to scale quickly when a large spike in traffic arrives? The easiest solution I can see is to throw more instances into play permanently, but this isn't the most cost-effective option.

Thanks in advance for any guidance you can provide.

  • CPU is very generic. Does this spike exhibit any identifying characteristics? You could maybe set up scaling on a custom metric from your app or from within the instance to pick up such traffic and quickly trigger scaling, without waiting for CPU usage to increase. – Marcin Mar 19 '21 at 11:52
  • There is no straightforward answer to this. If you know which service usually takes more requests, you can move it to Lambda (serverless), which autoscales easily. The reason to use Lambda here is that you can use Cold-Start and concurrency. – bhavuk bhardwaj Mar 19 '21 at 12:23
  • Scaling by a request count metric would probably be quicker than waiting for the CPU metrics to spike. However, if all this happens in under 60 seconds, you're going to have trouble scaling up in time. Can you not use any sort of caching, and maybe a CDN, to reduce the load on your servers? – Mark B Mar 19 '21 at 12:24
  • What type of instance are you using? Maybe you could take advantage of burstable instance types (T family), so that in regular traffic you accumulate CPU credits which will be used during unexpected spikes of traffic. – OARP Mar 19 '21 at 13:55
  • Thanks for the comments, everybody. I feel like I have avenues to investigate! – codemonkey1010 Mar 22 '21 at 14:47
  • The type of metric you use won't make much of a difference, since the CloudWatch alarm is going to take at least a minute to trigger. The other suggestions may help you out. You could also try splitting up the application and going the ECS or EKS route. That way individual services can have tasks added, either through Fargate, where you don't worry about instances, or by having a large cluster of instances with all your services running on them. – Shahad Mar 23 '21 at 02:39
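
For anyone wanting to try the request-count suggestion from the comments, here is a rough sketch of a target tracking policy on the ALB's per-target request count, using boto3. Everything specific in it is an assumption and not taken from the question: the group name `entrypoint-asg`, the policy naming scheme, the resource label, the target of 1000 requests per target, and the 60-second warmup are all placeholders you would swap for your own values.

```python
from typing import Any, Dict


def build_request_count_policy(
    asg_name: str,
    resource_label: str,
    requests_per_target: float,
    warmup_seconds: int = 60,
) -> Dict[str, Any]:
    """Build keyword arguments for the Auto Scaling put_scaling_policy call.

    resource_label identifies the ALB and target group, in the form
    app/<lb-name>/<lb-id>/targetgroup/<tg-name>/<tg-id>.
    """
    if requests_per_target <= 0:
        raise ValueError("requests_per_target must be positive")
    return {
        "AutoScalingGroupName": asg_name,
        # Hypothetical naming scheme, chosen only for this example.
        "PolicyName": f"{asg_name}-alb-request-count",
        "PolicyType": "TargetTrackingScaling",
        "EstimatedInstanceWarmup": warmup_seconds,
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ALBRequestCountPerTarget",
                "ResourceLabel": resource_label,
            },
            "TargetValue": float(requests_per_target),
            "DisableScaleIn": False,
        },
    }


def apply_policy(policy_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Send the policy to AWS. Needs boto3 and valid credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("autoscaling")
    return client.put_scaling_policy(**policy_kwargs)


# Placeholder values for illustration only.
policy = build_request_count_policy(
    asg_name="entrypoint-asg",
    resource_label=(
        "app/my-alb/0123456789abcdef/targetgroup/my-targets/fedcba9876543210"
    ),
    requests_per_target=1000,
)
```

Calling `apply_policy(policy)` would register the policy against a real Auto Scaling group. As the comments point out, this reacts to traffic rather than to its effect on CPU, but instance launch and warmup time still apply, so it may not help with a spike that is over within a minute.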

0 Answers