I have a question about load balancers that relates to one of our systems. The system sits behind an ALB with a fleet of EC2 instances that handle requests. Depending on the type of request, these are forwarded downstream to other components, and the data eventually resides in DynamoDB. This system is essentially the entry point into our platform, and if we know an event is coming we can scale up our instances ahead of time to deal with the spike. The problem arises when we have an unexpected spike in traffic, normally within 60 seconds, which taxes the instances, and the load balancer is unable to scale up in time. What we find is that new instances come into service well after the incident is over.
Currently we scale based on a CPU threshold. My question is: is there any other metric or method we could use to scale quickly when a large spike in traffic arrives? The easiest solution I can see is to run more instances permanently, but this isn't the most cost-effective option.
Thanks in advance for any guidance you can provide.