AWS autoscaling cooldown

Question

I have 1 instance. If latency is more than 1 second for 3 minutes, autoscaling adds another instance.

And here the problem arises: after 50 minutes (because of the scaling cooldown), this second instance terminates. And if the load is still high, latency jumps back to more than 1 second.

But because of the scaling cooldown, it can't add a new instance again!

Is it possible to set the scale-up and scale-down cooldowns separately? Or is there another way to solve this?

Making the cooldown shorter doesn't help: instances just start and terminate more often, so application downtime is still high.

score 2 · Answer 1 · answered Oct 21 '15 at 20:41

I think you would do better to set shorter cooldowns and tune the CloudWatch thresholds to make them more or less sensitive, depending on your needs. Typically, the cooldown should be the minimum amount of time needed for Auto Scaling to (de)commission instances and for CloudWatch metrics to reflect the new capacity, so that the next scaling decision is based on up-to-date data. For most web applications, 10-15 minutes should be enough.

Now for the decision making, the basic rule of thumb is: scale up fast, scale down slow. You can scale up in response to a few 1-minute values over your threshold, while scaling down only after several 15-minute values under it. For example, you could provision 50% more capacity after 3 consecutive 1-minute values of CPU > 50%, and decommission a single instance after 4 consecutive 15-minute values of CPU < 25%.
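
To make the rule concrete, here is a minimal sketch of that decision logic in plain Python. It is only an illustration of the thresholds mentioned above, not how Auto Scaling is implemented; the function names, the `min_size`/`max_size` bounds, and the rounding of "50% more capacity" up to at least one instance are my own assumptions.

```python
import math


def breached(values, count, predicate):
    """Return True if the last `count` datapoints all satisfy `predicate`."""
    return len(values) >= count and all(predicate(v) for v in values[-count:])


def desired_capacity(current, cpu_1min, cpu_15min, min_size=1, max_size=10):
    """Hypothetical helper: pick the next instance count from CPU datapoints.

    cpu_1min  -- recent 1-minute average CPU values (percent), oldest first
    cpu_15min -- recent 15-minute average CPU values (percent), oldest first
    """
    # Scale up fast: 3 consecutive 1-minute values above 50% add 50% capacity
    # (assumed to round up, and to add at least one instance).
    if breached(cpu_1min, 3, lambda v: v > 50):
        return min(max_size, current + max(1, math.ceil(current * 0.5)))
    # Scale down slow: 4 consecutive 15-minute values below 25% remove one instance.
    if breached(cpu_15min, 4, lambda v: v < 25):
        return max(min_size, current - 1)
    # Neither condition holds: keep the current capacity.
    return current
```

With these rules, one hour of sustained low CPU is needed before a single instance is removed, while three minutes of high CPU is enough to add capacity, so a group under steady load should not flap the way the question describes.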

But there is only one Breach duration in Amazon's EB and one Measurement period (for both scaling up and scaling down), so I have no idea how to implement this — Eugene, Oct 22 '15 at 15:02
