3

I have an autoscale policy that scales my backend instances based on overall group CPU usage. AWS has a few different Termination Policies to choose from, such as OldestInstance, OldestLaunchConfiguration, NewestInstance and ClosestToNextInstanceHour.

Unfortunately none of these help address my problem. If my scale-in policy trigger is set to a low 10% for the group, it can end up terminating an instance that is still busy rather than choosing one with an idle CPU.

Does anyone have a suggestion or workaround? Also, my backend instances aren't using an internal ELB.

Niall
  • 105
  • 2
  • 8
  • 1
    So what is your load balancing strategy? How could you get to a state where one instance has > 10% CPU utilization while another is idle? – Mike Brant Oct 22 '14 at 14:11
  • 1
    Hi Mike. These backend instances are workers, so they don't require load balancing. They pull jobs off a queue and process them. You can create Auto Scaling groups without a load balancer; in this case a load balancer wouldn't be required, as the workers work independently of one another. My scale-in CloudWatch alarm is set for a CPU average of 10%. I can't see a way to remove the idle instance rather than allowing Auto Scaling to randomly pick a busy one. – Niall Oct 22 '14 at 15:00
  • 1
    OK. That makes sense. I don't think you are going to find any out-of-the-box capability to do what you are looking for. You will likely have to poll your instances and trigger the instance shut down yourself through some process. – Mike Brant Oct 22 '14 at 16:05
  • 1
    I recently discovered this API call while trying to accomplish essentially the same thing you are trying to do: http://docs.aws.amazon.com/AutoScaling/latest/APIReference/API_TerminateInstanceInAutoScalingGroup.html – Bradley T. Hughes Nov 07 '14 at 09:23
  • Thanks Bradley, I'll have a look at this. – Niall Nov 10 '14 at 08:55

3 Answers

6

Scaling Policies are used to change the Desired Capacity of an Auto Scaling group. These scaling policies can be triggered from an AWS CloudWatch alarm or via an API call.
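
For illustration, a rough boto3 sketch of the API route (the group and policy names here are placeholders, not something from the question):

    import boto3

    autoscaling = boto3.client('autoscaling')

    # Trigger an existing scaling policy directly, instead of waiting for a CloudWatch alarm
    autoscaling.execute_policy(
        AutoScalingGroupName='worker-asg',    # hypothetical group name
        PolicyName='scale-in-on-low-cpu',     # hypothetical policy name
        HonorCooldown=True,                   # respect the group's cooldown period
    )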

Once Auto Scaling decides to terminate an instance in response to a scaling policy, it uses a Termination Policy to determine which instance to terminate. However, there is no capability for Auto Scaling to inform the instance that it is about to be terminated. As you say, this could result in a busy instance being terminated.

There are several ways to handle this:

  • Allowing the termination, but recovering the work that was lost
  • Using Auto Scaling group lifecycle hooks
  • Controlling instance termination yourself

Allowing the termination to happen is perfectly acceptable if your auto scaling group is processing information from a work queue, such as Amazon Simple Queue Service (SQS). In this case, if an instance pulls a message from an SQS queue, the message will be marked as invisible for a period of time. If the instance does not specifically delete the message within the time period, then the message will reappear in the queue. Thus, the work that was lost will be reprocessed.
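
As a minimal sketch of that pattern with boto3 (the queue URL and job handler are placeholders), a worker loop that relies on the SQS visibility timeout looks roughly like this. If the instance is terminated before delete_message() is called, the message becomes visible again and another worker picks it up:

    import boto3

    sqs = boto3.client('sqs')
    QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/work-queue'  # placeholder

    while True:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=1,
            WaitTimeSeconds=20,       # long polling
            VisibilityTimeout=300,    # should exceed the longest expected job
        )
        for msg in resp.get('Messages', []):
            process_job(msg['Body'])  # your job handler (placeholder)
            # Only delete once the work is done; otherwise the message reappears
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg['ReceiptHandle'])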

Using Auto Scaling group lifecycle hooks allows an instance marked for termination to be moved into a Terminating:Wait state. A signal is then sent via SNS or SQS, and Auto Scaling waits for a signal to come back before terminating the instance. See Auto Scaling Group Lifecycle.
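
A sketch of the handler for such a hook, assuming boto3 and hypothetical hook/group names (the termination notice itself arrives via the SNS or SQS target configured on the hook):

    import boto3

    autoscaling = boto3.client('autoscaling')

    def on_termination_notice(instance_id):
        drain_remaining_jobs()  # your own draining logic (placeholder)
        # Tell Auto Scaling it may now proceed with the termination
        autoscaling.complete_lifecycle_action(
            LifecycleHookName='drain-before-terminate',  # hypothetical hook name
            AutoScalingGroupName='worker-asg',           # hypothetical group name
            InstanceId=instance_id,
            LifecycleActionResult='CONTINUE',
        )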

Controlling instance termination yourself means that your own code will determine which instance to terminate. It could do this in a friendly manner by sending a signal to your application on the chosen instance, effectively telling it to finish processing work and signal back when it is ready to be terminated. There are no standard APIs for this functionality -- you'd have to create it yourself, possibly triggered by a CloudWatch alarm and SNS notification.

You can use the DetachInstances API call to remove an instance from an auto scaling group, after which you would finish jobs and then terminate the instance.
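
A possible boto3 sketch of that approach (the group name is a placeholder):

    import boto3

    autoscaling = boto3.client('autoscaling')
    ec2 = boto3.client('ec2')

    def retire_instance(instance_id):
        # Remove the instance from the group so Auto Scaling stops managing it,
        # and lower the desired capacity so a replacement is not launched
        autoscaling.detach_instances(
            InstanceIds=[instance_id],
            AutoScalingGroupName='worker-asg',  # hypothetical group name
            ShouldDecrementDesiredCapacity=True,
        )
        wait_for_jobs_to_finish(instance_id)    # your own logic (placeholder)
        ec2.terminate_instances(InstanceIds=[instance_id])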

John Rotenstein
  • 241,921
  • 22
  • 380
  • 470
  • Thanks John, our current setup is based on a scale-in policy using the SUM statistic of the workers being < 2%, instead of the Average. This is still slightly risky. I like your message queue approach, something I hadn't thought about. I will speak to our team about this, thanks. – Niall Nov 25 '14 at 14:02
  • Oops! I forgot to mention lifecycle hooks. Added. – John Rotenstein Nov 25 '14 at 21:09
  • I use a technique based on lifecycle hooks that I've described in this question: http://stackoverflow.com/questions/17526570/how-can-i-prevent-ec2-instance-termination-by-auto-scaling/26851319#26851319 – nerff Feb 06 '16 at 21:21
1

This question is old, but I have not found anything prettier, so I'll give my perspective.

In a scenario where service workers:

  • can be easily spawned
  • pick up jobs from a pile
  • then stay idle until a further batch of work comes in

you have to handle termination by yourself.

So we have an auto-scaling group in one direction (up) and controlled down-scaling. The way to control it, as Bradley says in the comments:

https://docs.aws.amazon.com/autoscaling/ec2/APIReference/API_TerminateInstanceInAutoScalingGroup.html

A design I came up with in the same situation is the following.


    ┌──────────────────┐            ┌───────────────────────┐
    │ @EventBridge     │    2       │   CloudWatch          │
    │  ( every 1 min ) ├───────────►│ get-metric-statistics │
    │                  │            │                       │
    │     Lambda       │            └───────────────────────┘
    └─┬──┬─────────────┘
      │  │
      │  │      ┌────────────────────────────────────────────┐
      │  │      │     AutoScaling Group                      │
      │  │  1   │                                            │
      │  └──────┼─►describe-auto-scaling-groups              │
      │     3   │                                            │
      └─────────┼─►terminate-instance-in-auto-scaling-group  │
                │    --should-decrement-desired-capacity     │
                │                                            │
                └────────────────────────────────────────────┘

And in ordered steps:

  1. Get the instances inside the auto-scaling group ( aws cli )
  2. Get the statistics for every instance ( aws cli ), such as the average CPUUtilization or other metrics, over a period of time.
  3. Decide inside the Lambda function what to terminate
  4. Mark the desired instances for termination ( aws cli ) and update the desired capacity at the same time.

Beware that if you try to terminate below the minimum capacity, you will get an error. Handle that properly, or update the minimum capacity as well.
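
A rough sketch of such a Lambda with boto3 (the group name and the 10% idle threshold are assumptions; adapt them to your setup):

    import boto3
    from datetime import datetime, timedelta

    autoscaling = boto3.client('autoscaling')
    cloudwatch = boto3.client('cloudwatch')

    ASG_NAME = 'worker-asg'      # hypothetical group name
    IDLE_CPU_THRESHOLD = 10.0    # percent

    def lambda_handler(event, context):
        # 1. Get the instances in the Auto Scaling group
        group = autoscaling.describe_auto_scaling_groups(
            AutoScalingGroupNames=[ASG_NAME])['AutoScalingGroups'][0]
        if len(group['Instances']) <= group['MinSize']:
            return  # don't go below minimum capacity

        now = datetime.utcnow()
        for instance in group['Instances']:
            # 2. Average CPU for this instance over the last 10 minutes
            stats = cloudwatch.get_metric_statistics(
                Namespace='AWS/EC2',
                MetricName='CPUUtilization',
                Dimensions=[{'Name': 'InstanceId', 'Value': instance['InstanceId']}],
                StartTime=now - timedelta(minutes=10),
                EndTime=now,
                Period=300,
                Statistics=['Average'],
            )
            points = stats['Datapoints']
            avg_cpu = sum(p['Average'] for p in points) / len(points) if points else 100.0

            # 3. + 4. Terminate an idle instance and decrement desired capacity together
            if avg_cpu < IDLE_CPU_THRESHOLD:
                autoscaling.terminate_instance_in_auto_scaling_group(
                    InstanceId=instance['InstanceId'],
                    ShouldDecrementDesiredCapacity=True,
                )
                return  # remove at most one instance per run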

cr3a7ure
  • 23
  • 4
0

You can also protect specific instances from scale-in:

https://aws.amazon.com/blogs/aws/new-instance-protection-for-auto-scaling/

The blog post is out of date, but this feature is still available.
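
A minimal boto3 sketch of toggling that protection from the worker itself (the group name is a placeholder): mark the instance as protected while it is busy and clear the flag once it goes idle, so scale-in only ever removes idle instances:

    import boto3

    autoscaling = boto3.client('autoscaling')

    def set_busy(instance_id, busy):
        # Protect the instance from scale-in while it is working on a job
        autoscaling.set_instance_protection(
            InstanceIds=[instance_id],
            AutoScalingGroupName='worker-asg',  # hypothetical group name
            ProtectedFromScaleIn=busy,
        )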

Scott Morken
  • 1,591
  • 1
  • 11
  • 20