Background

I am running a multi-instance service under an EC2 Application Load Balancer (ALB). I am using an Auto Scaling Group (ASG) to increase and decrease instances based on load.

When the ASG does a Scale-In and terminates an instance, I need a task to be performed on the instance before it goes. So I have set up a Termination Lifecycle Hook that triggers an AWS Lambda function. This in turn calls a program on the instance that is about to be terminated, which performs the necessary task. This in itself works fine.

Problem

There seems to be a very long time lag between the following points in time:

  1. The instance is marked for termination and the ALB stops forwarding requests to it.
  2. The Lambda function is called.

It seems to take a few minutes between these 2 points.

I need to shorten this time lag as much as possible. (My reasons are complicated and probably not relevant, but I'll elaborate if asked.) Ideally it would be immediate.

However, I have not found any setting that I can shorten. There is a setting called "Heartbeat timeout", which has a minimum value of 30 seconds, but the time it takes for the Lambda function to get called is much longer than that. So the Heartbeat timeout doesn't seem to be the bottleneck.

Does anyone know what causes this long time lag, and if there is anything I can do to shorten it?

anroy
  • "A scaling cooldown helps you prevent your Auto Scaling group from launching or terminating additional instances before the effects of previous activities are visible." I think you need to reduce the Auto Scaling cooldown period, which defaults to 300 seconds, to a lower value. https://docs.aws.amazon.com/autoscaling/ec2/userguide/Cooldown.html – Jatin Mehrotra Jun 05 '21 at 16:19
  • My "Default cooldown" under the ASG Advanced Configurations is only 15 seconds. This doesn't account for such a long time lag. – anroy Jun 05 '21 at 23:33

1 Answer

Check the connection draining (deregistration delay) setting on the load balancer's target group. It defaults to 300 seconds, which would account for a lag of a few minutes.

The lifecycle hook only starts after the connection draining time has elapsed.

https://docs.aws.amazon.com/autoscaling/ec2/userguide/lifecycle-hooks.html#lifecycle-hooks-overview
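
As a rough illustration, the deregistration delay can be read and lowered through the ELBv2 API. This is only a minimal sketch: it assumes boto3 is installed and credentials are configured, the helper names and the target group ARN in the usage comment are hypothetical placeholders, and 30 seconds is an arbitrary example value, not a recommendation.

```python
from typing import Any

# Target group attribute that controls connection draining on an ALB.
DEREG_KEY = "deregistration_delay.timeout_seconds"


def get_deregistration_delay(elbv2: Any, target_group_arn: str) -> int:
    """Return the current deregistration delay (seconds) of a target group."""
    resp = elbv2.describe_target_group_attributes(TargetGroupArn=target_group_arn)
    for attr in resp["Attributes"]:
        if attr["Key"] == DEREG_KEY:
            return int(attr["Value"])
    raise KeyError(DEREG_KEY)


def set_deregistration_delay(elbv2: Any, target_group_arn: str, seconds: int) -> None:
    """Set the deregistration delay; the API accepts 0 to 3600 seconds."""
    if not 0 <= seconds <= 3600:
        raise ValueError("deregistration delay must be between 0 and 3600 seconds")
    elbv2.modify_target_group_attributes(
        TargetGroupArn=target_group_arn,
        Attributes=[{"Key": DEREG_KEY, "Value": str(seconds)}],
    )


# Example usage (hypothetical ARN; replace with your own target group):
#   import boto3
#   client = boto3.client("elbv2")
#   arn = "arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/NAME/ID"
#   print(get_deregistration_delay(client, arn))
#   set_deregistration_delay(client, arn, 30)
```

Keep in mind that a shorter delay gives in-flight requests less time to finish before the instance is deregistered, so the value is a trade-off rather than something to set to 0 blindly.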

Tony