I have an ASG with OldestLaunchTemplate set as its termination policy. One step of our deployment builds the app, creates a new launch template, and points the ASG at that new launch template. The following step scales out the ASG, waits for the new instances to become healthy, and then scales the ASG back in. While this is happening I suspend termination so additional scaling actions do not affect the deploy.
Initially I simply set desired/max to 2x the current desired capacity and then dropped back down to the previous desired value. This worked, but occasionally left behind instances running the old LT because of how the scale out/in was affected by the ASG being multi-AZ. So I updated the logic so that scale out always happens by at least a multiple of the number of AZs, so that each AZ has at least one old and one new instance. This worked for a while, but now I see it terminating instances with the latest LT instead of terminating all the instances with the older LT, even though the instances would remain balanced across AZs.
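For concreteness, here is a rough sketch of the scale-out calculation described above. It is one possible reading of "a multiple of the number of AZs", not the exact deployment code: the function name `scale_out_target` and the rounding rule (add at least as many instances as currently exist, rounded up to a whole multiple of the AZ count) are assumptions made for illustration.

```python
import math


def scale_out_target(current_desired: int, az_count: int) -> int:
    """Return the desired capacity to use during the scale-out step.

    Assumption: the number of instances added is at least the current
    desired capacity, rounded up to a whole multiple of the AZ count,
    so new instances can be spread evenly across every AZ.
    """
    if current_desired < 1 or az_count < 1:
        raise ValueError("current_desired and az_count must both be >= 1")

    # Round the number of added instances up to the next multiple of az_count.
    added = math.ceil(current_desired / az_count) * az_count
    return current_desired + added


if __name__ == "__main__":
    # 4 instances across 3 AZs: add 6 new instances, scale out to 10.
    print(scale_out_target(4, 3))
```

After the new instances are healthy, desired/max would be set back to the original `current_desired` value, and the termination policy is expected to pick which instances go.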
This should be basic ASG functionality, but I'm clearly missing something. What else would cause the ASG to not terminate the instances with the oldest LT each time?
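To help narrow this down, a small diagnostic along these lines could tally instances per AZ and launch template right before and after the scale-in. It is only a sketch: the helper name `tally_by_az_and_template` is hypothetical, and it assumes the input is a single group entry shaped like the `AutoScalingGroups` items returned by boto3's `describe_auto_scaling_groups`.

```python
from collections import Counter


def tally_by_az_and_template(group: dict) -> Counter:
    """Count instances per (AZ, launch template name, version).

    `group` is assumed to be one entry from the "AutoScalingGroups" list
    returned by describe_auto_scaling_groups.
    """
    counts = Counter()
    for instance in group.get("Instances", []):
        # Instances launched without a launch template have no such key.
        template = instance.get("LaunchTemplate") or {}
        key = (
            instance["AvailabilityZone"],
            template.get("LaunchTemplateName", "unknown"),
            template.get("Version", "unknown"),
        )
        counts[key] += 1
    return counts


# Hypothetical usage against a real group ("my-asg" is a placeholder name):
#   import boto3
#   asg = boto3.client("autoscaling")
#   resp = asg.describe_auto_scaling_groups(AutoScalingGroupNames=["my-asg"])
#   for key, n in sorted(tally_by_az_and_template(resp["AutoScalingGroups"][0]).items()):
#       print(key, n)
```

Comparing the tallies from before and after the scale-in should show which AZ the unexpected terminations came from and whether that AZ still held any old-LT instances at the time.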