
I have a VM scale set that I want to set up auto-scaling for and I want to know how abrupt scaling down is. Before VMs get destroyed, I want to make sure any active long-running requests complete. Is this possible?

I am curious about the following:

  • How does auto-scaling decide which VMs to destroy when scaling down?
  • Is there any notification inside the VM that it is scheduled to be destroyed?
  • Can a VM that is scheduled to be destroyed control when it gets destroyed (and hold off destruction until all requests are complete)?

The VMs in my scale set will be behind a load balancer and I need to be able to drain connections (remove VMs from the backend pool) before destruction.

gregjhogan

2 Answers


How does auto-scaling decide which VMs to destroy when scaling down?

By default, autoscale deletes the instance with the highest instance ID. For example, if the instance IDs are 0, 2, and 3, the scale set deletes instance 3. You can list the instance IDs of a scale set with PowerShell:

PS C:\> Get-AzureRmVmssVM -ResourceGroupName "vmss" -VMScaleSetName "vmss"

ResourceGroupName   Name Location            Sku Capacity InstanceID ProvisioningState
-----------------   ---- --------            --- -------- ---------- -----------------
VMSS              vmss_0   westus Standard_D1_v2                   0         Succeeded
VMSS              vmss_2   westus Standard_D1_v2                   2         Succeeded

Is there any notification inside the VM that it is scheduled to be destroyed?

As far as I know, autoscale notifies the administrators and contributors of the resource by email; the VM itself does not receive a notification.

Can a VM that is scheduled to be destroyed control when it gets destroyed (and hold off destruction until all requests are complete)?

For now, a VM can't hold off its destruction until all requests are complete.

In most cases a scale set is deployed behind a load balancer that distributes requests in a "round-robin" fashion, so an instance stops receiving requests only once it has been deleted.

I want to make sure any active long-running requests complete. Is this possible?

As far as I know, you can choose different OS metrics to drive autoscale, but you can't guarantee that the scale set deletes a VM instance only after its long-running requests complete.

Jason Ye
  • So how does anyone use a scale set behind a load balancer? It seems logical there would be some way to take down the probe and drain connections before scaling down, right? – gregjhogan Feb 24 '17 at 14:51
  • you are right, but we can't add some settings to take down the probe in azure load balancer. – Jason Ye Feb 28 '17 at 08:57
  • Is there any way to get the exact time when autoscale occurred in the past? – AskMe Dec 03 '18 at 17:03
  • Lack of a pre-destroy hook seems very problematic. I'm trying to auto-scale Presto for example. It has a graceful shut-down API I could call but if the VM literally just deletes itself randomly without asking things to shut down, of course it will cause problems. Auto scaling seems a little useless in this regard. – John Humphreys Jan 04 '19 at 15:38
  • @JasonYe, I'm interested in cloud services (classic) auto-scaling in. Is it the same case for cloud services with respect to not waiting for requests to complete before stopping the service? I was looking for some documentation, but all I could find was a few lines on the https://learn.microsoft.com/en-us/azure/architecture/best-practices/auto-scaling that talks about long running tasks. – Douglas Waugh Jul 02 '20 at 11:58

Autoscale has several scale-in policies that determine which VMs are removed. For example, "NewestVM" removes the most recently created VMs first. You can read more here: https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-scale-in-policy
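
As a rough illustration of how these policies differ, here is a minimal Python sketch. It is a simplified model, not Azure's implementation: it ignores the availability-zone and fault-domain balancing that the real policies apply first, and the `Instance` type and `pick_for_scale_in` function are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Instance:
    """Hypothetical stand-in for one scale set VM."""
    instance_id: int
    created: datetime


def pick_for_scale_in(instances, policy="Default"):
    """Return the instance a simplified scale-in policy would remove first."""
    if not instances:
        raise ValueError("scale set has no instances")
    if policy == "Default":
        # Simplified: zone and fault-domain balancing are ignored here,
        # leaving only the "highest instance ID goes first" rule.
        return max(instances, key=lambda vm: vm.instance_id)
    if policy == "NewestVM":
        return max(instances, key=lambda vm: vm.created)
    if policy == "OldestVM":
        return min(instances, key=lambda vm: vm.created)
    raise ValueError(f"unknown policy: {policy}")


fleet = [
    Instance(0, datetime(2020, 1, 1)),
    Instance(2, datetime(2020, 1, 2)),
    Instance(3, datetime(2020, 1, 3)),
]
print(pick_for_scale_in(fleet).instance_id)              # 3
print(pick_for_scale_in(fleet, "OldestVM").instance_id)  # 0
```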

Regarding notification inside the VM about termination, there's a newer feature called "termination notification". It raises a Terminate event that you can read from the Scheduled Events endpoint of the instance metadata service, for example:

curl -s -H "Metadata:true" "http://169.254.169.254/metadata/scheduledevents?api-version=2019-01-01"

Read more here: https://azure.microsoft.com/en-us/blog/azure-virtual-machine-scale-sets-now-provide-simpler-management-during-scalein/

The VM can either wait for the termination timeout to expire, or send a POST request to the metadata endpoint to approve the termination before the timeout.
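
A minimal Python sketch of that wait-or-approve flow, assuming the notification arrives as a `Terminate` event on the Scheduled Events endpoint of the instance metadata service, and that approval is a POST with a `StartRequests` body to the same endpoint. The helper names and the `vm_name` argument (the instance name as it appears in the event's `Resources` list) are assumptions for illustration; nothing here touches the network until `fetch_events` or `approve` is called from inside a VM.

```python
import json
import urllib.request

# Assumed endpoint and API version for Scheduled Events.
SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2019-01-01"
)


def fetch_events():
    """GET the current Scheduled Events document (only works inside a VM)."""
    req = urllib.request.Request(SCHEDULED_EVENTS_URL, headers={"Metadata": "true"})
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)


def terminate_event_ids(document, vm_name):
    """Return IDs of Terminate events that list vm_name as an affected resource."""
    return [
        event["EventId"]
        for event in document.get("Events", [])
        if event.get("EventType") == "Terminate"
        and vm_name in event.get("Resources", [])
    ]


def approval_body(event_ids):
    """Build the POST body that approves the given events early."""
    payload = {"StartRequests": [{"EventId": eid} for eid in event_ids]}
    return json.dumps(payload).encode("utf-8")


def approve(event_ids):
    """POST the approval so termination proceeds before the timeout."""
    req = urllib.request.Request(
        SCHEDULED_EVENTS_URL,
        data=approval_body(event_ids),
        headers={"Metadata": "true"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status
```

A service would typically poll `fetch_events()` every few seconds, stop accepting new work once `terminate_event_ids(...)` is non-empty, finish in-flight requests, and then call `approve(...)`. If it never approves, the VM is deleted when the timeout expires.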

To drain connections, one option is to block the health probe IP address 168.63.129.16. The VM then shows as "unhealthy" in the load balancer or application gateway (depending on which you use), so no new traffic is sent to it while existing connections stay active.
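
Note that the sketch below swaps in a different technique from the firewall block described above: instead of dropping probe traffic, the application fails its own HTTP health probe while it drains. It assumes the load balancer uses an HTTP probe that treats any non-200 response as unhealthy; the `/health` path, port 8080, and the `draining` flag are hypothetical choices for this example.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Set this flag (for example when a termination notice arrives) to start draining.
draining = threading.Event()


def probe_status():
    """HTTP status the health probe should see: 503 while draining, else 200."""
    return 503 if draining.is_set() else 200


class HealthHandler(BaseHTTPRequestHandler):
    """Serves only the hypothetical /health probe path."""

    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        self.send_response(probe_status())
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        # Keep frequent probe requests out of the log.
        pass


def serve(port=8080):
    """Run the probe endpoint until the process exits."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

Calling `draining.set()` makes subsequent probes fail, so after the probe's failure threshold the instance should stop receiving new connections while in-flight requests finish.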

Dmitry Shmakov
  • finally a reasonable solution exists - more than 3 years later :) – gregjhogan Aug 19 '20 at 21:12
  • @gregjhogan but a funny thing is: it is impossible to remove a single VM of a scale-set from Azure load-balancer :))) you can either remove all scale-set as a group, or not at all. So it renders scale-set kinda useless for connection drain scenario when VMs scaled back in. – Dmitry Shmakov Aug 20 '20 at 22:04