Related - but not the same: Azure Cloud Service Upgrade Domain server restart interval

I have a cloud service (extended support) in Azure with two instances of a role in an availability set. The instances have different Update Domain values, 0 and 1, as can be seen in this screenshot.

enter image description here

When a deployment runs, or when Azure updates the VMs (e.g. Windows updates), I expect the VM in Update Domain 0 to finish updating and return to the "Started" state before the VM in Update Domain 1 begins its update. Unfortunately, this is not the case.

Each VM startup takes a significant amount of time (around an hour) because it has to install custom software, copy data, start the relevant services, and so on. What I'm seeing is that the instance in Update Domain 0 is updated first: it is in status "Starting" while the other instance is in status "Started". Then, after about 30-40 minutes, while the instance in Update Domain 0 is still "Starting", the second instance begins its update. Both VMs are then in the "Starting" state, and the service is down for up to 20 minutes or so until the first instance completes its update.

I have another, similar cloud service with 3 instances, and I'm seeing exactly the same behaviour there: for about 5 minutes all 3 instances are in status "Starting", even though they are in 3 different Update Domains.
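To put numbers on the overlap, here is a rough sketch of the timing as I understand it. It assumes the fabric moves on to the next Update Domain after a fixed wait, regardless of whether the previous instance has reached "Started". That fixed wait is only my guess, not documented behaviour, and `full_outage_minutes` and its parameters are made-up names for illustration.

```python
def full_outage_minutes(instances: int, startup_minutes: float, wait_minutes: float) -> float:
    """Minutes during which every instance is in "Starting" at the same time.

    Assumption (hypothetical, not documented Azure behaviour): the fabric
    begins updating update domain k at k * wait_minutes, whether or not the
    earlier domains are back in "Started", and each instance needs
    startup_minutes to come back up.
    """
    last_goes_down = (instances - 1) * wait_minutes  # last domain starts its update
    first_comes_back = startup_minutes               # domain 0 reaches "Started"
    return max(0.0, float(first_comes_back - last_goes_down))


# Two instances, ~60 min startup, second domain begins after ~40 min
print(full_outage_minutes(2, 60, 40))  # 20.0
# If the fabric waited for the full startup, there would be no overlap
print(full_outage_minutes(2, 60, 60))  # 0.0
```

With a 60-minute startup and a 40-minute wait this gives the 20 minutes of downtime I observe on the two-instance service.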

Am I missing something or is this a bug of some sort in Azure Fabric? Maybe there's a hard limit on how long the fabric will wait before proceeding with the next instance?

EDIT: Here's a screenshot of both instances being updated at the same time, even though they are in different Update Domains:

enter image description here

Aleks G
