I know that with VM Component Protection (VMCP) on VMware ESXi, the host can shut down or restart a virtual machine when a datastore that the VM resides on enters the All Paths Down (APD) state.

Is it possible to pause a VM when the datastore it resides on goes into APD state? By "pause" I mean temporarily halting the VM's execution while technically keeping it running, not "suspend", which is clearly impossible because the datastore is gone.

Example use case: an iSCSI server has a fault and locks up, and recovering it takes longer than the APD timeout. Currently, I/O in Linux guests outlasts the SCSI timeout in /sys/block/sda/device/timeout and the guests remount the root filesystem read-only, requiring a full reboot and fsck. If the VMs are non-critical or are themselves redundant, it might be preferable to just leave them paused until the datastore recovers (or an admin decides to restart them).
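For reference, the guest-side timeout mentioned above can be inspected and raised from inside a Linux guest. The following is only a minimal sketch of that stopgap, not a pause mechanism: the function names `scsi_timeouts` and `raise_scsi_timeouts` are made up for illustration, and it assumes the standard sysfs layout where each SCSI disk exposes `device/timeout` in seconds. A longer timeout only delays the point at which the guest gives up; it does not guarantee the filesystem stays read-write.

```python
from pathlib import Path


def scsi_timeouts(sysfs_block="/sys/block"):
    """Return {device_name: timeout_seconds} for block devices that expose
    a SCSI command timeout. Devices without device/timeout are skipped."""
    root = Path(sysfs_block)
    if not root.is_dir():
        # Not a Linux system, or sysfs is not mounted here.
        return {}
    result = {}
    for dev in sorted(root.iterdir()):
        timeout_file = dev / "device" / "timeout"
        if timeout_file.is_file():
            result[dev.name] = int(timeout_file.read_text().strip())
    return result


def raise_scsi_timeouts(minimum, sysfs_block="/sys/block"):
    """Raise every timeout below `minimum` seconds up to `minimum`.
    Returns the names of the devices that were changed.
    Writing to the real sysfs tree requires root."""
    changed = []
    for name, current in scsi_timeouts(sysfs_block).items():
        if current < minimum:
            target = Path(sysfs_block) / name / "device" / "timeout"
            target.write_text(f"{minimum}\n")
            changed.append(name)
    return changed
```

Calling `scsi_timeouts()` with no arguments reads the live values. The value written this way does not persist across reboots, so a permanent change would have to be applied at boot (for example through a udev rule).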

This is how VMware Workstation handles the loss of a virtual disk, for example. I understand that although Workstation has a pause function, ESXi might simply lack it, in which case the answer to this question might just be "No".

Josh
  • If you're in an APD, your VM is no longer running (the OS is dead). That's why VMCP only has shutdown or restart. Your example case is the same as the APD shutdown option: non-critical, and an admin needs to restart them. – SpiderIce Oct 24 '17 at 14:24
  • I don't fully agree @SpiderIce -- Linux guests, for example, aren't dead. They'll retry SCSI commands on the dead disks for `/sys/block/sda/device/timeout` seconds (which VMware defaults to 180) and then remount the filesystems on those disks read-only. But they're not dead. If they could be _paused_ before that, the read-only mount could be avoided. I've experienced this with VMware Workstation – Josh Oct 24 '17 at 14:27
  • Officially it's not supported, but if you want to run unsupported you can try this process https://www.virtuallyghetto.com/2013/03/how-to-pause-not-suspend-virtual.html – SpiderIce Oct 24 '17 at 14:39
  • Thanks @SpiderIce -- that's exactly the mechanics going on at the lower OS level with Workstation. I'm wondering if ESXi can do that in response to a datastore in APD state the way Workstation does on a disk full state. If not (which seems to be the case) I'll need to script some way to handle this within the hosts and/or configure ESXi to power-off on APD, as leaving hosts online serving traffic from read-only mounts presents an availability risk. – Josh Oct 24 '17 at 14:42
