1

When a cluster is configured for High Availability (HA), VMs can be restarted on an alternate host when their host has failed. However, I am wondering: if the host has failed, is there a means within vSphere HA to restart the failed host and attempt to bring it back up? Does vSphere HA just leave the failed host as is? Or must an administrator restart the host manually? Thanks for any insight.

O_O
    The ESXi host just stays as it is, showing its purple screen of death (PSOD) so that you can see what caused the crash. I've had memory leaks in HP Smart Array controllers that led to crashes after 100+ days of running, which I couldn't have diagnosed without the error message and codes displayed on the PSOD. I haven't heard of a means to automatically reset the host after a crash, but there could be an advanced setting similar to the one Windows has for its blue screen of death. – megamorf Mar 25 '15 at 07:33

2 Answers

2

This depends heavily upon the reason why your host failed... Without knowing that, it doesn't make sense to heal it automatically.

A VMware PSOD is much different than a physical hardware failure, which is different than a networking or power issue...

So the answer here is, "it depends..."

ewwhite
1

Good point from Deer Hunter; I should double-check these things.

There is a way to change this behaviour, but check the article first: you might want to be able to view the errors on the diagnostic screen.

Source: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2042500

Connect to the host via the command line using Secure Shell (SSH) or the direct console.

Note: For more information see Using Tech Support Mode in ESXi 4.1 and ESXi 5.x (1017910).

Run the command: 

esxcfg-advcfg -s seconds /Misc/BlueScreenTimeout

Where seconds is a numerical value. For example:

esxcfg-advcfg -s 120 /Misc/BlueScreenTimeout
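
As a minimal sketch of the step above, the snippet below checks that the timeout is a whole number of seconds before building the command line from the KB article. The helper name `build_timeout_cmd` is hypothetical (it is not part of ESXi), and the function only prints the command rather than running it, so it can be tried on any POSIX shell before being used on a host.

```shell
#!/bin/sh
# build_timeout_cmd is a hypothetical helper, not an ESXi tool.
# It validates the timeout and prints the esxcfg-advcfg command
# from the KB article instead of executing it.
build_timeout_cmd() {
    seconds="$1"
    case "$seconds" in
        # Reject empty input or anything containing a non-digit.
        ''|*[!0-9]*)
            echo "error: seconds must be a non-negative integer" >&2
            return 1
            ;;
    esac
    printf 'esxcfg-advcfg -s %s /Misc/BlueScreenTimeout\n' "$seconds"
}
```

For example, `build_timeout_cmd 120` prints the same command as shown above, which can then be pasted into the host's SSH session. To confirm the setting afterwards, `esxcfg-advcfg -g /Misc/BlueScreenTimeout` should report the current value, assuming your ESXi build supports the `-g` (get) flag.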

It's a manual effort, I'm afraid, although (hopefully) a remote method of power-cycling the ESXi host is often in place (iLO, DRAC, PDUs, etc.).

vSphere HA won't restart the system.

Sinn3d
  • Could you please tell me where are the refs? [VMWare KB](http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2300), anything? – Deer Hunter Mar 25 '15 at 09:42