Basics: a server with 4 drives, 2 of them solid state. One of the solid-state drives appears to have failed. Running VMware 5.0.
We tried to distribute the VMs across several disks and to use RAID, but I'm not sure whether it was set up correctly. The intent was that if one of the disks ever failed, we would still be OK; it may have had the opposite effect. Here is the startup error:
    Failed to start the virtual machine.
    Module DiskEarly power on failed.
    Cannot open the disk '/vmfs/volumes/54d9758a-23d4381c-9118-40167e7bd317/atlassian.somedomain.com/atlassian.somedomain.com_9-000003.vmdk'
    or one of the snapshot disks it depends on.
    5 (Input/output error)
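As I understand it, the -000003.vmdk named in the error is a snapshot delta, so some disk in the VM's snapshot chain can't be opened. Before changing anything, this is roughly how I plan to map out the chain from the ESXi shell; the <datastore> path is a placeholder for my layout, and I'm going by the fields (CID, parentCID, parentFileNameHint) that I know appear in the small text descriptor .vmdk files:

    # Go to the VM's folder on a datastore that is still readable (placeholder path)
    cd /vmfs/volumes/<datastore>/atlassian.somedomain.com/
    ls -lh *.vmdk
    # The small *-00000N.vmdk files are plain-text descriptors (the data is in the
    # matching *-delta.vmdk files); each descriptor names its parent disk:
    grep -H -E '^CID|parentCID|parentFileNameHint' *-00000?.vmdk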
Under Properties for the VM, I can see: [screenshot: the affected hard disk shown as disabled in Settings]
Here are the drives when SSHing into the VMware server: [screenshot: list of available drives]
Here are the contents of HDD1: [screenshot: contents of HDD1's folder]
Contents of HDD2: [screenshot: contents of HDD2's folders]
Contents of SSD1: [screenshot: contents of SSD1's folder]
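For anyone who prefers text to screenshots, these are the kinds of commands I can run from the ESXi shell to produce those listings; the datastore name is a placeholder, and I believe (though I haven't double-checked on 5.0) that esxcli can map datastores to their backing devices:

    # List datastores (the friendly names are symlinks to the UUID directories)
    ls -lh /vmfs/volumes/
    # Map each VMFS datastore to its backing device, to confirm which one is the failed SSD
    esxcli storage vmfs extent list
    # Contents of one VM folder on one datastore (placeholder names)
    ls -lh /vmfs/volumes/<datastore>/atlassian.somedomain.com/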
Finally, when I look at SSD1's atlassian.somedomain.com.vmx file, I can see: [screenshot: contents of the .vmx file]
Note the reference to SSD2 (54d9758a-23d4381c-9118-40167e7bd317), looking for atlassian.somedomain.com_9-000003.vmdk.
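To spell out what that screenshot shows: the disk entries in the .vmx give each virtual disk's file name, and any entry that starts with /vmfs/volumes/<UUID>/ lives on another datastore. A quick way to pull just those lines (run from the VM's folder on SSD1):

    # Show every virtual-disk reference in the VM's config
    grep -i 'fileName' atlassian.somedomain.com.vmx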
What's strange is that some of the other VMs don't have this problem, even though they also have files on that same failed drive.
I'm not sure how to proceed, and before I make an irreversible mistake, I wanted to get feedback on next steps.
I could:
1) Delete the affected hard disk from the VM's hardware list: [screenshot: removing the drive]
2) Alter SSD1's atlassian.somedomain.com.vmx file so it points to version _8 instead of the missing _9 (a rough sketch of what I mean follows this list)
3) Any other suggestions?
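For option 2, this is roughly the change I have in mind. The scsi0:1 entry name and the paths are guesses/placeholders (I'd use whatever my .vmx actually contains), I'd back up the file first, and my understanding is that the snapshot descriptors' CID/parentCID values also have to line up for the resulting chain to be valid:

    # Back up the config before editing (placeholder names)
    cp atlassian.somedomain.com.vmx atlassian.somedomain.com.vmx.bak
    vi atlassian.somedomain.com.vmx
    # ...then change the disk entry that points at the missing file, e.g. from:
    #   scsi0:1.fileName = "/vmfs/volumes/54d9758a-23d4381c-9118-40167e7bd317/atlassian.somedomain.com/atlassian.somedomain.com_9-000003.vmdk"
    # to a descriptor that still exists, e.g.:
    #   scsi0:1.fileName = "<path-to>/atlassian.somedomain.com_8.vmdk"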
NOTE: the purple in the images is where I have covered up the actual domain name.
EDIT: I understand I may end up losing _10 and _11 if they are all interdependent, since I may have to roll everything back to _8. If need be, so be it; I just need to recover as much as possible.
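In case it matters for the answers: before trying either option, I was thinking of copying whatever is still readable to a healthy datastore, so nothing I do makes things worse. A rough sketch, with placeholder datastore and folder names, and assuming vmkfstools -i works the way I think it does (clone a disk, consolidating its chain into the copy):

    # Plain file copy of the VM folder to another datastore (placeholder names)
    mkdir -p /vmfs/volumes/HDD1/atlassian-backup
    cp -R /vmfs/volumes/SSD1/atlassian.somedomain.com /vmfs/volumes/HDD1/atlassian-backup/
    # Or clone from the newest descriptor that still opens, which should also
    # consolidate its snapshot chain into the copy:
    vmkfstools -i /vmfs/volumes/SSD1/atlassian.somedomain.com/atlassian.somedomain.com_8.vmdk \
        /vmfs/volumes/HDD1/atlassian-backup/atlassian_8-recovered.vmdk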