
Today I had a power outage and my switches went down. For some reason this caused my cluster to freak out (something to look into later), and one of my VMs did not start back up. Looking into why it would not start, I found that the BIN and VSV files in the VM's GUID folder, under the Virtual Machines folder on my CSV, were just missing! The GUID folder for the VM still exists, but it has been cleaned out and the BIN and VSV files (2 total) are gone!!

Currently, I do not have any snapshots of the VM. It's something I was about to start doing for all my VMs on a scheduled basis but have not gotten around to yet.

What has happened here? Where did they go? Is there any way to get this back?

UPDATE #1

When I try to bring the VM online from the Failover Cluster Manager I get the following error...

Cluster resource 'Virtual Machine SERVER01' of type 'Virtual Machine' in clustered role 'SERVER01' failed. The error code was '0x2' ('The system cannot find the file specified.').

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

UPDATE #2

I also ran a cluster validation and I'm currently going through the report that just finished. I'm hoping it will lead me in the right direction, but I wanted to ask here in case it's a simple fix.

UPDATE #3

This seems to be the issue but I'm not sure how to find out what LUN was used in the past to select it again...

[Two screenshots, not preserved. Per the comments below, the VM's disk setting reported "Physical drive not found".]

UPDATE #4

OK, I tracked it down to one of two disks, and I'm not sure which one it should be. This is really a question about Cluster Shared Volumes and/or cluster quorum. I need to know which of these might be attached to a file server as a resource. I'm not sure if both of them just "float" from cluster node to cluster node as needed, or if the QUORUM drive needs to be hosted by a file server, etc. Can someone tell me which of these drives is more likely to be connected as a resource to my file server?

Currently it looks as if all VMs have the CSV drive (where all the VMs are kept) as a resource so I'm guessing it's not the CSV drive that I need to add back to my file server and that drive might just "float" in the cluster (for lack of a better term). Seeing as how CSV works with all nodes having a C:\ClusterStorage\Volume1 on the host nodes at the same time, my money is on the QUORUM.

Can someone maybe confirm my logic (or attack it) please?

Arvo Bowen
  • If the virtual machines are powered off then there shouldn't be any .BIN or .VSV files. Have you tried powering on the virtual machines? If so, what happens? Also, snapshots aren't backups... which is what it sounds like you intend them to be. – joeqwerty Feb 12 '18 at 15:24
  • Sorry @joeqwerty in my haste I had a malformed question without the error. It's added now. This is just something that freaked me out and I panicked running straight to SF! – Arvo Bowen Feb 12 '18 at 15:27
  • The error message makes me think it's looking for the vhd/vhdx files for the virtual machines. Is the CSV online? – joeqwerty Feb 12 '18 at 15:29
  • Yes it is and the xml file for the VM looks ok (does not look malformed). Other VMs are using the CSV with no issues since the power has been restored. My VHDX file looks fine sitting where it always is. Maybe the config thinks it's in a different location now? Let me check that... – Arvo Bowen Feb 12 '18 at 15:32
  • Looks like that might have been the issue! Digging into it more right now, but it seems that one of the disks I have set up on the VM (a direct-attached LUN from my storage appliance) may be offline... It says it can't find it, but I can't see anywhere that shows what the target used to be. Right now it just says "Physical drive not found". I have updated the question with an image. – Arvo Bowen Feb 12 '18 at 15:39
  • At this point I just need to find a way to convert that string (from UPDATE#3) into a readable format so I can know what LUN it's trying to use. Maybe it's the QUORUM? How can I find this out? – Arvo Bowen Feb 12 '18 at 15:52
  • @joeqwerty I think I finally found out what the issue is but need a little help with the choice I'm about to make (which drive to add to my file server). Thanks for pointing out it looking for the vhdx files... That made me go look at other things leading me down the correct path investigating. – Arvo Bowen Feb 12 '18 at 16:14

1 Answer


From what I can tell, in the end the Cluster Shared Volume and the Quorum seem to be functioning just fine, and those are not drives that needed to be added to my file server.

I vaguely remember that a while back I had to reclaim some space on my storage device that I had carved out for a different purpose. When I started thinking about it, I realized that what more than likely happened was that I removed that Virtual Drive on my storage unit, absorbed it into another virtual drive (making it a little bigger), and then simply forgot to remove it from the VM's config in the settings GUI.

This was never an issue until the VM had to restart and could not find the old retired LUN that used to be there. My solution was simple enough in my case: delete the SCSI hard drive configured in the VM's settings GUI, then bring the VM back online. Freaking out about my missing BIN and VSV files was unwarranted, as they are only created when the machine is up and running. They hold the memory and saved state for the running VM. Thanks to joeqwerty for pointing this out in his comments.
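For anyone who would rather do this from PowerShell than the settings GUI, the same cleanup might look roughly like the sketch below. I did it through the GUI, so treat this as an untested outline rather than what I actually ran. It assumes the Hyper-V and FailoverClusters modules are available on the node that owns the VM, and the controller number and location shown are placeholders: read the real values from the first command's output before removing anything.

```powershell
# List every disk attached to the VM with its controller slot.
# A stale pass-through disk should stand out as the entry that no longer
# maps to a LUN that exists on the storage unit.
Get-VMHardDiskDrive -VMName 'SERVER01' |
    Format-Table ControllerType, ControllerNumber, ControllerLocation, DiskNumber, Path

# Remove only the stale entry. The slot values below are hypothetical;
# substitute the ones reported above for the retired LUN.
Remove-VMHardDiskDrive -VMName 'SERVER01' `
    -ControllerType SCSI -ControllerNumber 0 -ControllerLocation 1

# Bring the clustered role back online and check the resource state.
Start-ClusterGroup -Name 'SERVER01'
Get-ClusterGroup -Name 'SERVER01' | Get-ClusterResource
```

Removing the drive entry only changes the VM's configuration. It does not touch any VHDX file or any LUN on the storage side.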

Arvo Bowen