1

Around one year ago I installed an ESXi 5.5 to virtualize an old physical Windows 2000 server that, unfortunately, cannot be easily upgraded/migrated/replaced. Main ESXi datastore was created on top of a RAID5 array built around 4x2TB internal SATA disks.

Once converted, I run common sanitization steps, including the installation of proper VMware Tools.

Right after the W2K, another couple of VMs have been defined on the very same ESXi, including a CentOS VM that, as for the W2K, is running 24/7/365.

I've configured the "Virtual Machine Startup/Shutdown" so that all the VMs need to be manually started, after a restart of the hypervisor.

Everything is running correctly, with only an exception: if a power-cut is suffered by the hypervisor (while the VMs are powered UP), while the CentOS VM is "powered up" within seconds, the W2K VM takes more than one hour. Please note that I'm not talking about the time required by Windows 2000 to complete the boot process. I'm talking about the time required by ESXi to start-up the VM. In detail:

  • time t0: hypervisor is started, after a power-cut;

    ...waiting some time just to be sure that ESXi properly completed its own boot process...

  • time t1: right click on W2K VM => Power => Power On

    ...message "VM name is starting" appears in the notification area, with the progress bar going from 0 to 45% in about 1 second ...

    ...message "VM name is starting" stay there, fixed at 45%, for around 80 (eighty!) minutes, or even more ...

    ...progress bar suddenly goes from 45% to 100%...

    ...message "VM is powered on" appears in the notification area...

  • time t2: VM console start showing that W2K started the boot process

    ...two/three minutes are needed to W2K to complete its own boot process...

  • time t3: W2K services are available on the network.

Let me insist in saying that 80 minutes are spent between t1 and t2 (and not between t2 and t3).

Here below you can find some real number, taken from today's power-cut (...and yes! There's no UPS protecting such a server. Please, be kind and don't ask for details!):

enter image description here

As you can see by yourselves:

  • power got back right before 17:21;
  • ESXi completed the boot around 17:21;
  • I did the "Power on" at 18:41:53;
  • ESXi told me that the VM was powered on at 20:15:51;
  • at the same time (20:15:51) ESXi complained that VMware tools were not installed on the VM

As for this last point (missing VMware tools), they are properly installed, as you can see below:

enter image description here

Please note that such a delay is not registered on the CentOS VM: it's powered up, by ESXi, within a couple of seconds from the "Power On" request.

I suspected that ESXi is going to perform a sort of "sanity check" of the filesystem hosting the VM data and as the W2K VM has:

  • disk1: 67GB
  • disk2: 1 GB
  • disk3: 610GB
  • disk4: 360GB
  • disk5: 1,9TB

probably it takes lot of time to check everything. I tried to check, via ESXi console, if there were some "fsck-like" process running but found nothing. I also searched for log files but... being unable to find anything relevant.

Also: if the power-cut happens when the VM is powered-off, than a normal, quick, power-on is registered and all of the above does not apply.

So the question is: does someone have a clue about what the problem is, and how can I solve or troubleshoot it?


Update 1:

It looks like ESXi needs lots of time "opening disks". I just found this:

~ # grep took /vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/vmware.log
2015-08-21T18:15:51.282Z| vmx| I120: DISK: Opening disks took 5636937 ms.

and 5636937 ms are exactly 93 minutes!

Just above that logline I see details for each of the 5 disks (lines starting with # have been added by me, for clarification):

# server_2k_COMPANY_1-000003.vmdk => DISK 2 - 1 GB
2015-08-21T16:41:54.613Z| Worker#1| I120: DISKLIB-LIB   : Opened "/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY_1-000003.vmdk" (flags 0xa, type vmfs).
2015-08-21T16:41:54.613Z| Worker#1| I120: DISK: Disk '/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY_1-000003.vmdk' has UUID '60 00 c2 93 75 e2 fd 3d-40 43 f8 25 f0 5a b8 9b'
2015-08-21T16:41:54.613Z| Worker#1| I120: DISK: OPEN '/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY_1-000003.vmdk' Geo (517/128/32) BIOS Geo (0/0/0)
#
# server_2k_COMPANY-000003.vmdk => DISK 1 - 67 GB
2015-08-21T16:43:40.637Z| Worker#0| I120: DISKLIB-LIB   : Opened "/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY-000003.vmdk" (flags 0xa, type vmfs).
2015-08-21T16:43:40.637Z| Worker#0| I120: DISK: Disk '/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY-000003.vmdk' has UUID '60 00 c2 9a 0a c0 10 79-6e 68 52 0d 0d 7a 23 cb'
2015-08-21T16:43:40.637Z| Worker#0| I120: DISK: OPEN '/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY-000003.vmdk' Geo (8844/255/63) BIOS Geo (0/0/0)
#
# server_2k_COMPANY_3-000003.vmdk => DISK 4 - 360 GB
2015-08-21T16:45:28.355Z| Worker#3| I120: DISKLIB-LIB   : Opened "/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY_3-000003.vmdk" (flags 0xa, type vmfs).
2015-08-21T16:45:28.355Z| Worker#3| I120: DISK: Disk '/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY_3-000003.vmdk' has UUID '60 00 c2 95 13 e3 ad 8d-62 5f 94 12 52 53 de 82'
2015-08-21T16:45:28.355Z| Worker#3| I120: DISK: OPEN '/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY_3-000003.vmdk' Geo (47834/255/63) BIOS Geo (0/0/0)
#
# server_2k_COMPANY_2-000003.vmdk => DISK 3 - 610GB
2015-08-21T17:09:00.957Z| Worker#2| I120: DISKLIB-LIB   : Opened "/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY_2-000003.vmdk" (flags 0xa, type vmfs).
2015-08-21T17:09:00.957Z| Worker#2| I120: DISK: Disk '/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY_2-000003.vmdk' has UUID '60 00 c2 9d 4d d2 57 e6-da 9e 1b 8c 54 ab 39 62'
2015-08-21T17:09:00.957Z| Worker#2| I120: DISK: OPEN '/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY_2-000003.vmdk' Geo (79736/255/63) BIOS Geo (0/0/0)
#
# server_2k_COMPANY_4-000001.vmdk => DISK 5 - 1,9 TB
2015-08-21T18:15:51.268Z| vmx| I120: DISKLIB-LIB   : Opened "/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY_4-000001.vmdk" (flags 0xa, type vmfs).
2015-08-21T18:15:51.269Z| vmx| I120: DISK: Disk '/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY_4-000001.vmdk' has UUID '60 00 c2 91 b1 7d cd 92-c5 13 e7 aa fc 81 b1 33'
2015-08-21T18:15:51.269Z| vmx| I120: DISK: OPEN '/vmfs/volumes/54b687a6-e5de9210-27fe-549f35062eb4/server_2k_COMPANY/server_2k_COMPANY_4-000001.vmdk' Geo (253981/255/63) BIOS Geo (0/0/0)
#
2015-08-21T18:15:51.282Z| vmx| I120: DISK: Opening disks took 5636937 ms.

So it took:

  • 2 minutes to open DISK1 (67GB);
  • 2 minutes to open DISK4 (360GB);
  • 24 minutes to open DISK3 (610GB);
  • 66 minutes to open DISK5 (1,9TB);

I wonder why DISK3 and DISK5 were so time-consuming.

As for snapshots, here below you can see the current status: enter image description here


Damiano Verzulli
  • 4,078
  • 1
  • 21
  • 33
  • For testing can you remove the snapshot to consolidate the disk http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2003638 I suspect its your snapshot that slow you down – yagmoth555 Aug 22 '15 at 00:01

0 Answers0