Site-aware cluster migrates VMs to random nodes, not to the preferred site

Question

We have a cluster of 4 virtual nodes (Hyper-V Server VMs, 2 on each of 2 physical hosts) that runs some nested test VMs on those Hyper-V Server nodes.

In short, it is meant as a lab environment where we are trying to replicate a 2-rack, 4-server site-aware failover cluster (each rack is one physical host). We have mounted an iSCSI volume on all nodes; it works as a CSV, and the VMs are stored on it. We created fault domains where the first 2 nodes are in site 1, rack 1 and the remaining 2 are in site 2, rack 2, and configured site 1 as the cluster's preferred site. We also set the preferred site of the VM cluster groups (via Get-ClusterGroup) to site 1.
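A setup like the one described would typically be configured along these lines. This is only a sketch: the site, rack, node and VM group names (Site1, Rack1, Node1, TestVM1 and so on) are placeholders, not the actual names from this lab.

```powershell
# Placeholder names throughout; adjust to the real cluster.

# Create the site and rack fault domains.
New-ClusterFaultDomain -Name "Site1" -Type Site
New-ClusterFaultDomain -Name "Site2" -Type Site
New-ClusterFaultDomain -Name "Rack1" -Type Rack
New-ClusterFaultDomain -Name "Rack2" -Type Rack

# Nest racks under sites, and nodes under racks.
Set-ClusterFaultDomain -Name "Rack1" -Parent "Site1"
Set-ClusterFaultDomain -Name "Rack2" -Parent "Site2"
Set-ClusterFaultDomain -Name "Node1" -Parent "Rack1"
Set-ClusterFaultDomain -Name "Node2" -Parent "Rack1"
Set-ClusterFaultDomain -Name "Node3" -Parent "Rack2"
Set-ClusterFaultDomain -Name "Node4" -Parent "Rack2"

# Cluster-wide preferred site.
(Get-Cluster).PreferredSite = "Site1"

# Per-group preferred site for a VM role (hypothetical group name).
(Get-ClusterGroup -Name "TestVM1").PreferredSite = "Site1"

# Review the resulting hierarchy.
Get-ClusterFaultDomain
```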

The issue is that when we test failover, the VMs migrate to nodes in random sites, not to the preferred site 1 nodes. Why is that? Does it have something to do with the storage (since all nodes use the same CSV, do we have to implement Storage Replica)? Am I missing something?

score 6 · Answer 1 · answered Jul 21 '18 at 13:10


You need to ...

(1) ... use something like Storage Replica, which is built into Windows Server 2016 (it is a DR feature, so active-passive replication only).

https://docs.microsoft.com/en-us/windows-server/storage/storage-replica/storage-replica-overview

Unfortunately it's a Datacenter-only feature. There is some Lite version in Windows Server Standard, but it's limited to 2 TB or so. Or you can ...

(2) ... use a third-party block replication tool (is DoubleTake discontinued?), or ...

(3) ... stick with an iSCSI provider that has its own WAN-aware replication. StarWind Virtual SAN (Free?) is a good one.

https://www.starwindsoftware.com/starwind-virtual-san-free
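Whether the Storage Replica feature from option (1) is offered on a given node can be checked before going further. A minimal sketch, assuming the node runs a Windows Server edition that ships the ServerManager cmdlets:

```powershell
# Show whether the Storage Replica feature is available or installed on this node.
Get-WindowsFeature -Name Storage-Replica

# Install it if it is available (this restarts the node).
Install-WindowsFeature -Name Storage-Replica -IncludeManagementTools -Restart
```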


BaronSamedi1958



Datacenter-only, so that's why I could not add it on any of my nodes. Thanks. – Jon Jul 22 '18 at 07:48
After configuring the storage correctly, will the machines fail over to the preferred site? – Jon Jul 23 '18 at 06:45
Yes, if you do everything properly, SR will fail over the whole LUN with the CSV and the VMs running on top of it. – BaronSamedi1958 Jul 24 '18 at 07:54

We will try the storage device replication solution you mentioned in your answer; if that doesn't work, I'll get my hands on StarWind. For now, thank you! – Jon Jul 24 '18 at 13:41

score -1 · Accepted Answer · answered Jul 24 '18 at 13:38

I've managed to find a solution myself. The problem was that the other nodes had very little free resources left, and because of that the VMs were migrating to the less-loaded hosts. Anyhow, the preferred site now works fine, along with failing over to nodes within the site before failing over to the other site. Thanks to everyone who helped.
