I cannot for the life of me get a cluster created on any of my physical Windows Server 2016 Datacenter installs, and this is my Hail Mary.

I've tried fresh installs, re-installs, creating the cluster remotely via the MMC snap-in, creating it locally on the servers via PowerShell, creating it with just one node, moving OUs, disabling GPOs, and every combination of these. The cluster log, even at its most detailed level, is extremely vague. The most helpful message I can find is "Credentials failed to notify CAM", and I suspect this message is the key to getting this working.
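For anyone retracing this, here is a minimal sketch of how an exported cluster log could be filtered for that message along with a few surrounding lines. It assumes the log has already been exported to a text file; the function name `find_log_lines` and the `context` parameter are made up for illustration and are not part of any Microsoft tooling.

```python
from typing import Iterable, Iterator, Tuple

# Phrase quoted in the post above; any other phrase is the caller's choice.
DEFAULT_PHRASE = "Credentials failed to notify CAM"


def find_log_lines(
    lines: Iterable[str],
    phrase: str = DEFAULT_PHRASE,
    context: int = 2,
) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, text) for every matching line plus
    `context` lines before and after it. Matching is case-insensitive.
    (Hypothetical helper, not an official API.)"""
    buffered = list(lines)
    needle = phrase.lower()
    wanted = set()
    for i, text in enumerate(buffered):
        if needle in text.lower():
            lo = max(0, i - context)
            hi = min(len(buffered), i + context + 1)
            wanted.update(range(lo, hi))
    # Emit each selected line once, in file order, with 1-based numbering.
    for i in sorted(wanted):
        yield i + 1, buffered[i].rstrip("\n")
```

To use it, open the exported log with whatever encoding it was written in, pass the file object as `lines`, and print the pairs it yields. Seeing the entries logged just before and after the message can help narrow down which component is failing.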

Days of searching have not turned up anything particularly helpful, or anything I have not already tried. When I create a cluster on a fresh 2016 Datacenter VM in the same OU and with the same GPOs applied as the physical servers, it succeeds. That VM is hosted on one of the very physical servers that fails, no less.

What gives? Are there new requirements/gotchas for WSFC in Server 2016 that I've somehow missed? I've run several 2012 R2 clusters on these same physical machines with no issue.

  • I doubt anyone has a direct answer to this yet; 2016 is still relatively new. If I were you, I'd contact Microsoft support. – Noor Khaldi Feb 18 '17 at 15:15

1 Answer

I got this figured out. The error was caused by an unknown problem with my Active Directory setup rooted in a series of unfortunate events. This does not come as a surprise, since I had read elsewhere that most WSFC problems stem from Active Directory/authentication/permissions issues, limitations and misconfigurations.

As far as I can tell, the first thing that went wrong was a low/out of memory condition on the domain controller VM holding all of the FSMO roles.

The next thing that happened was an unexpected reboot of this same domain controller VM, possibly from the low/out of memory condition.

At this point, I lost the ability to use any local accounts on my domain machines. Even the local built-in administrator account ceased to function, displaying "The handle is invalid" whenever a login was attempted. Since WSFC relies on the local account CLIUSR, this global corruption of local account functionality prevented WSFC from operating properly.

I ended up rebuilding both domain controllers for this Active Directory site. Even after that, I kept getting "The handle is invalid" errors, and the WSFC role stayed non-functional until I completely reinstalled Windows Server on the affected machines.