
I'm running several Windows Server 2019 clusters (e.g. Hyper-V, File Server). On all machines that have clustered roles, I get the following error (with different hard disk numbers):

Log Name: System
Source:   Disk
Event ID: 11
Level:    Error
Message:  The driver detected a controller error on \Device\Harddisk1\DR1.

From my observations, the error is always thrown for hard disks that are currently offline on one cluster member because they are online on another cluster member. So it happens on disks that cluster roles use for data and on the disk witness in the quorum.
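To check this observation against exported logs, a rough Python sketch like the following might help. It assumes the event messages are already available as plain strings and that the set of disk numbers that are offline on this node (because they are online on another node) has been collected separately, e.g. by hand. `harddisk_number` and `split_events` are hypothetical helper names, not part of any Windows or cluster tooling.

```python
import re

# Matches device paths such as \Device\Harddisk1\DR1 in the event message.
_DISK_PATH = re.compile(r"\\Device\\Harddisk(\d+)\\DR\d+")


def harddisk_number(message):
    """Return the hard disk number named in a Disk event message, or None."""
    match = _DISK_PATH.search(message)
    return int(match.group(1)) if match else None


def split_events(messages, disks_online_elsewhere):
    """Split messages into (matches_pattern, other).

    disks_online_elsewhere is an assumed, manually collected set of disk
    numbers that are offline on this node because another node owns them.
    """
    matches_pattern, other = [], []
    for message in messages:
        number = harddisk_number(message)
        if number is not None and number in disks_online_elsewhere:
            matches_pattern.append(message)
        else:
            other.append(message)
    return matches_pattern, other
```

Any message that ends up in `other` would contradict the pattern described above and would be worth a closer look. The sketch only sorts the events; it says nothing about whether the errors are harmless.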

I'm not sure whether this is expected in this case and I can ignore those errors, or whether there is a misconfiguration that has to be fixed.

Can someone confirm whether this is normal behaviour or whether something might be broken?

stackprotector
  • Seeing the same here. It also appears to go away when anti-virus (Sophos in our case) is turned off. – Brian May 14 '21 at 15:53
  • I have Microsoft Defender Antivirus enabled. Turning it off is not an option for me. Did you find tolerable exclusions for your anti-virus? – stackprotector May 14 '21 at 20:33
  • I have not found exclusions that work yet and only turned antivirus off temporarily for testing. – Brian May 14 '21 at 20:47
  • Have you reviewed the list of exclusions specifically for clustering? https://docs.microsoft.com/en-us/microsoft-365/security/defender-endpoint/configure-server-exclusions-microsoft-defender-antivirus?view=o365-worldwide – SamErde May 19 '21 at 11:51
  • I would also check to see if an administrator has disabled the automatic exclusions feature. https://docs.microsoft.com/en-us/microsoft-365/security/defender-endpoint/configure-server-exclusions-microsoft-defender-antivirus?view=o365-worldwide – SamErde May 19 '21 at 11:52
  • One more interesting note is that "just disabling the antivirus software is insufficient in most cases. Even if you disable the antivirus software, the filter driver is still loaded when you restart the computer." (From https://docs.microsoft.com/en-US/troubleshoot/windows-server/high-availability/not-cluster-aware-antivirus-software-cause-issue) – SamErde May 19 '21 at 11:54
  • Experiencing the same issue and stumbled across this. Do you have any further information since you first found this, or even better, the ability to answer your own question for folks like me who come across this while researching the same issue? – Ryan Aug 12 '21 at 16:34
  • No, no solution to this so far. – stackprotector Aug 15 '21 at 07:53
  • I'm seeing this in a nested SQL cluster on an S2D cluster, also always on the nested passive node. Disabling/whitelisting AV (Defender) did not help mitigate the issue. From a hypervisor perspective, I tried both VHD Sets and VHDX shared drives; there was no difference. Mostly this is annoying because it triggers a SCOM alert for "delayed write failed" errors, I think just because it shares the same event ID of 11 for the Disk source. I am going to have to write a hotfix MP to ignore this without disabling actual Delayed Write Failed alerts. This is all Server 2019. – Conure Nov 25 '21 at 05:18

0 Answers