
I have 2 Windows Server R2 machines hosting Exchange 2013 in a DAG. Every night during the backup window (1:00 a.m.), users find OWA unavailable (the mailbox loads endlessly), or they can log in successfully but can't see any messages or send mail. During that time I also get "Microsoft-Windows-Failover Clustering" error code 1135. Could someone advise on this? Where should I look, and what should I check? I'm a total noob in Exchange.

Event 1135: EVENT_NODE_DOWN

Cluster node '%1' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

Event 5377: OPERATION_EXCEEDED_TIMEOUT

An internal Cluster service operation exceeded the defined threshold of '%2' seconds. The Cluster service has been terminated to recover. Service Control Manager will restart the Cluster service and the node will rejoin the cluster.

These events occur on each node in the cluster.
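One way to confirm that the cluster events really line up with the backup is to export the System event log and check which 1135 and 5377 entries fall inside the backup window. The sketch below is only an illustration: the node names `EX01` and `EX02`, the timestamps, the record layout, and the two-hour window length are all assumptions, not data from these servers.

```python
from datetime import datetime, time, timedelta

# Hypothetical sample records; real ones would come from an export
# of the System event log on each DAG node.
EVENTS = [
    {"node": "EX01", "event_id": 1135, "timestamp": "2020-01-15 01:07:12"},
    {"node": "EX02", "event_id": 5377, "timestamp": "2020-01-15 01:09:40"},
    {"node": "EX01", "event_id": 1135, "timestamp": "2020-01-15 14:30:00"},
    {"node": "EX02", "event_id": 7036, "timestamp": "2020-01-15 01:10:00"},
]

CLUSTER_EVENT_IDS = {1135, 5377}       # the two events quoted above
BACKUP_START = time(1, 0)              # backup starts at 1:00 a.m.
BACKUP_DURATION = timedelta(hours=2)   # assumed length of the backup window


def in_backup_window(ts, start=BACKUP_START, duration=BACKUP_DURATION):
    """Return True if ts falls inside that day's backup window.

    This simple check assumes the window does not cross midnight.
    """
    window_start = datetime.combine(ts.date(), start)
    return window_start <= ts < window_start + duration


def cluster_events_during_backup(events):
    """Keep only the 1135/5377 records logged inside the backup window."""
    hits = []
    for event in events:
        ts = datetime.strptime(event["timestamp"], "%Y-%m-%d %H:%M:%S")
        if event["event_id"] in CLUSTER_EVENT_IDS and in_backup_window(ts):
            hits.append((event["node"], event["event_id"], ts))
    return hits


if __name__ == "__main__":
    for node, event_id, ts in cluster_events_during_backup(EVENTS):
        print(f"{ts:%Y-%m-%d %H:%M:%S}  {node}  event {event_id}")
```

If nearly all cluster events land inside the window on both nodes, the backup is the likely trigger; events outside it would point to a separate network or hardware problem.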

  • How do you back up? What software do you use? Is your backup pausing the VM (if it is a VM) at any stage during the backup? Have you also checked the backup software's logs? – Jayce Jan 16 '20 at 07:34
  • Yes, sorry, these are 2 hardware servers, and the backup is performed by the built-in Windows backup application as a full VSS backup. – Никита Jan 16 '20 at 08:25

1 Answer


VSS writers freeze databases when they capture a snapshot to ensure that no data changes as the backup is being taken. The only time you should ever see a major pause will be during the agent's first backup, when a full backup of the entire server is needed. This can be done during the night to prevent impact to the database.

Jayce
  • So why does the passive host lose its DAG membership when it is backed up? – Никита Jan 24 '20 at 11:10
  • Could you provide the detailed information about the error? – Jayce Jan 28 '20 at 08:41
  • Added errors to the main body of question. – Никита Jan 29 '20 at 09:15
  • I think the error message explains it: VSS writers freeze databases when they capture a snapshot, so the node lost its connection ("This could also be due to the node having lost communication with other active nodes in the failover cluster.") and was removed from the active failover cluster membership. – Jayce Jan 30 '20 at 06:50
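
The reasoning in the last comment can be put into rough numbers. A failover cluster removes a node once it misses a set number of consecutive heartbeats, so the tolerated stall is the heartbeat interval multiplied by that threshold. The sketch below assumes an interval of 1000 ms and a threshold of 5 missed heartbeats; these are illustrative values only, and the real ones are stored in the cluster's `SameSubnetDelay` and `SameSubnetThreshold` properties. The stall durations are hypothetical as well.

```python
def heartbeat_tolerance_seconds(delay_ms, threshold):
    """Longest stall a node survives: heartbeat interval times missed-heartbeat limit."""
    return delay_ms * threshold / 1000.0


def node_would_be_removed(stall_seconds, delay_ms=1000, threshold=5):
    """Return True if a stall of this length exceeds the heartbeat tolerance.

    delay_ms and threshold default to assumed values; read the actual
    settings from the cluster before drawing conclusions.
    """
    return stall_seconds > heartbeat_tolerance_seconds(delay_ms, threshold)


if __name__ == "__main__":
    # Hypothetical stall lengths, in seconds, during a backup.
    for stall in (3, 8):
        print(stall, node_would_be_removed(stall))
```

Under these assumptions, any stall longer than 5 seconds during the backup would be enough to produce event 1135, whether it comes from the snapshot itself or from disk or network load the backup creates.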