I'm using oVirt version 4.2.3.8-1.el7, connected to 2 IBM PureFlex servers with 10 nodes in total (5 per server).
oVirt suddenly lost its connection to all of the nodes, although the VMs running on them keep working without a problem. I'm receiving the following error for every node:
VDSM Node6 command GetCapabilitiesAsyncVDS failed: Message timeout which can be caused by communication issues
The nodes are reachable over the network: I can SSH to each of them from the oVirt management machine.
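A working SSH login only shows that port 22 is reachable. As far as I know, the engine talks to VDSM on TCP port 54321 by default, so that port is worth testing separately from the management machine. Below is a minimal sketch of such a check. The `port_open` helper is my own illustration, not part of oVirt, the host names are passed on the command line, and the port number assumes the default was not changed.

```python
import socket
import sys

VDSM_PORT = 54321  # assumed default VDSM port; adjust if your setup differs


def port_open(host, port=VDSM_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Usage: python check_vdsm.py node1 node2 ...
    for host in sys.argv[1:]:
        state = "open" if port_open(host) else "NOT reachable"
        print(f"{host}:{VDSM_PORT} {state}")
```

This only tests that the TCP port accepts connections. It does not perform the TLS handshake or any VDSM call, so an open port does not rule out a certificate problem or a hung vdsmd service on the node.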
I restarted the oVirt management server once, and it reconnected to the nodes for a while, but the problem came back later.
Can anyone help me figure out how to fix this?