Original thread on VMware for reference: https://communities.vmware.com/thread/490718
Hello,
We have been troubleshooting an issue that prevents our vCenter Server from connecting to some of our remote hosts. It has affected two different vCenter Servers, running 5.1 and 5.5 on Windows Server 2008 R2 and 2012 R2.
Process leading to the error
- We are able to add hosts to a data center after a host reboot or fresh vCenter install
- If the MPLS link at our primary data center goes down (maintenance or otherwise), we lose connectivity to all remote hosts
- Only one site, our secondary data center, is able to reconnect without issue
- No other remote sites are able to reconnect
Troubleshooting
- Disabled IPv6 across VMware infrastructure (Windows Servers, ESXi hosts)
- Increased handshakeTimeoutMs to 120000
- Restarted management network
- Cleared ARP table
- Verified that lockdown mode is disabled
- Disabled proxy ARP across network
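For anyone who wants to rule out basic reachability separately from vpxd, a raw TCP check from the vCenter server to a host can be sketched as below. This is only a sketch under assumptions: the function names are made up for illustration, and ports 443 and 902 are assumed here as the usual vCenter-to-host management ports, so adjust them if your environment differs. Error 10060 in the vpxd log is the Windows socket timeout code, so a plain connect with a timeout exercises roughly the same step.

```python
import socket


def tcp_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS failures alike.
        return False


def check_host(host, ports=(443, 902), timeout=5.0):
    """Map each port to whether it accepted a TCP connection."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}


# Example (placeholder hostname from the log; run from the vCenter server):
# print(check_host("xxxesxi01.xxx.com"))
```

Running this from the vCenter server right after an MPLS outage, against both a site that reconnects and one that does not, should show whether the failure is at the TCP level or higher up.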
Notes
- We have a single ESX 4.1 host that is able to reconnect without issue (it has only experienced one disconnect, but came back without issue, unlike its 5.5 counterpart)
- We're able to connect to the hosts via vSphere client and SSH without issue
- The network team is troubleshooting the issue as well, but we've not been able to rule out VMware as the culprit
Logs
vpxd
2014-09-24T14:00:14.785-05:00 [05920 warning 'Default'] Failed to connect socket; , >, e: system:10060(A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond)
2014-09-24T14:00:14.785-05:00 [05920 error 'HttpConnectionPool-000001'] [ConnectComplete] Connect failed to ; cnx: (null), error: class Vmacore::SystemException(A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond)
2014-09-24T14:00:14.785-05:00 [05852 error 'httphttpUtil' opID=6159800D-000000AB-d6] [HttpUtil::ExecuteRequest] Error in sending request - A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
2014-09-24T14:00:14.785-05:00 [05852 error 'vpxdvpxdHostAccess' opID=6159800D-000000AB-d6] [VpxdHostAccess::Connect] Failed to discover version: vim.fault.HttpFault
2014-09-24T14:00:14.786-05:00 [05852 info 'commonvpxLro' opID=6159800D-000000AB-d6] [VpxLRO] -- FINISH task-internal-5070 -- datacenter-31 -- vim.Datacenter.queryConnectionInfo --
2014-09-24T14:00:14.786-05:00 [05852 info 'Default' opID=6159800D-000000AB-d6] [VpxLRO] -- ERROR task-internal-5070 -- datacenter-31 -- vim.Datacenter.queryConnectionInfo: vim.fault.NoHost:
--> Result:
--> (vim.fault.NoHost) {
--> dynamicType = ,
--> faultCause = (vmodl.MethodFault) null,
--> name = "xxxesxi01.xxx.com",
--> msg = "",
--> }
--> Args:
-->
Connection error
Call "Datacenter.QueryConnectionInfo" for object "XXX" on vCenter Server "VCENTER" failed.
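To see at a glance which operations are failing, the vpxd lines above could be grouped by opID with something like the following. It is a rough sketch: the regular expression is inferred only from the sample lines in this post, not from any VMware log format documentation, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Header pattern inferred only from the sample lines in this post:
# timestamp [thread level 'component' opID=...] message
LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+[+-]\d{2}:\d{2}) "
    r"\[(?P<thread>\d+) (?P<level>\w+) '(?P<component>[^']*)'"
    r"(?: opID=(?P<opid>[^\]\s]+))?\] "
    r"(?P<message>.*)"
)


def parse_vpxd_line(line):
    """Return a dict of fields for a log header line, or None otherwise."""
    match = LINE_RE.search(line)
    return match.groupdict() if match else None


def problems_by_opid(lines):
    """Group warning/error messages by opID (None when a line has no opID)."""
    grouped = defaultdict(list)
    for line in lines:
        entry = parse_vpxd_line(line)
        if entry and entry["level"] in ("warning", "error"):
            grouped[entry["opid"]].append(entry["message"])
    return dict(grouped)
```

Note that the vim.fault.NoHost line in the sample is logged at info level, so filtering on level alone would miss it; matching on "ERROR task-" in the message may be a useful addition.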
Thanks