
We have a tool that manages a server process remotely over TCP. One of the tool's features is to check whether the server is running by attempting to connect to it in a loop for a given period of time.
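For concreteness, a check loop of this kind might look roughly like the sketch below. This is not the actual tool: the function name `server_is_running` and the `retry_interval` parameter are made up for illustration. Capping each attempt at the time remaining is one possible design choice, intended to keep a slow connection failure from pushing the total runtime far past the requested period.

```python
import socket
import time


def server_is_running(host, port, duration, retry_interval=0.05):
    """Return True if a TCP connection succeeds within `duration` seconds."""
    deadline = time.monotonic() + duration
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            # Cap each attempt at the time left (an assumption of this
            # sketch, not necessarily what the real tool does).
            with socket.create_connection((host, port), timeout=remaining):
                return True
        except OSError:
            # Refused, timed out or unreachable: pause briefly, then retry,
            # without sleeping past the deadline.
            pause = min(retry_interval, max(0.0, deadline - time.monotonic()))
            time.sleep(pause)
```

Calling `server_is_running("127.0.0.1", 8080, 2.0)` (the port is just an example) should return within roughly two seconds whether or not anything is listening.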

When we used the tool to check that a server was not running, we noticed that the timings were way off on Windows: e.g. attempting connections for 2s to a dead server would result in the tool running for 7s rather than the expected 2s, whereas on Linux the timings are accurate to a few milliseconds. The problem is that we use the tool in startup scripts, so the delay in declaring the server dead increases the runtime of those scripts.

It turns out that the culprit seems to be the Windows TCP/IP stack: a failing connection attempt to a local port on Windows takes 2-5s to finish depending on the machine whereas on Linux it's almost instantaneous. The theory is that the Windows stack does not respect / isn't interested in the ICMP Connection Refused message returned by the server and carries on with another connection attempt.
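The per-attempt delay can be measured in isolation with something like the following sketch. The helper name `time_failed_connect` is hypothetical, and the default port 1 is only assumed to be closed on the machine under test.

```python
import socket
import time


def time_failed_connect(host="127.0.0.1", port=1):
    """Time one blocking connect() to a port that is assumed to be closed."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    start = time.monotonic()
    try:
        sock.connect((host, port))
        error = None  # something is listening after all
    except OSError as exc:
        error = exc  # e.g. connection refused
    finally:
        elapsed = time.monotonic() - start
        sock.close()
    return elapsed, error
```

Running `elapsed, error = time_failed_connect()` on both platforms and printing the two values gives a direct comparison of how long a single refused connection takes.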

So there are two parts to my question: i) does the above theory seem plausible, and ii) how can I tell Windows to honour the ICMP response?

-- Lauri

liwp

1 Answer


Someone answered that the actual response from the server is a TCP RST rather than an ICMP message, but that answer has since been deleted.

Anyhow, I did some more digging and looked at some Wireshark traces of failing connection attempts:

i) the response is indeed a TCP RST,ACK rather than an ICMP message as I originally thought it would be

ii) the Windows TCP/IP stack is implemented to retry the connection attempt after a RST,ACK hoping against hope that the server might have magically re-appeared within the connection timeout [1]

iii) the Administrator can set the TcpMaxConnectRetransmissions registry entry in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters to a lower value than the default 3 (WinNT) or 2 (Win2k).

So, the answer is to tweak the registry and lower TcpMaxConnectRetransmissions to 0 or 1. My only problem with this 'solution' is that AFAICT it will also affect connection attempts where the initial SYN is dropped in the network. In that case setting the value to 0 is a bad idea, and setting it to 1 will still result in a longer than necessary runtime for my scripts.
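Before changing anything, it may be worth checking whether the value is set at all. Here is a minimal sketch using Python's standard `winreg` module. The key path and value name are the ones given above, the function name `read_connect_retransmissions` is made up, and whether a particular Windows version actually honours the value is not something this snippet can confirm.

```python
import sys

# Key path and value name as quoted in this answer.
KEY_PATH = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
VALUE_NAME = "TcpMaxConnectRetransmissions"


def read_connect_retransmissions():
    """Return the configured value, or None if unset or not on Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return value
    except FileNotFoundError:
        # Key or value is absent, so the stack presumably uses its default.
        return None
```

A result of `None` on Windows suggests the entry has never been created and the defaults described above apply.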

[1] For more information: http://support.microsoft.com/kb/175523

-- Lauri

liwp