We have a tool that manages a server process remotely over TCP. One of the tool's features is to check whether the server is running by attempting to connect to it in a loop for a given period of time.
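To make the behaviour concrete, here is a minimal sketch of such a check loop in Python. It is not the tool's actual code: `is_server_running`, `attempt_timeout` and `pause` are hypothetical names chosen for illustration. The sketch also assumes that each attempt is given its own timeout, capped at the time remaining, which the real tool may or may not do.

```python
import socket
import time


def is_server_running(host, port, duration, attempt_timeout=0.25, pause=0.05):
    """Try to connect to (host, port) repeatedly for up to `duration` seconds.

    Returns True as soon as one connection succeeds, False once the
    deadline has passed without a successful connection.
    """
    deadline = time.monotonic() + duration
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            # Assumption: cap each attempt so that a slow failure cannot
            # overrun the overall deadline by much.
            timeout = min(attempt_timeout, remaining)
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            # Refused, timed out or unreachable: wait briefly and retry.
            remaining = deadline - time.monotonic()
            if remaining > 0:
                time.sleep(min(pause, remaining))
```

Without the per-attempt cap, the total runtime is bounded only by however long the operating system takes to report each failed attempt.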
When we used the tool to check that a server was not running, we noticed that the timings were way off on Windows: attempting connections to a dead server for 2s would result in the tool running for 7s rather than the expected 2s, whereas on Linux the timings are accurate to within a few milliseconds. This matters because we use the tool in startup scripts, so the delay in declaring the server dead increases the runtime of those scripts.
The culprit seems to be the Windows TCP/IP stack: a failing connection attempt to a local port on Windows takes 2-5s to finish, depending on the machine, whereas on Linux it is almost instantaneous. The theory is that the Windows stack does not respect (or isn't interested in) the ICMP Connection Refused message returned by the server and carries on with another connection attempt.
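The per-attempt delay can be measured in isolation with something like the following Python sketch. The helper name `time_failed_connect` is made up for illustration, and the port used should be one that nothing is listening on.

```python
import socket
import time


def time_failed_connect(host, port):
    """Make one blocking connect attempt and report how long it took.

    Returns (connected, elapsed_seconds). No timeout is set, so the
    elapsed time reflects whatever the OS TCP stack does by default.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    start = time.monotonic()
    try:
        sock.connect((host, port))
        connected = True
    except OSError:
        # Covers "connection refused" as well as other connect failures.
        connected = False
    finally:
        elapsed = time.monotonic() - start
        sock.close()
    return connected, elapsed


if __name__ == "__main__":
    # Port 9 is only a guess at an unused local port; adjust as needed.
    print(time_failed_connect("127.0.0.1", 9))
```

Running this on both platforms against the same closed port should show whether the extra seconds come from a single connect call or from the surrounding loop.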
So there are two parts to my question: i) does the above theory seem plausible, and ii) how can I tell Windows to honour the ICMP response?
-- Lauri