
We have a server application that runs into TCP connection exhaustion at around 4000 connections. This happens roughly every 3 or 4 weeks. The vendor that wrote the server application tells us, after examining the output of netstat -b, that some connections remain open even after the clients have dropped.

I've been given the task of investigating why a particular client application is not closing its TCP connections properly. My belief is that if a client computer is shut down, the server cannot POSSIBLY still report an established TCP connection to that client. Unfortunately, I cannot find any information to validate my view. I don't want to waste any more time investigating a potential problem that I don't think can even be a problem.

TL;DR:

Can a server report an established connection to a computer that is turned off?

Josh Smeaton

3 Answers


TCP makes no effort to detect a dead connection except on a side that is transmitting data. It is the responsibility of the application code calling into the TCP stack to do this. What protocol is involved here? (The one on top of TCP.)

It's a horrifically ugly "solution", but you can enable TCP keepalives. There's more in this article.
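For reference, enabling keepalives on a single socket might look roughly like the sketch below. The helper name `enable_keepalive` and the timer values are illustrative assumptions, not anything taken from the application in question. The per-socket timer options differ by platform, so the sketch only sets the ones Python exposes on the system it runs on.

```python
import socket

def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive for one socket (hypothetical helper).

    idle_s, interval_s and probes are illustrative values, not recommendations.
    """
    # SO_KEEPALIVE is the portable on/off switch.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    if hasattr(socket, "SIO_KEEPALIVE_VALS"):
        # Windows: timers are set through an ioctl, in milliseconds.
        sock.ioctl(socket.SIO_KEEPALIVE_VALS,
                   (1, idle_s * 1000, interval_s * 1000))
    else:
        # Linux and some other systems: per-socket options, in seconds.
        # Set only the options this platform actually defines.
        for name, value in (("TCP_KEEPIDLE", idle_s),
                            ("TCP_KEEPINTVL", interval_s),
                            ("TCP_KEEPCNT", probes)):
            opt = getattr(socket, name, None)
            if opt is not None:
                sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock
```

With keepalives on, the stack probes an idle peer and eventually reports an error on the socket if nothing answers, which gives the application a chance to close it.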

David Schwartz
  • You probably meant layer, and it's the session layer on top of the transport layer, which is where TCP resides. – Rilindo Dec 07 '11 at 01:38
  • @Rilindo: In practice, and in this particular case, you have an application making calls to the TCP stack. The protocol on top of TCP (HTTP, POP, or whatever) typically specifies how to do this, because the designers of those protocols knew that TCP couldn't do this itself. – David Schwartz Dec 07 '11 at 01:43
  • Whoops, my mistake. It's layer 7, then. – Rilindo Dec 07 '11 at 01:45
  • I'm not going to enable keepalives at this point, but it's handy to know the option exists. That article seems to suggest there is already a 2 hour timeout. AFAIK, connections are being kept open for days or weeks. – Josh Smeaton Dec 07 '11 at 04:15
  • Most likely, keepalives are not enabled. Some piece of code has to enable them. It sounds like the application is just plain broken if it doesn't even enable keepalives and has no timeout/reap mechanism. What protocol are we talking about? (HTTP? SMTP? FTP?) – David Schwartz Dec 07 '11 at 04:45
  • The servers from this particular vendor implement keep-alives with a protocol they call ADDP. It just so happens that this client does not connect to this server with ADDP (it's optional). For those wondering, the communication protocol is a custom socket-based messaging protocol. We're logging a ticket with the vendor to fix the client app. Thanks for the direction! – Josh Smeaton Dec 07 '11 at 22:35

Yes it is possible. As David and Paul stated in their answers, there's no mechanism in TCP (other than TCP keep-alives, which are optional) to detect a half-open connection. It's up to the application vendor to determine the state of the connection and to take appropriate action accordingly.

As far as TCP is concerned there's no detection of or distinction between a half-open connection and a long idle connection.
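A small sketch of what this looks like from the application's side, assuming a hypothetical helper `probe_peer` that peeks at a socket for a short time: a peer that closed cleanly is visible (the read returns zero bytes), and so is a reset, but a peer that is merely idle and a peer that has vanished both just produce silence.

```python
import socket

def probe_peer(sock, wait_s=0.2):
    """Peek at a connected socket and classify what the peer has done.

    Hypothetical helper for illustration. Returns one of:
      "data"   - bytes are waiting to be read
      "closed" - the peer closed its side (FIN received)
      "reset"  - the connection was reset (RST received)
      "silent" - nothing arrived; the peer may be idle OR gone
    """
    sock.settimeout(wait_s)
    try:
        # MSG_PEEK leaves any pending data in the receive buffer.
        data = sock.recv(1, socket.MSG_PEEK)
    except socket.timeout:
        return "silent"
    except ConnectionResetError:
        return "reset"
    return "data" if data else "closed"
```

The "silent" result is the whole problem: without keep-alives or an application-level timeout, nothing ever turns it into "closed".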

You're going to have to start troubleshooting this from layer 1 (physical) of the OSI model up to layer 7 (application) to figure out where the problem is occurring. My advice would be to install and run a packet capture program on one of the affected clients until the problem occurs, and then analyze the capture to try and determine what's causing the client to not close the connection.

joeqwerty

When a workstation wants to close a connection with a server it sends a TCP FIN. If the client is not behaving properly and not closing its connections, they could in fact remain established on the server. You can set timeouts for open connections on the server to clean these up - although it would be better to find the cause. What port are the open connections coming into? Once you know what service is being accessed you might be able to identify the client app that's hitting the server.
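As a rough illustration of a server-side timeout, the sketch below tracks when each connection last sent data and closes any that have been quiet for too long. The class name `IdleReaper` and its methods are hypothetical; a real server would call `touch` from its receive path and `reap` from a periodic timer, and would pick a limit that fits the protocol.

```python
import time

class IdleReaper:
    """Close connections that have been silent longer than max_idle_s.

    Hypothetical sketch: 'conn' is any hashable object with a close()
    method, for example a socket.
    """

    def __init__(self, max_idle_s, clock=time.monotonic):
        self.max_idle_s = max_idle_s
        self.clock = clock
        self.last_seen = {}  # conn -> time data was last received

    def touch(self, conn):
        """Record activity; call on accept and whenever data arrives."""
        self.last_seen[conn] = self.clock()

    def forget(self, conn):
        """Stop tracking a connection that was closed normally."""
        self.last_seen.pop(conn, None)

    def reap(self):
        """Close and return every connection idle for too long."""
        now = self.clock()
        stale = [c for c, t in self.last_seen.items()
                 if now - t > self.max_idle_s]
        for conn in stale:
            del self.last_seen[conn]
            try:
                conn.close()
            except OSError:
                pass  # already broken; nothing more to do
        return stale
```

This only cleans up the symptom. A client that is alive but legitimately quiet for longer than the limit would also be dropped, so the limit has to be chosen with the application's traffic pattern in mind.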

Paul Ackerman
  • We know the client that is the apparent problem. It's a desktop app that hundreds of our users use daily. I'm assuming the problem is the application crashing, a hard reset, or an end-task. I thought in all of those situations that the server would be aware of the dropped connection though. – Josh Smeaton Dec 07 '11 at 02:20
  • As far as the server is concerned the connection is open unless it receives a FIN or a RST from the client. Without that, the server assumes that the connection is still established but that the client has no data to send. There's no difference between a half-open connection and an idle connection as far as the server is concerned. – joeqwerty Dec 07 '11 at 02:29
  • @joeqwerty: True, however the server *could* decide that it does not want to keep open a connection indefinitely, and could implement some timeout/close mechanism. That is what David Schwartz meant in his answer by "it is the responsibility of the application code". So the server *can* make a difference between a half-open connection and an idle connection if it wants to. To TCP however, there is indeed no difference between a half-open connection and an idle connection. – sleske Dec 07 '11 at 07:58
  • @sleske: Agreed that the application code could do this, but TCP cannot unless keep-alives are enabled. – joeqwerty Dec 07 '11 at 12:55