
I run an Apache2 server which uses the Shibboleth daemon (shibd) as a federated authentication module. Certain server connections using Shibboleth seem to get stuck permanently in the CLOSE_WAIT state.

tcp       38      0 blah.blah:57346 shib.server.:8443 CLOSE_WAIT 
tcp       38      0 blah.blah:45601 shib.server2:8443 CLOSE_WAIT
tcp       38      0 blah.blah:41737 shib.server3:5057 CLOSE_WAIT 

From what I can find out, CLOSE_WAIT means that the remote server has closed its end of the connection, but the local application has not yet closed its own socket, as it should. I suspect shibd is responsible somehow.

Needless to say, if enough CLOSE_WAIT connections accumulate, I have a problem.

Trying to get rid of the CLOSE_WAIT connections by simply using

/etc/init.d/networking restart

does not work. In fact, networking seems to refuse to shut down and restart, and I get a SIOCADDRT: File exists error (i.e. networking is trying to start without having stopped first). I get the same problem with ifup -a.

So I have two questions: one may be easy, the other harder.

  1. What's a good way to force networking to restart, and force whatever connections are stuck in CLOSE_WAIT to clear?
  2. Any ideas about how to fix Shibboleth and force the shibd module to behave?
RJT

2 Answers


The answer to 1, unfortunately, is to restart the process that still has references to the connections. Nothing else will force it to close them.
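
To find out which process that is, one option is to group the CLOSE_WAIT sockets by owner. The following is only a sketch: the function name close_wait_owners is made up for illustration, and it assumes the Linux net-tools column layout of netstat -tnp (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, PID/Program name).

```shell
# Reads `netstat -tnp` style lines on stdin and prints "<count> <PID/program>"
# for every owner that holds at least one socket in CLOSE_WAIT.
# Assumption: column 6 is the TCP state and column 7 is "PID/Program name".
# A program name containing spaces is cut at the first space.
close_wait_owners() {
    awk '$6 == "CLOSE_WAIT" { owner[$7]++ }
         END { for (p in owner) print owner[p], p }' | sort -k1,1nr -k2,2
}

# Usage (run as root so netstat can show the owning PID):
#   netstat -tnp | close_wait_owners
```

If shibd appears at the top of that list, it is the process to restart. Without root privileges netstat typically shows "-" instead of the PID, so the grouping is not informative in that case.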

David Schwartz

Eight years, seven months and a different Stack Exchange account later, and shibd (now in a new version) still has this behaviour.

The best, but entirely kludgy, way around the problem is to use a cron job, about once a day, to run

service shibd restart

In the past this was itself a headache, as large metadata files meant shibd took many minutes to reload. The current version of shibd allows for 'as needed' loading of metadata from a remote host, which means reloads are now less problematic.
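
A possible variation on the daily restart, which I have not described above and which is only a sketch, is to have cron run a small script more often and restart shibd only when CLOSE_WAIT sockets have actually piled up. The names count_close_wait, over_limit and THRESHOLD, and the default value of 50, are made up for illustration; the column layout assumed is that of Linux netstat -tn.

```shell
#!/bin/sh
# Sketch: restart shibd only when too many sockets sit in CLOSE_WAIT.
# THRESHOLD is a hypothetical default; tune it to your own traffic.
THRESHOLD="${THRESHOLD:-50}"

# Count CLOSE_WAIT sockets in `netstat -tn` style input read from stdin.
# Assumption: column 6 holds the TCP state.
count_close_wait() {
    awk '$6 == "CLOSE_WAIT" { n++ } END { print n + 0 }'
}

# Exit status 0 when the count ($1) is strictly above the limit ($2).
over_limit() {
    [ "$1" -gt "$2" ]
}

# Intended use from cron (left commented out so the functions can be tried safely):
# if over_limit "$(netstat -tn | count_close_wait)" "$THRESHOLD"; then
#     service shibd restart
# fi
```

Note that this counts every CLOSE_WAIT socket on the host, not only those owned by shibd, so a different leaky process would also trigger the restart.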

fred2