
I am running Netcat on an Ubuntu 18.04.3 LTS server, to listen on port 469. This server gets frequent TCP requests from other machines to port 469, which I use to monitor the server for uptime. I start Netcat with:

nc -kl 469

and I can see the process is live with:

$ ps -aux | grep 469

which produces this output:

root 11041 0.0 0.1 13596 1060 ? S Aug31 0:21 nc -kl 469

This system works well for about 24 to 28 hours, but then Netcat stops responding. After investigating, I believe the issue is that the Recv-Q buffer "fills up". Normally Recv-Q stays at zero right up to the point where Netcat stops responding. After it has stopped responding, Recv-Q sits at a constant 2 (instead of the normal 0). I can check this with "ss" as follows.

Running ss -tnl gives the output below, where the abnormal Recv-Q of 2 is visible on port 469.

$ ss -tnl
State    Recv-Q   Send-Q   Local Address:Port   Peer Address:Port
LISTEN   0        128      0.0.0.0:22           0.0.0.0:*
LISTEN   0        64       0.0.0.0:42587        0.0.0.0:*
LISTEN   0        128      0.0.0.0:46663        0.0.0.0:*
LISTEN   0        128      0.0.0.0:111          0.0.0.0:*
LISTEN   2        1        0.0.0.0:469          0.0.0.0:*
LISTEN   0        128      127.0.0.53%lo:53     0.0.0.0:*
LISTEN   0        128      [::]:22              [::]:*
LISTEN   0        64       [::]:44057           [::]:*
LISTEN   0        128      [::]:55085           [::]:*
LISTEN   0        128      [::]:111             [::]:*
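
The manual check above can be scripted. The following is only a minimal sketch: it assumes the column order shown in the output above (Recv-Q in the second column, local address and port in the fourth), and the function name listen_recvq is made up for illustration.

```shell
#!/usr/bin/env bash
# listen_recvq PORT
# Reads `ss -tnl` output on stdin and prints the Recv-Q value of the
# listening socket bound to PORT (prints nothing if no such socket).
# Assumption: columns are State, Recv-Q, Send-Q, Local Address:Port, ...
listen_recvq() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            # The port is whatever follows the last colon, which also
            # covers entries such as [::]:22 and 127.0.0.53%lo:53.
            n = split($4, parts, ":")
            if (parts[n] == port) { print $2; exit }
        }'
}

# Typical use on the affected server:
#   ss -tnl | listen_recvq 469
```

Fed the output shown above, this prints 2 for port 469 and 0 for port 22.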

We have several other Ubuntu servers that are also running Netcat listening on port 469, in exactly the same way. They do NOT fail; they have been up for weeks. But this server fails again and again, even after a restart, and always after around 24+ hours. The only difference between this and the other servers (that I can think of) is that this server also has an NFS volume mounted (as can be seen from the listener on port 111 above).

What could be causing this? Can I clear the Recv-Q somehow (from bash), so that I could clear it at a regular interval (as a temporary fix)? Any help is much appreciated.
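
For the temporary-fix idea, one possible stopgap is to restart the listener whenever Recv-Q on port 469 is non-zero, since closing the listening socket discards whatever is queued on it. The sketch below is purely illustrative and is not something confirmed in this thread: the function names needs_restart and check_once are hypothetical, the awk parsing assumes the ss column layout shown above, and a single non-zero reading may be transient on a busy server, so this may restart more often than strictly needed.

```shell
#!/usr/bin/env bash
# Hypothetical watchdog sketch: restart "nc -kl 469" when the listening
# socket's Recv-Q is non-zero. Intended to be run periodically (for
# example from root's crontab, since port 469 is a privileged port).
PORT=469

# needs_restart RECVQ: succeeds only if RECVQ is a number greater than 0.
needs_restart() {
    [[ "$1" =~ ^[0-9]+$ ]] && (( $1 > 0 ))
}

# check_once: read the current Recv-Q for $PORT and restart nc if needed.
check_once() {
    local q
    # Assumption: Recv-Q is column 2 and Local Address:Port is column 4.
    q=$(ss -tnl | awk -v p=":$PORT" '$1 == "LISTEN" && $4 ~ p"$" { print $2; exit }')
    if needs_restart "$q"; then
        pkill -f "nc -kl $PORT"                  # stop the stuck listener
        nohup nc -kl "$PORT" >/dev/null 2>&1 &   # start a fresh one
    fi
}

# To use it, call check_once at the end of the script and schedule the
# script, e.g. every five minutes.
```

This only hides the symptom; it does not explain why the queue stops draining.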

Prospero

1 Answer

I have now found the answer to this question, and I wanted to post it here, just in case it can help anyone else.

The issue was that the server also had strongSwan installed, and that the incoming TCP requests to port 469 came over an IPsec connection from another server. The IPsec connection was sometimes interrupted for a very short time while it was re-keying (about every 24 hours). If that happened in the middle of an ongoing TCP SYN/ACK handshake to port 469, the connection was left in limbo, and "nc -kl" would get stuck with packets in the Recv-Q buffer.

The solution was to configure strongSwan so that the re-keying happened without interruption. The lesson was that, rather than searching for a problem with the Recv-Q buffer or Netcat, it was necessary to understand the root cause, which in this case meant understanding why the TCP SYN/ACK handshake could not complete.

Prospero