I have the following setup: remote devices running active zabbix_agentd (version 2.0) using socat to tunnel through an HTTPS proxy.

On the server side: Apache with a proxy service that allows CONNECT to localhost:10051 (zabbix_proxy). The connection is encrypted with SSL and requires a valid client certificate.

On the client side: Socat beta8 command line:

socat -d -d -ly "TCP-LISTEN:10051,bind=127.0.0.1,reuseaddr,fork" "PROXY:127.0.0.1:10051,connect-timeout=30 | OPENSSL:<server_domain_name>:443,connect-timeout=30,cafile=<CA_CERT_FILE>,certificate=<CLIENT_CERT_FILE>"

zabbix_agentd is configured to work in active mode only and to connect to localhost:10051.

Problem: on some machines (a small minority), some of the connections don't close properly, and the socat child process hangs with its TCP socket in the CLOSE_WAIT state. The socket in question has the local endpoint 127.0.0.1:10051, so it seems that zabbix_agentd is the culprit and is not closing the socket correctly. The hanging socat processes consume a lot of CPU cycles and eventually crash the system. The only way to clear them is with SIGKILL.
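
For anyone who wants to check for the same symptom, here is a rough sketch of how the stuck sockets could be listed on a Linux client. It assumes /proc/net/tcp is available and that the host is little-endian (the kernel prints IPv4 addresses as native-endian hex). The names decode_addr and close_wait_sockets are made up for illustration and are not part of socat or Zabbix.

```python
import socket
import struct

CLOSE_WAIT = "08"  # TCP state code for CLOSE_WAIT in /proc/net/tcp


def decode_addr(hex_addr):
    """Turn '0100007F:2743' into ('127.0.0.1', 10051).

    Assumption: little-endian host, so the address bytes are reversed.
    """
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def close_wait_sockets(proc_net_tcp_text, local_port=10051):
    """Return (local_ip, local_port, remote_ip, remote_port) tuples for
    sockets in CLOSE_WAIT whose local port matches local_port."""
    found = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4 or fields[3] != CLOSE_WAIT:
            continue
        local_ip, lport = decode_addr(fields[1])
        if lport != local_port:
            continue
        remote_ip, rport = decode_addr(fields[2])
        found.append((local_ip, lport, remote_ip, rport))
    return found


if __name__ == "__main__":
    try:
        with open("/proc/net/tcp") as f:
            for entry in close_wait_sockets(f.read()):
                print(entry)
    except OSError:
        # Not on Linux, or /proc is not mounted.
        pass
```

This only reports the sockets; it does not show which socat child owns them or why they were left open.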

Any recommendations on dealing with this problem, besides periodically killing hanging processes?

Thanks.

  • Which exact Zabbix agent version are you using? There was a related issue, https://support.zabbix.com/browse/ZBX-9251, in which sockets were not being closed properly. The fix was released in the recent 2.0.15. – asaveljevs Sep 29 '15 at 08:49
  • I'm using 2.0.13. This is an interesting bug, but it is probably not related: it seems to be tied to a specific item, system.hw.macaddr, which I'm not monitoring. Still, I'll upgrade the agent version and see if it helps. –  Sep 29 '15 at 12:16
