I am facing a strange issue with ZMQ, which I'm just not able to debug. These are the components:
- Java ZMQ Server - Almost an exact copy of this example. There are a hundred worker threads.
- PHP Client - Simple request reply with a REQ socket. This is the request flow:
    $zcontext = new ZMQContext();
    $socket = new ZMQSocket($zcontext, ZMQ::SOCKET_REQ);
    $socket->connect(<address>);

    $startTime = microtime(true);
    $socket->send(<request>);
    $result = $socket->recv();
    $totalTime = microtime(true) - $startTime;
The ZMQ sockets use TCP and both the server and client are on the same machine.
The PHP script is served by Apache and I am load testing with ApacheBench (ab). I make 5000 requests with a concurrency of 200. On the PHP client I log the time the request-reply round trip takes ($totalTime). In most cases this time is very low (under 500 ms), but occasionally it takes a really long time, sometimes even 60 seconds (for send + receive).
I added some extra logging to find out where the issue is happening, and it turns out that whenever it takes really long, almost all the time is between PHP's send and Java's receive - so packets are taking really long to reach the server.
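One way to measure that send-to-receive gap is sketched below. It is only an illustration under stated assumptions, not the exact logging used here: it assumes the client prefixes each request with its wall-clock send time in milliseconds, using a made-up `<sendMillis>|<payload>` format, and the `TransitDelay` class and its method names are hypothetical. Because client and server run on the same machine, both sides read the same clock, so the difference is meaningful.

```java
// Hypothetical helper: the "<sendMillis>|<payload>" wire format is an assumption
// made for this sketch, not something ZMQ or the original client/server defines.
final class TransitDelay {
    private TransitDelay() {}

    /** Prefixes the payload with the sender's wall-clock time in milliseconds. */
    static String stamp(String payload, long sendMillis) {
        return sendMillis + "|" + payload;
    }

    /** Returns the payload with the timestamp prefix removed. */
    static String payload(String stamped) {
        return stamped.substring(separatorIndex(stamped) + 1);
    }

    /** Milliseconds between the embedded send time and the given receive time. */
    static long transitMillis(String stamped, long receiveMillis) {
        long sendMillis = Long.parseLong(stamped.substring(0, separatorIndex(stamped)));
        return receiveMillis - sendMillis;
    }

    /** Position of the first '|' separator; the prefix before it is the timestamp. */
    private static int separatorIndex(String stamped) {
        int idx = stamped.indexOf('|');
        if (idx < 0) {
            throw new IllegalArgumentException("missing timestamp prefix");
        }
        return idx;
    }
}
```

On the server, a worker would call `TransitDelay.transitMillis(message, System.currentTimeMillis())` right after receiving a message and log the result; the PHP side would build the prefix from `microtime(true)` converted to milliseconds.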
I'm not setting any special ZMQ socket options or otherwise doing anything unusual, so I don't know what is causing the issue. The issue also persists at lower concurrencies (I tested at 100 and 150 too), though the maximum request times are lower.
Sorry if the question seems vague - I'll provide any other details that may be needed.