I am getting this exception:

 org.zeromq.ZMQException: Errno 4
        at org.zeromq.ZMQ$Socket.mayRaise(ZMQ.java:3732) ~[jeromq-0.5.3.jar:na]
        at org.zeromq.ZMQ$Socket.recv(ZMQ.java:3530) ~[jeromq-0.5.3.jar:na]
        at com.forexassistant.service.zeromq.CurrencyStrengthZeroMQ.sendCurrencyStrengthRequest(CurrencyStrengthZeroMQ.java:30) ~[classes/:na]
        at com.forexassistant.service.algorithmlogic.AlgorithmLogic.getCurrencyStrength(AlgorithmLogic.java:209) ~[classes/:na]
        at sun.reflect.GeneratedMethodAccessor111.invoke(Unknown Source) ~[na:na]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_241]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_241]
        at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84) ~[spring-context-5.3.22.jar:5.3.22]
        at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) ~[spring-context-5.3.22.jar:5.3.22]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_241]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_241]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_241]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_241]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_241]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_241]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_241]
        Suppressed: org.zeromq.ZMQException: Errno 4
            at zmq.Ctx.terminate(Ctx.java:304) ~[jeromq-0.5.3.jar:na]
            at org.zeromq.ZMQ$Context.term(ZMQ.java:671) ~[jeromq-0.5.3.jar:na]
            at org.zeromq.ZContext.destroy(ZContext.java:136) ~[jeromq-0.5.3.jar:na]
            at org.zeromq.ZContext.close(ZContext.java:463) ~[jeromq-0.5.3.jar:na]
            at com.forexassistant.service.zeromq.CurrencyStrengthZeroMQ.sendCurrencyStrengthRequest(CurrencyStrengthZeroMQ.java:37) ~[classes/:na]
            ... 13 common frames omitted

I have no idea why this is happening. The code in question, which communicates with the sockets, is:

public String sendCurrencyStrengthRequest() {
    try (ZContext context = new ZContext()) {
        ZMQ.Socket socket = context.createSocket(SocketType.PUSH);
        socket.connect("tcp://localhost:32868");
        ZMQ.Socket socket2 = context.createSocket(SocketType.PULL);
        socket2.connect("tcp://localhost:32869");

        String msg = "GET_CURRENCY_STRENGTHS";
        socket.send(msg, 1);

        while (!Thread.currentThread().isInterrupted()) {
            byte[] reply = socket2.recv(0);
            if (reply != null) {
                currencyStrengths = new String(reply, ZMQ.CHARSET);
            }
            context.close();
            break;
        }
    }
    return currencyStrengths;
}

Am I doing this wrong? sendCurrencyStrengthRequest() is scheduled in Spring to be called every 5 seconds, and another function, called every 30 minutes, uses a different PUSH and PULL socket pair within a different context. All of this works for a while, and then this error gets thrown. Any idea why?

jose1278
  • Errno 4 is `EINTR`, an interrupted system call. That means your program received a signal while it was inside a system call, apparently while trying to receive data from the socket. Your `recv()` call was probably blocked waiting for data when it was interrupted. One solution is to retry the interrupted call. – rveerd Jul 14 '23 at 14:52

1 Answer

I'm assuming that what's not shown is the other program paired with this one, which serves as the other endpoint for the PUSH and PULL sockets that you create and connect.

I think that the problem lies in that other program, which we're not seeing. Is it receiving the GET_CURRENCY_STRENGTHS message, replying, and then immediately terminating, or immediately cleaning up its context in much the same way as this code snippet does?

If so, it is the immediate termination / clean up that is the problem. The act of sending a message using ZMQ is non-blocking; you're just pushing the message onto a queue that's managed by the thread(s) that ZMQ starts up and manages in the background. If you terminate the program or clean up immediately after a ZMQ send(), then in all likelihood the management threads haven't had time to do anything at all. They may not even have been scheduled by the OS at that point.

The result of the program terminating is that the OS cleans up whatever resources have not yet been cleaned up by the program itself - sockets, allocated memory, threads, the lot.

The fact that this is all on localhost is interesting, because the OS has full visibility of the underlying tcp socket and can tell the connecting end (this program) a lot more about the state of the tcp socket than if the other endpoints were on another computer.

If this is the case, then at the other end of the connections (in the code snippet you've given us), ZMQ is patiently waiting in a blocked system call for something to come in via a tcp socket. Except that the socket is getting torn down by the OS (or by the other end cleaning up). So the socket read is terminated (because the socket no longer exists) and you get an ugly exception as shown.

At least, that's my guess. If that's correct, try putting some delay in the other program (that we've not seen) between the zmq send() and the termination of the program, to allow things to happen in the background.
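As a rough illustration of that delay, here is a minimal sketch that uses only JDK classes as a stand-in for a ZMQ socket and its background I/O thread; it is not ZMQ code. The class `QueuedSender`, its `send()` and `close(lingerMillis)` methods, and the 200 ms "wire latency" are all hypothetical names and values invented for the example. It shows a send that only queues the message, and a close that waits for the queue to drain, up to a deadline, before tearing the background thread down. ZeroMQ's own setting for this kind of bounded wait is the socket linger period (`ZMQ_LINGER`, exposed as `setLinger` in JeroMQ); whether the other program can use it depends on what that program is written in.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: JDK classes stand in for a ZMQ socket and its background I/O thread.
final class QueuedSender {
    private final BlockingQueue<String> outbox = new LinkedBlockingQueue<>();
    private final List<String> delivered = new CopyOnWriteArrayList<>();
    private final AtomicInteger pending = new AtomicInteger();
    private final Thread ioThread;

    QueuedSender(long wireLatencyMillis) {
        ioThread = new Thread(() -> {
            try {
                while (true) {
                    String msg = outbox.take();
                    Thread.sleep(wireLatencyMillis); // pretend delivery takes a while
                    delivered.add(msg);
                    pending.decrementAndGet();
                }
            } catch (InterruptedException tornDown) {
                // Torn down: whatever is still pending is never delivered.
            }
        });
        ioThread.setDaemon(true);
        ioThread.start();
    }

    // Returns immediately: the message is only queued, not yet delivered.
    void send(String msg) {
        pending.incrementAndGet();
        outbox.add(msg);
    }

    // Waits up to lingerMillis for the queue to drain, then tears down.
    // Returns true if every queued message was delivered.
    boolean close(long lingerMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(lingerMillis);
        try {
            while (pending.get() > 0 && deadline - System.nanoTime() > 0) {
                Thread.sleep(5);
            }
            ioThread.interrupt();
            ioThread.join();
        } catch (InterruptedException e) {
            ioThread.interrupt();
            Thread.currentThread().interrupt();
        }
        return pending.get() == 0;
    }

    List<String> delivered() {
        return delivered;
    }

    public static void main(String[] args) {
        QueuedSender hasty = new QueuedSender(200);
        hasty.send("REPLY");
        System.out.println("no wait:      delivered everything = " + hasty.close(0));

        QueuedSender patient = new QueuedSender(200);
        patient.send("REPLY");
        System.out.println("bounded wait: delivered everything = " + patient.close(2000));
    }
}
```

A bounded wait of this kind ends as soon as the queue is empty, so it costs less than a fixed sleep in the common case, while still giving up after the deadline if the peer never picks the message up.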

The variability in whether your code succeeds or not would be down to the random nature of thread scheduling following the zmq send(). The communication between the application program and the ZMQ management thread(s) is done via IPC pipes (or other things like semaphores), all of which give the OS an opportunity to re-schedule threads. Sometimes a given thread gets scheduled and sometimes it doesn't, depending on how much of a time slice the main application thread has had (and a heap of other mysterious, arcane factors).

In General

In ZMQ in particular, and in other Actor Model systems in general, termination is something that has to be agreed on. Because messages are buffered inside the transport, you've no idea whether "that last, final message" has made it through to the receiving end, and so whether it is safe to terminate. What you need is either a time delay before closing, to allow everything to settle down to quiescence, or some explicit messaging acknowledging that the final message has propagated everywhere.
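The explicit-acknowledgement option can be sketched as follows. This is again a toy model under stated assumptions, not ZMQ code: two JDK queues stand in for the pair of channels between the programs, and `AckedShutdown`, `sendFinalAndAwaitAck`, `receiveAndAck`, the `"ACK"` token and the `"FINAL_REPLY"` payload are hypothetical names made up for the example. The sender queues its last message and then waits, with a timeout, for the peer to confirm receipt before it treats termination as safe.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Toy model only: two JDK queues stand in for the PUSH/PULL channel pair.
final class AckedShutdown {
    static final String ACK = "ACK";

    // Sender: queue the final message, then wait (bounded) for the peer's ACK
    // before treating it as safe to tear down.
    static boolean sendFinalAndAwaitAck(BlockingQueue<String> toPeer,
                                        BlockingQueue<String> fromPeer,
                                        String finalMessage,
                                        long timeoutMillis) {
        toPeer.add(finalMessage);
        try {
            return ACK.equals(fromPeer.poll(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Receiver: take one message and acknowledge it. Returns null on timeout.
    static String receiveAndAck(BlockingQueue<String> fromPeer,
                                BlockingQueue<String> toPeer,
                                long timeoutMillis) {
        try {
            String msg = fromPeer.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (msg != null) {
                toPeer.add(ACK);
            }
            return msg;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    // Runs both roles once; true means the sender saw the ACK.
    static boolean demo(String finalMessage) {
        BlockingQueue<String> data = new LinkedBlockingQueue<>();
        BlockingQueue<String> acks = new LinkedBlockingQueue<>();
        Thread receiver = new Thread(() -> receiveAndAck(data, acks, 1000));
        receiver.start();
        return sendFinalAndAwaitAck(data, acks, finalMessage, 1000);
    }

    public static void main(String[] args) {
        System.out.println("safe to terminate: " + demo("FINAL_REPLY"));
    }
}
```

With real sockets the acknowledgement would have to travel back over a channel from the receiver to the sender, and the timeout keeps the sender from waiting forever if the peer has already gone away.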

bazza