I apologise, I'm relatively new to Java. I'm writing a library to send logs from Java applications to Logstash using zeromq (jeromq).
One of my test servers is a busy Jenkins master running in Tomcat. My library, "Logit" (https://github.com/stuart-warren/logit), sprays its JSON formatted logs (JUL) over a configured list of Logstash endpoints (for some basic load-balancing) using the PUSHPULL sockets for speed.
Unfortunately, after a few days it hits an exception. Tomcat/Jenkins continue working and writing logs to their normal files, but Logit stops sending messages over the network.
18-Nov-2013 14:45:24 hudson.model.Run execute
INFO: EmployeeManagementSystems » Project Phase 2 branch » Int Tests » Sync Int Tests (Project - Prl) #326 aborted
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at hudson.remoting.Request.call(Request.java:146)
at hudson.remoting.Channel.call(Channel.java:714)
at hudson.maven.ProcessCache$MavenProcess.call(ProcessCache.java:156)
at hudson.maven.MavenModuleSetBuild$MavenModuleSetBuildExecution.doRun(MavenModuleSetBuild.java:815)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:565)
at hudson.model.Run.execute(Run.java:1592)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:508)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:237)
logit:WARN IOException thrown, will try sending log again shortly.
zmq.ZError$IOException: java.nio.channels.ClosedByInterruptException
at zmq.Signaler.send(Signaler.java:108)
at zmq.Mailbox.send(Mailbox.java:90)
at zmq.Ctx.send_command(Ctx.java:351)
at zmq.ZObject.send_command(ZObject.java:364)
at zmq.ZObject.send_activate_read(ZObject.java:217)
at zmq.Pipe.flush(Pipe.java:284)
at zmq.LB.send(LB.java:120)
at zmq.Push.xsend(Push.java:64)
at zmq.SocketBase.send(SocketBase.java:598)
at org.jeromq.ZMQ$Socket.send(ZMQ.java:932)
at com.stuartwarren.logit.zmq.ZmqTransport.appendString(ZmqTransport.java:115)
at com.stuartwarren.logit.jul.ZmqAppender.publish(ZmqAppender.java:77)
at java.util.logging.Logger.log(Logger.java:481)
at java.util.logging.Logger.doLog(Logger.java:503)
at java.util.logging.Logger.log(Logger.java:592)
at hudson.model.Run.execute(Run.java:1610)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:508)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:237)
Caused by: java.nio.channels.ClosedByInterruptException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
at sun.nio.ch.SinkChannelImpl.write(SinkChannelImpl.java:154)
at zmq.Signaler.send(Signaler.java:106)
... 18 more
logit:ERROR Logit got interrupted while waiting to send failed message again.
java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at com.stuartwarren.logit.zmq.ZmqTransport.appendString(ZmqTransport.java:121)
at com.stuartwarren.logit.jul.ZmqAppender.publish(ZmqAppender.java:77)
at java.util.logging.Logger.log(Logger.java:481)
at java.util.logging.Logger.doLog(Logger.java:503)
at java.util.logging.Logger.log(Logger.java:592)
at hudson.model.Run.execute(Run.java:1610)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:508)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:237)
18-Nov-2013 14:45:35 hudson.model.Run execute
I'm using the same version of my library logging from a test 6 node Cassandra cluster without issue (log4j), but I guess that could just be luck?
My question is, how should one handle exceptions like above when sending messages? https://github.com/stuart-warren/logit/blob/master/src/main/java/com/stuartwarren/logit/zmq/ZmqTransport.java#L115.
public void appendString(final String line) {
final String log = line.substring(0, line.length() - 1);
if (LogitLog.isTraceEnabled()) {
LogitLog.trace("Sending log: [" + log + "].");
}
try {
socket.send(log, ZMQ.NOBLOCK);
// Has occasionally been known to throw a java.nio.channels.ClosedByInterruptException
} catch (IOException e) {
LogitLog.warn("IOException thrown, will try sending log again shortly.", e);
// Try again after sleeping for a second
try {
Thread.sleep(1000);
socket.send(log, ZMQ.NOBLOCK);
} catch (InterruptedException i) {
LogitLog.error("Logit got interrupted while waiting to send failed message again.", i);
} catch (IOException e2) {
LogitLog.error("Could not send following log on the second attempt: [" + log + "].", e2);
}
} catch (Exception g) {
LogitLog.error("Something threw an exception that wasn't IOException.", g);
}
}
Should I close and reopen the sockets/context, or should this possibly be handled in the Jeromq library? Perhaps Tomcat is more likely to throw these and it should be handled in my JUL appender?
Thanks.