
I'm debugging my code and I see my thread being blocked in the following log4j TextEncoderHelper method. I'm using log4j 2.8.2.

None of my threads were able to run, which basically blocked the whole application.

Does anyone know what the code below does? If I have two threads logging, does that mean it's a deadlock?

(I'm running with the following parameters:

-DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector -DAsyncLogger.RingBufferSize=32768*32768 -DAsyncLogger.WaitStrategy=Sleep -Dlog4j2.AsyncQueueFullPolicy=Discard)

private static void copyDataToDestination(final ByteBuffer temp, final ByteBufferDestination destination) {
    synchronized (destination) {
        ByteBuffer destinationBuffer = destination.getByteBuffer();
        if (destinationBuffer != temp) { // still need to write to the destination
            temp.flip();
            if (temp.remaining() > destinationBuffer.remaining()) {
                destinationBuffer = destination.drain(destinationBuffer);
            }
            destinationBuffer.put(temp);
            temp.clear();
        }
    }
}
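For reference, the same startup settings expressed programmatically (a minimal sketch, not my actual setup code; it assumes the properties are set before Log4j initializes, and that AsyncLogger.RingBufferSize expects a literal integer, i.e. an expression like 32768*32768 is probably not evaluated and would fall back to the default):

public final class AsyncLoggerSetup {
    public static void main(String[] args) {
        // Same settings as the -D flags above; this must run before the first LogManager
        // call, otherwise Log4j has already initialized with its defaults.
        System.setProperty("Log4jContextSelector",
                "org.apache.logging.log4j.core.async.AsyncLoggerContextSelector");
        // Likely needs a plain integer (Log4j/Disruptor rounds it to a power of two);
        // 1048576 here is just an example value, not a recommendation.
        System.setProperty("AsyncLogger.RingBufferSize", "1048576");
        System.setProperty("AsyncLogger.WaitStrategy", "Sleep");
        System.setProperty("log4j2.AsyncQueueFullPolicy", "Discard");

        org.apache.logging.log4j.Logger log =
                org.apache.logging.log4j.LogManager.getLogger(AsyncLoggerSetup.class);
        log.info("async logger configured");
    }
}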
Theboy
  • Can you do a thread dump when your application gets stuck while running outside the debugger and show the results here? Or open a JIRA ticket on the Log4j2 issue tracker with the thread dump result. – Remko Popma Dec 24 '17 at 12:25
  • I saw this https://issues.apache.org/jira/browse/LOG4J2-1874 and tried with 2.10.0. The block is gone now, but the asyncLogger thread is taking a long time to call flush -> writeToDestination... What's the optimal ring buffer size such that it can basically write like a block to disk? (if that's the setting I need to play with to avoid a long flush -> writeToDestination call) – Theboy Dec 25 '17 at 10:47
  • what makes you think that the async logger thread is taking a long time to flush? Basically the async thread will automatically flush every time the queue becomes empty. The default queue size is 128 K (1K=1024). – Remko Popma Dec 25 '17 at 10:52
  • Basically, I see this in my log: 07:42:36.645 - [JmsReceiveChannel] 07:42:41.259 - [JmsReceiveChannel] I know I'm producing faster than what's printing in the log, so events are getting dropped. – Theboy Dec 25 '17 at 11:11
  • In YourKit profiler, I can see it's in the writeToDestination function for a good 10 seconds. I wonder if there's any setting I can change to avoid a long pause when the queue overflows – Theboy Dec 25 '17 at 11:18
  • Looks to be a problem on the JMS side. Slow connection maybe? – Remko Popma Dec 25 '17 at 11:20
  • 07:42:36.490 [JmsReceiveChannel 07:42:36.490 [JmsReceiveChannel 07:42:36.491 [JmsReceiveChannel 07:42:36.491 [JmsReceiveChannel 07:42:36.511 [JmsReceiveChannel 07:42:36.511 [JmsReceiveChannel 07:42:36.512 [JmsReceiveChannel 07:42:36.512 [JmsReceiveChannel 07:42:36.512 [JmsReceiveChannel 07:42:36.512 [JmsReceiveChannel 07:42:36.564 [JmsReceiveChannel 07:42:36.564 [JmsReceiveChannel 07:42:36.645 [JmsReceiveChannel 07:42:36.645 [JmsReceiveChannel 07:42:41.259 [JmsReceiveChannel The log was printed fine for about 30 sec before log4j2 can't keep up and starts discarding... – Theboy Dec 25 '17 at 11:27
  • The *JmsAppender* can’t keep up. For comparison, the FileAppender can handle a sustained rate of 800,000 events/second without falling behind. – Remko Popma Dec 25 '17 at 11:40
  • Can you enable JMS debug options? I suspect you may be having some disconnect/reconnect issues which is killing the throughput. – Remko Popma Dec 25 '17 at 11:42
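For reference, the thread dump requested in the first comment can also be captured from inside the JVM when the application appears stuck (a minimal sketch using the standard ThreadMXBean API; running jstack <pid> externally gives the same information with full stack traces):

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public final class ThreadDumpUtil {
    // Returns a textual dump of all live threads, including monitor and synchronizer info.
    public static String dump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : bean.dumpAllThreads(true, true)) {
            // Note: ThreadInfo.toString() truncates very deep stack traces.
            sb.append(info);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump());
    }
}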

1 Answer


If the debugger shows the application is blocked trying to write to the underlying appender, then perhaps the underlying appender simply cannot keep up with the workload.
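On the deadlock part of the question: a second logging thread waiting at synchronized (destination) is not, by itself, a deadlock; it is simply blocked until the current holder releases the monitor. It only looks like a hang when the holder is stuck writing to a slow destination. An illustrative sketch of that effect (not Log4j code, just the contention pattern):

public final class MonitorContentionDemo {

    private static final Object destination = new Object();

    private static void copyDataToDestination() {
        synchronized (destination) {
            slowDrain(); // stands in for a destination/appender that cannot keep up
        }
    }

    private static void slowDrain() {
        try {
            Thread.sleep(5_000); // pretend each write takes 5 seconds
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        new Thread(MonitorContentionDemo::copyDataToDestination, "logger-1").start();
        new Thread(MonitorContentionDemo::copyDataToDestination, "logger-2").start();
        // A thread dump taken now shows one thread BLOCKED on the monitor and the other
        // TIMED_WAITING inside slowDrain(); both eventually complete once the lock is released.
    }
}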

The question doesn't mention which appender is being used, so I initially assumed the file appender, but from the comments it turns out the JmsAppender is used. (Please mention details like this in future questions; without this information I was thinking in the wrong direction.)

JMS is a big subject in itself, but it is generally not known for being highly performant. The actual throughput you can achieve depends on the JMS provider and its configuration.

I suggest enabling JMS debug options to confirm that the JMS queue is indeed the bottleneck.
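As a rough cross-check that does not require JMS-level tracing, you could time how fast the configured appender accepts and drains a burst of events. A minimal sketch (it assumes Log4j is already configured with the appender under test, and that stopping the core LoggerContext flushes the async queue):

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.core.LoggerContext;

public final class AppenderThroughputCheck {
    public static void main(String[] args) {
        Logger log = LogManager.getLogger(AppenderThroughputCheck.class);
        int events = 100_000;
        long start = System.nanoTime();
        for (int i = 0; i < events; i++) {
            log.info("throughput probe {}", i);
        }
        // Cast assumes log4j-core is the active implementation; stop() waits for the
        // background thread to drain and then shuts the appenders down.
        ((LoggerContext) LogManager.getContext(false)).stop();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Note: with AsyncQueueFullPolicy=Discard some probe events may be dropped,
        // so this measures accepted throughput rather than delivered throughput.
        System.out.printf("%d events in %d ms (~%d events/sec)%n",
                events, elapsedMs, elapsedMs == 0 ? events : events * 1000L / elapsedMs);
    }
}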

Remko Popma
  • Thanks for the reply, I think all my problems came from here: https://stackoverflow.com/questions/47971267/chronicle-queue-log4j2-async-logger/47972404#47972404 – Theboy Dec 27 '17 at 19:33