I am very new to AIX and system monitoring. Our application currently runs in production on JBoss 5.1 on AIX 5.3. Please see the configuration and system settings below.

  1. AIX system configuration
    • OS Level 5.3.9.0 (oslevel -g)
    • Physical Memory size 24GB (svmon -G)
    • Page space 4GB (lsps -s)
    • processors 3 cores, Processor Type: PowerPC_POWER6, Processor Clock Speed: 4704 MHz (prtconf | grep Processor)
  2. Java version
    • JRE 1.6.0 IBM AIX build pap6460sr10fp1-20120321_01 (SR10 FP1) (java -fullversion)
  3. JBoss configuration
    • JBoss 5.1/JBoss ESB 4.11
    • Hornetq messaging with consumer flow control
    • java opts : -d64 -Xms2g -Xmx4g -XX:MaxPermSize=1024m

Sometimes we observe very strange behavior: JBoss freezes without any error in the logs, and the server log simply stops with no further trace. We are also unable to get a thread dump at that point; kill -3 produces nothing (kill -3 xxxxx works under normal circumstances). The only option available to us was to restart the JBoss server, and it seems that all messages that were in the queues during the freeze are processed after the restart.

We tried tweaking some of the JBoss HornetQ settings, since we thought the issue was there (see "Hornetq Stuck By Default"), but we have had no luck and have been unable to isolate the issue. We are looking at tools like nmon to monitor this, but we do not know whether that is good enough.

Please provide some pointers for investigating this issue.

Thanks

growse
jess

2 Answers


1. Check full coredumps

You need to check if fullcore is activated on your AIX:

lsattr -Elsys0 | grep full

To enable fullcore:

chdev -l sys0 -a fullcore=true

2. Check limits

The fsize and core limits need to be set to unlimited.

ulimit -c unlimited
ulimit -f unlimited
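
Note that ulimit only affects the current shell and the processes started from it, so the limits have to be in effect in the shell that launches JBoss. A minimal sketch for checking this before startup (the check_limit helper is a hypothetical name for illustration, not an AIX tool):

```shell
#!/bin/sh
# check_limit is a hypothetical helper: it reports whether the given
# soft limit of the current shell is set to "unlimited".
check_limit() {
    flag="$1"   # ulimit flag, e.g. -c or -f
    name="$2"   # label used in the message
    value=$(ulimit "$flag")
    if [ "$value" = "unlimited" ]; then
        echo "OK: $name limit is unlimited"
    else
        echo "WARNING: $name limit is $value, a core dump may be truncated"
    fi
}

check_limit -c core
check_limit -f fsize
```

Running this from the same account and shell that starts JBoss shows whether the settings above actually apply to the server process.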
jmlrt

The first thing to look at is heap space exhaustion. Enable verbose garbage collection by adding the following to the Java opts.

-verbose:gc

Either view the output manually or analyse it with GCViewer (http://www.tagtraum.com/gcviewer.html).

From what I remember, the AIX JDK is very different from the Sun JDK, including the GC strategy, so one needs to be careful when reading up on JDKs.
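
As a sketch of how the flag might be added, assuming JBoss is started through the stock run.sh (which reads bin/run.conf): the lines below append to JAVA_OPTS. The log path is a placeholder, and -Xverbosegclog is the IBM JVM option that writes the verbose GC output to a file rather than to stderr.

```shell
# Sketch of lines to append to $JBOSS_HOME/bin/run.conf (assumed startup path).
# /tmp/jboss-gc.log is a placeholder; choose a location with enough free space.
GC_LOG="/tmp/jboss-gc.log"

# Keep whatever options are already set and add verbose GC logging.
JAVA_OPTS="${JAVA_OPTS:-} -verbose:gc -Xverbosegclog:$GC_LOG"

echo "JAVA_OPTS=$JAVA_OPTS"
```

Writing to a file keeps the GC records available even when the server log itself stops, which is the situation described in the question.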

In a healthy app you will see the used heap space increase, then when a Full GC takes place the used space should drop significantly. This will be repeated throughout the lifetime of the app.

In an unhealthy app, the memory drop after each Full GC will be a little smaller each time, so that over time the used heap space slowly climbs higher and higher until there's no more memory available. At this point it's common for the JVM to become unresponsive, as it's constantly running the garbage collector to try to free memory.

If this is occurring, then you will have to look at your Web app. It's not uncommon for devs to cache objects in memory but not use any cap for the number of objects.

TIP: Set -Xms to the same size as -Xmx. Allocating more memory from the operating system to grow the heap is an expensive operation. In a server environment, starting with a smaller heap makes little sense, as you would always want enough real memory for the full heap anyway.
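
Applied to the options from the question, this could look like the sketch below. The 4g value simply mirrors the existing -Xmx setting and is an assumption, not a sizing recommendation; the other options from the question are omitted for brevity.

```shell
# Assumption: 4g mirrors the -Xmx value from the question.
HEAP_SIZE="4g"

# Initial and maximum heap are identical, so the JVM never has to grow the heap.
JAVA_OPTS="-d64 -Xms$HEAP_SIZE -Xmx$HEAP_SIZE"

echo "JAVA_OPTS=$JAVA_OPTS"
```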

Alastair McCormack