Background:
We have a vendor-supplied Java application with a fairly large Java heap. Without going into too much detail, the app is a black box to us, yet we feel we need to take it upon ourselves to tune its performance and fix problems.
The 64-bit SunOS 10 machine has 16GB of RAM, and the only non-system application running is the JVM for this app. The 64-bit JVM runs in JBoss, which I think is irrelevant to this discussion, with a max heap size of 8GB, which I think is relevant.
The recent issue is that we have been getting various out-of-memory errors. The heap is not full when these errors occur, and the error asks 'Out of Swap Space?'. The vendor wants us to simply increase swap from 2GB to 4GB. This is on a system with 16GB of RAM, and our app is capped at 8GB. We feel this would be a bad idea for performance.
My question:
One thing we found is that file caching uses up all the remaining free memory to improve performance. Normally that is not a problem, but it apparently fragments the memory. Since the Hotspot JVM requires a contiguous memory space, we have understood that this memory fragmentation results in the use of the swap space, which is not fragmented.
However, I am not sure I understand the relationship between the fragmentation and the requirement of contiguous memory. Surely the fragmentation refers only to fragmentation of the physical RAM. With virtual memory, it is entirely possible to allocate a contiguous chunk of virtual memory that is not backed by a contiguous chunk of physical RAM. In other words, a non-contiguous chunk of physical memory would appear to a running process as a contiguous chunk of virtual memory.
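To illustrate the point about virtual contiguity, here is a minimal sketch (plain Python, nothing Solaris- or JVM-specific) that reserves a virtually contiguous region via an anonymous mmap. The kernel backs each touched page with whatever physical frames happen to be free, so fragmentation of physical RAM is invisible to the process:

```python
import mmap

# Reserve 1 GiB of virtually contiguous address space (anonymous mapping).
# No physical frames are committed up front; each page is backed on first
# touch with any free frame, contiguous or not.
size = 1 << 30
region = mmap.mmap(-1, size)

# Touch the first and last pages: each page fault is satisfied
# independently, regardless of where the backing frames sit physically.
region[0:5] = b"first"
region[size - 4:size] = b"last"

print(region[0:5], region[size - 4:size])  # b'first' b'last'
region.close()
```

The region's addresses are contiguous from the process's point of view the whole time; only the page-by-page physical backing varies.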
So, I guess, there was no one-sentence question in there, but does anyone know more about this subject and can chime in? Any links that discuss this contiguous memory issue on 64-bit systems?
What I found so far:
So far, every reference I have found to the 'contiguous memory' problem relates to how the virtual address space is laid out on 32-bit systems. As we are running a 64-bit system (with, I think, 48-bit addressing), there is plenty of virtual address space in which to allocate large contiguous chunks.
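As a quick sanity check on that address-space arithmetic (trivial Python, just the powers of two):

```python
# 32-bit virtual address space: 2**32 bytes = 4 GiB total, so HotSpot's
# contiguous heap reservation must fit in whatever gap the OS leaves.
print(2 ** 32 // 2 ** 30)   # 4 (GiB)

# 48-bit virtual addressing: 2**48 bytes = 256 TiB, so finding a
# contiguous 8 GB *virtual* reservation is trivial.
print(2 ** 48 // 2 ** 40)   # 256 (TiB)
```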
I have been looking all over the internet for this information but have so far been unable to find what I am looking for.
Updates:
- To be clear, I was not trying to get an answer to why I was getting OOM errors, but rather trying to understand the relationship between possibly fragmented system RAM and the contiguous chunk of virtual memory needed by Java.
- prstat -Z
ZONEID NPROC  SWAP   RSS MEMORY     TIME  CPU ZONE
     0    75 4270M 3855M    24% 92:24:03 0.3% global
- echo "::memstat" | mdb -k
Page Summary         Pages     MB  %Tot
------------      --------  -----  ----
Kernel              326177   2548   16%
ZFS File Data       980558   7660   48%
Anon                561287   4385   27%
Exec and libs        12196     95    1%
Page cache           17849    139    1%
Free (cachelist)      4023     31    0%
Free (freelist)     156064   1219    8%
Total              2058154  16079
Physical           2042090  15953
Where I previously thought that the ZFS File Data was memory that is freely available, I have since learned that this is not the case, and it could well be the cause of the errors.
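If the ARC is indeed crowding out the JVM's native allocations, the commonly cited Solaris mitigation is to cap it in /etc/system (the 4GB below is only a hypothetical example value; the setting takes effect after a reboot):

```
* Cap the ZFS ARC at 4 GB (0x100000000 bytes) -- example value only
set zfs:zfs_arc_max = 0x100000000
```

The ARC does shrink under memory pressure, but it releases memory reactively, so a hard cap leaves guaranteed headroom for the 8GB heap plus the JVM's off-heap needs.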
vmstat 5 5
 kthr      memory            page            disk          faults      cpu
 r b w   swap    free   re mf pi po fr de sr vc vc vc --   in   sy   cs us sy id
 0 0 0 2161320 2831768 12 55  0  0  0  0  0  3  4 -0  0 1089 1320 1048  1  1 98
 0 0 0  819720 1505856  0 14  0  0  0  0  0  4  0  0  0 1189  748 1307  1  0 99
 0 0 0  819456 1505648  0  1  0  0  0  0  0  0  0  0  0 1024  729 1108  0  0 99
 0 0 0  819456 1505648  0  1  0  0  0  0  0  0  0  0  0  879  648  899  0  0 99
 0 0 0  819416 1505608  0  1  0  0  0  0  0  0  0  0  0 1000  688 1055  0  0 99
These command outputs were taken while the application was running in a healthy state. We are now monitoring all of the above and logging it in case we see the swap-space errors again.
The following is from after the JVM had grown to 8GB and was then restarted. The effect of this is that the ZFS ARC has shrunk (to 26% of RAM) until it grows again. How do things look now?
vmstat 5 5
 kthr      memory            page            disk          faults      cpu
 r b w   swap    free   re  mf pi po fr de sr vc vc -- --   in   sy   cs us sy id
 0 0 0 1372568 2749528  11  41  0  0  0  0  0  2  3  0  0  713  418  539  0  0 99
 0 0 0 3836576 4648888 140 228  0  0  0  0  0  0  0  0  0 1178 5344 1117  3  2 95
 0 0 0 3840448 4653744  16  45  0  0  0  0  0  0  0  0  0 1070 1013  952  1  3 96
 0 0 0 3839168 4652720   6  53  0  0  0  0  0  0  0  0  0  564  575  313  0  6 93
 0 0 0 3840208 4653752   7  68  0  0  0  0  0  3  0  0  0 1284 1014 1264  1  1 98
- swap -s
total: 4341344k bytes allocated + 675384k reserved = 5016728k used, 3840880k available
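For reference, the 'virtual swap' that swap -s reports on Solaris is reservations against the physical swap devices plus a portion of RAM, not pages actually written to disk. The arithmetic in that line checks out (trivial Python, figures copied from the output above):

```python
# Figures from the `swap -s` output above, in kB.
allocated_k = 4341344   # swap actually allocated
reserved_k  = 675384    # swap reserved but not yet touched
available_k = 3840880   # still reservable

used_k = allocated_k + reserved_k
print(used_k)                 # 5016728 -- matches the 'used' figure
print(used_k + available_k)   # 8857608 kB of total reservable virtual swap
```

So roughly 8.4GB of virtual swap was reservable at that moment; when 'available' hits zero, native allocations fail and the JVM reports 'Out of Swap Space?'.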