5

I'm exploring options to help my memory-intensive application, and in doing so I came across Terracotta's BigMemory. From what I gather, they take advantage of non-garbage-collected, off-heap "native memory," and apparently this is about 10x slower than heap storage due to serialization/deserialization issues. Prior to reading about BigMemory, I'd never heard of "native memory" outside of normal JNI. Although BigMemory is an interesting option that warrants further consideration, I'm intrigued by what could be accomplished with native memory if the serialization issue could be bypassed.

Is Java native memory faster (I think this entails ByteBuffer objects?) than traditional heap memory when there are no serialization issues (for instance if I am comparing it with a huge byte[])? Or do the vagaries of garbage collection, etc. render this question unanswerable? I know "measure it" is a common answer around here, but I'm afraid I would not be able to set up a representative test, as I don't yet know enough about how native memory works in Java.

Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
Michael McGowan
  • 6,528
  • 8
  • 42
  • 70
  • Modern just-in-time compilers convert heap use into native stack use whenever possible anyway. Would be really hard to even test it and say "This is java heap, this is native" without forcing it to run in byte code interpreter. – Affe May 02 '11 at 23:00

3 Answers

4

Direct memory is faster when performing IO because it avoids one copy of the data. However, for 95% of applications you won't notice the difference.

You can store data in direct memory; however, it won't be faster than storing data in POJOs (nor as safe, readable, or maintainable). If you are worried about GC, try creating your objects (they have to be mutable) in advance and reuse them without discarding them. If you don't discard your objects, there is nothing to collect.
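A minimal sketch of that reuse pattern; the class and method names here are mine, purely for illustration:

```java
final class Vec3 {                     // mutable on purpose, so it can be reused
    double x, y, z;
    void set(double x, double y, double z) { this.x = x; this.y = y; this.z = z; }
}

final class ReuseDemo {
    public static void main(String[] args) {
        Vec3 scratch = new Vec3();     // allocated once, never discarded
        double sum = 0;
        for (int i = 0; i < 1000000; i++) {
            scratch.set(i, i * 2.0, i * 3.0);  // overwrite fields instead of `new Vec3(...)`
            sum += scratch.x + scratch.y + scratch.z;
        }
        System.out.println(sum);       // no per-iteration garbage for the GC to collect
    }
}
```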


Is Java native memory faster (I think this entails ByteBuffer objects?) than traditional heap memory when there are no serialization issues (for instance if I am comparing it with a huge byte[])?

Direct memory can be faster than using a byte[] if you use non-byte types like int, as it can read/write the whole four bytes without turning the data into individual bytes. However, it is slower than using POJOs because it has to bounds-check every access.
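To make the contrast concrete, here is a small sketch (not a benchmark) of the two access styles: absolute `putInt`/`getInt` on a direct ByteBuffer versus manual byte packing into a `byte[]`:

```java
import java.nio.ByteBuffer;

public class DirectVsArray {
    public static void main(String[] args) {
        final int n = 1000000;

        // Direct buffer: putInt/getInt move a whole 4-byte value per call.
        ByteBuffer direct = ByteBuffer.allocateDirect(n * 4);
        for (int i = 0; i < n; i++) {
            direct.putInt(i * 4, i);
        }

        // byte[]: the same int must be packed and unpacked one byte at a time.
        byte[] array = new byte[n * 4];
        for (int i = 0; i < n; i++) {
            int off = i * 4;
            array[off]     = (byte) (i >>> 24);
            array[off + 1] = (byte) (i >>> 16);
            array[off + 2] = (byte) (i >>> 8);
            array[off + 3] = (byte) i;
        }

        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += direct.getInt(i * 4);  // absolute get: bounds-checked on every access
        }
        System.out.println(sum);
    }
}
```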

Or do the vagaries of garbage collection, etc. render this question unanswerable?

The speed has nothing to do with the GC. The GC only matters when creating or discarding objects.

BTW: If you minimise the number of objects you discard and increase your Eden size, you can prevent even minor collections from occurring for a long time, e.g. a whole day.
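For illustration, young-generation sizing is controlled with standard HotSpot flags; the sizes below are made up (the right values are workload-specific), and `app.jar` is a stand-in for your application:

```
# A large, fixed heap with most of it given to the young generation (-Xmn),
# plus basic GC logging to verify how often collections actually happen.
java -Xms8g -Xmx8g -Xmn6g -verbose:gc -jar app.jar
```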

Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
  • "try creating your objects (have to be mutable) in advance and reuse them without discarding them." One of the best practices at Java is to do not have objects pools, why you said it? – Pih May 02 '11 at 23:28
  • 1
    @Peter Lawrey Is GC really a complete non-issue? Doing everything on the heap requires a larger heap, which I believe requires more management from the GC. Working with native memory I think means that a much smaller heap can be patrolled by the GC; unless I am mistaken. – Michael McGowan May 03 '11 at 00:25
  • @pih I'm not sure I'd say it's a best practice to not have object pools, but they do open up some risks for bugs if objects are not properly cleared and reset between usages. I believe that they should be used as appropriate, and large IO buffers, etc., can often be reused repeatedly rather than reallocating memory to them and creating them on demand. – squawknull May 03 '11 at 03:32
  • @Michael In generational GC, the GC cleans up a generation at a time. Once objects have moved to the tenured space, they have no impact on the young generation. (In fact a different GC is used) – Peter Lawrey May 03 '11 at 07:06
  • @Pih, Only the most efficient object pools actually help performance. Using an object-based ring buffer, for example, is a simple strategy for having reusable objects (i.e. it becomes a large buffer ;) ); see the sketch after these comments. – Peter Lawrey May 03 '11 at 07:09
  • 1
    @squawknull, I would agree that re-using objects is more error-prone than discarding objects, and should be used as appropriate for your system. This might be not at all, or it could be the heart of your system. I wanted to make it clear it's an option, and using direct buffers is not the only one. (Actually, direct buffers are perhaps more error-prone as you have no type checking) – Peter Lawrey May 03 '11 at 07:11
  • I'm still not convinced about object pools. All the GC guides and my experience show that they do not optimize the code as expected. In the case of buffers, which are nothing more than an array of bytes, it is just one object that the JVM creates and destroys really fast, and it is also fast for the GC to remove them from the Eden pool. – Pih May 03 '11 at 08:07
  • @Pih, So what would you call a re-usable array of mutable objects? If it works for `byte[]`, why wouldn't it work for `MyPojo[]`? – Peter Lawrey May 03 '11 at 08:14
  • A byte[] is in fact just one object, the array, which stores the bytes inside it. So from the GC's perspective, it is one object to be removed from the heap during collection. A MyPojo[] is an array of references to MyPojo; the problem could come when the GC has to reclaim the heap space for the MyPojo instances, but creating and destroying these objects is *really* fast for the JVM and the GC, so it is not necessary to worry about a pool of them. Also, using pools will mess with Eden/Survivor, making the GC slower. – Pih May 03 '11 at 10:03
  • Since the array is recycled, it will end up in the tenured space, and once there, there is nothing for the GC to do with it, as it is never discarded and isn't in the young generation spaces. Its only impact is on full GCs, but if these can be avoided, there is no impact at all. – Peter Lawrey May 03 '11 at 10:07
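For concreteness, here is a minimal sketch of the object-based ring buffer Peter Lawrey describes in the comments above; the `Event` class and its fields are invented for illustration:

```java
// A fixed array of mutable events, filled in place and recycled. The array
// and its elements tenure once; after that the GC has nothing new to trace.
final class EventRing {
    static final class Event {       // mutable on purpose, so slots can be reused
        long timestamp;
        int value;
    }

    private final Event[] ring;
    private int next = 0;

    EventRing(int capacity) {
        ring = new Event[capacity];
        for (int i = 0; i < capacity; i++) {
            ring[i] = new Event();   // all allocation happens once, up front
        }
    }

    // Returns the next slot to overwrite instead of allocating a new Event.
    Event claim() {
        Event e = ring[next];
        next = (next + 1) % ring.length;
        return e;
    }
}
```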
2

The point of BigMemory is not that native memory is faster; rather, it reduces the overhead of the garbage collector having to track down references to memory and clean it up. As your heap size increases, so do your GC intervals and CPU commitment. Depending upon the situation, this can create a sort of "glass ceiling" where the Java heap gets so big that the GC turns into a hog, taking up huge amounts of processor power each time it kicks in. Also, many GC algorithms require some level of locking, meaning nobody can do anything until that portion of the GC's reference-tracking algorithm finishes, though many JVMs have gotten much better at handling this.

Where I work, with our app server and JVMs, we found that the "glass ceiling" is about 1.5 GB. If we try to configure the heap larger than that, the GC routine starts eating up more than 50% of total CPU time, so it's a very real cost. We determined this through various forms of GC analysis provided by our JVM vendor.

BigMemory, on the other hand, takes a more manual approach to memory management. It reduces the overhead and sort of takes us back to having to do our own memory cleanup, as we did in C, albeit in a much simpler approach akin to a HashMap. This essentially eliminates the need for a traditional garbage collection routine, and as a result, we eliminate that overhead. I believe that the Terracotta folks used native memory via a ByteBuffer as it's an easy way to get out from under the Java garbage collector.
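This is not Terracotta's actual implementation, but a toy sketch of the general shape: a HashMap-style index on the heap pointing at value bytes stored in a direct ByteBuffer, so the payload itself is invisible to the GC:

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Values live off-heap in one direct buffer; the GC only sees the small
// on-heap index, never the payload bytes themselves.
final class OffHeapStore {
    // Fixed capacity; a real store would grow, evict, or reclaim space.
    private final ByteBuffer data = ByteBuffer.allocateDirect(64 * 1024 * 1024);
    private final Map<String, int[]> index = new HashMap<String, int[]>(); // key -> {offset, length}

    void put(String key, byte[] value) {
        int offset = data.position();
        data.put(value);                       // copy the bytes off-heap
        index.put(key, new int[] { offset, value.length });
    }

    byte[] get(String key) {
        int[] loc = index.get(key);
        if (loc == null) {
            return null;
        }
        byte[] out = new byte[loc[1]];
        ByteBuffer view = data.duplicate();    // independent position/limit
        view.position(loc[0]);
        view.get(out);                         // copy back on-heap
        return out;
    }
}
```

The copy in `get` is exactly the deserialization cost the question asks about: it cannot be avoided if you want a normal on-heap object back.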

The following whitepaper has some good info on how they architected BigMemory and some background on the overhead of the GC: http://www.terracotta.org/resources/whitepapers/bigmemory-whitepaper.

squawknull
  • 5,131
  • 2
  • 17
  • 27
  • Observe that this "glass ceiling" value is highly application dependent. We mostly use 50 to 80 GiB as heap size and don't see any significant overhead of the GC when compared to smaller setups with 20 to 30 GiB. – kap Jan 17 '23 at 19:55
1

I'm intrigued by what could be accomplished with native memory if the serialization issue could be bypassed.

I think that your question is predicated on a false assumption. AFAIK, it is impossible to bypass the serialization issue that they are talking about here. The only thing you could do would be to simplify the objects that you put into BigMemory and use custom serialization / deserialization code to reduce the overheads.
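As a sketch of what "simplify the objects and use custom serialization" might look like, here is a deliberately flat value class using `java.io.Externalizable`; the `Quote` class and its fields are invented for illustration:

```java
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

// Hand-written serialization: three primitive reads/writes, no reflection
// over fields and no nested object graph to walk.
public class Quote implements Externalizable {
    private long timestamp;
    private int instrumentId;
    private double price;

    public Quote() { }  // public no-arg constructor required by Externalizable

    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeLong(timestamp);
        out.writeInt(instrumentId);
        out.writeDouble(price);
    }

    public void readExternal(ObjectInput in) throws IOException {
        timestamp = in.readLong();
        instrumentId = in.readInt();
        price = in.readDouble();
    }
}
```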

While benchmarks might give you a rough idea of the overheads, the actual overheads will be very application specific. My advice would be:

  • Only go down this route if you know you need to. (You will be tying your application to a particular implementation technology.)

  • Be prepared for some intrusive changes to your application if the data involved isn't already managed as a cache.

  • Be prepared to spend some time in (re-)tuning your caching code to get good performance with BigMemory.

  • If your data structures are complicated, expect proportionately larger runtime overheads and tuning effort.

Stephen C
  • 698,415
  • 94
  • 811
  • 1,216