Why do we need to manually handle reference counting for Netty ByteBuf if JVM GC is still in place?

Question

According to the book Netty in Action v10, reference counting is used to handle the pooling of ByteBuf. But the JVM is not aware of Netty's reference counting, so the JVM can still GC the ByteBuf. If so, why do we still need to care about reference counting and manually call the release() method?

To add some context, here is a quote from the book Netty in Action v10:

One of the tradeoffs of reference-counting is that the user have to be carefully when consume messages. While the JVM will still be able to GC such a message (as it is not aware of the reference-counting) this message will not be put back in the pool from which it may be obtained before. Thus chances are good that you will run out of resources at one point if you do not carefully release these messages.

And some related threads: Buffer ownership in Netty 4: How is buffer life-cycle managed?

https://blog.twitter.com/2013/netty-4-at-twitter-reduced-gc-overhead

ADD 1

(Below is some more of my understanding.)

A ByteBuf can be categorized from 2 perspectives:

1. Pooled or Unpooled
2. Heap-based or Direct

So there can be 4 combinations:

(a) Pooled Heap-based
(b) Pooled Direct
(c) Unpooled Heap-based
(d) Unpooled Direct

Only (a) and (c) are affected by the JVM GC mechanism, because they are heap-based.

In the above quotation from Netty in Action v10, I think "the message" means a Java object, which falls into category (a).

One ultimate rule is: once a Java object is GCed, it is gone for good. So below is what I think Netty does:

For (a), the Netty allocator MUST trick the JVM GC into believing the object should never be GCed, and then use ref counting to move the object out of and back into the pool. This is another form of life cycle.
For (b), the JVM GC is not involved because the memory is not on the JVM heap. The Netty allocator needs to use ref counting to move the object out of and back into the pool.
For (c), the JVM GC takes full responsibility for the life of the object. The Netty allocator just provides an API for allocating the object.
For (d), the JVM GC is not involved, and no pooling is needed. So the Netty allocator only needs to provide an API for allocating and releasing the object.

For performance, read this: https://groups.google.com/forum/#!topic/netty/CnGCvnyHCXQ – jean, Mar 22 '17 at 11:05

score 28 · Accepted Answer · edited May 23 '17 at 12:34

Direct buffers are indirectly freed by the Garbage Collector. I will let you read through the answer to this question to understand how that happens: Are Java DirectByteBuffer wrappers garbage collected?

Heap buffers need to be copied to direct memory before being handled by the kernel when you perform I/O operations. Using direct buffers saves that copy, which is their main advantage. A drawback is that allocating direct memory is more expensive than allocating from the Java heap, so Netty introduced the pooling concept.
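To make the heap/direct distinction concrete, here is a minimal sketch that uses only the JDK's `java.nio.ByteBuffer`, not Netty's `ByteBuf`. The class name `HeapVsDirectDemo` and the helper `describe` are made up for this illustration. A heap buffer is backed by a `byte[]` on the Java heap, while a direct buffer lives in native memory and exposes no backing array.

```java
import java.nio.ByteBuffer;

public class HeapVsDirectDemo {

    // Hypothetical helper: reports where a buffer's bytes live,
    // using only standard java.nio calls.
    static String describe(ByteBuffer buf) {
        return (buf.isDirect() ? "direct" : "heap") + ", hasArray=" + buf.hasArray();
    }

    public static void main(String[] args) {
        // Backed by a byte[] on the Java heap.
        ByteBuffer heap = ByteBuffer.allocate(1024);
        // Backed by native memory outside the Java heap.
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);

        System.out.println(describe(heap));   // heap, hasArray=true
        System.out.println(describe(direct)); // direct, hasArray=false
    }
}
```

Netty's heap and direct `ByteBuf` flavors follow the same split, with pooling layered on top.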

Pooling objects in Java is a controversial topic, but Netty's choice to do so seems to have paid off, and the Twitter article you cited shows some evidence of that. For the particular case of allocating buffers, when the buffer is large, pooling brings a real benefit in both the direct and heap buffer cases.

Now for pooling: the GC doesn't reclaim buffers while they are pooled, because something always holds a reference to them. Either your application holds one or more references while it is using the buffer, or Netty's pool holds a reference, both when the buffer has just been allocated and not yet handed to your application and after your application has used it and given it back to the pool.

Leaks happen when your application, after it has finished using a buffer and no longer keeps any reference to it, doesn't call release(), which is what actually puts the buffer back into the pool. In that case the buffer will eventually be garbage collected, but Netty's pool won't know about it. The pool will then grow, believing that you are using more and more buffers that are never returned. That will probably produce a memory leak because, even if the buffers themselves are garbage collected, the internal data structures used to store the pool will not be.
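The effect of a forgotten release() can be sketched with a toy pool in plain Java. This is not Netty's implementation: `ToyPool`, `ToyBuf`, and `chunksCreated` are hypothetical names invented for this illustration, and the sketch is single-threaded. It only shows the principle that a buffer whose reference count never reaches zero is never handed back, so the pool has to keep allocating fresh memory.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ToyPoolDemo {

    /** Hypothetical pool of fixed-size chunks; not thread-safe. */
    static final class ToyPool {
        private final Deque<byte[]> free = new ArrayDeque<>();
        private final int chunkSize;
        private int created; // how many chunks were ever allocated

        ToyPool(int chunkSize) { this.chunkSize = chunkSize; }

        ToyBuf acquire() {
            byte[] chunk = free.poll();      // reuse a returned chunk if there is one
            if (chunk == null) {
                chunk = new byte[chunkSize]; // otherwise pay for a new allocation
                created++;
            }
            return new ToyBuf(this, chunk);
        }

        void giveBack(byte[] chunk) { free.push(chunk); }

        int created() { return created; }
    }

    /** Hypothetical reference-counted buffer handed out by ToyPool. */
    static final class ToyBuf {
        private final ToyPool pool;
        private byte[] chunk;
        private int refCnt = 1; // a fresh buffer starts with one reference

        ToyBuf(ToyPool pool, byte[] chunk) {
            this.pool = pool;
            this.chunk = chunk;
        }

        ToyBuf retain() {
            if (refCnt == 0) throw new IllegalStateException("already released");
            refCnt++;
            return this;
        }

        /** Returns true when the count hits zero and the chunk goes back to the pool. */
        boolean release() {
            if (refCnt == 0) throw new IllegalStateException("already released");
            if (--refCnt == 0) {
                pool.giveBack(chunk);
                chunk = null;
                return true;
            }
            return false;
        }
    }

    // Acquires a buffer `rounds` times and reports how many chunks the pool had to create.
    static int chunksCreated(int rounds, boolean callRelease) {
        ToyPool pool = new ToyPool(256);
        for (int i = 0; i < rounds; i++) {
            ToyBuf buf = pool.acquire();
            if (callRelease) {
                buf.release();
            }
            // Without release(), buf becomes garbage here, but the pool never gets the chunk back.
        }
        return pool.created();
    }

    public static void main(String[] args) {
        System.out.println(chunksCreated(1000, true));  // 1
        System.out.println(chunksCreated(1000, false)); // 1000
    }
}
```

With release() the same chunk is recycled on every round. Without it, every round forces a new allocation even though the GC is free to collect the abandoned wrappers.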

Struggling myself with what looks like a Netty buffer leak right now, I'm wondering: why couldn't we just add a finalize() method to the abstract ByteBuf class and have that method call release()? It seems that would solve the problem, but am I overlooking something here? – raner, Apr 04 '18 at 00:28
Finalizers are often dangerous: http://www.informit.com/articles/article.aspx?p=1216151&seqNum=7 – Leo Gomes, Apr 04 '18 at 15:14
@LeoGomes: "In such case, the buffer will eventually be garbage collected, but Netty's pool won't know about it." How can the buffer be garbage collected when the pool is holding a reference to it? – pippoflow, Aug 19 '19 at 09:27

score 2 · Answer 2 · answered Feb 22 '15 at 13:21

ByteBuf uses off-heap memory, so it is not visible to the GC. That is why you need to update the reference count (otherwise Netty will not know when to release the item). Best regards

Yulian Oifa

Thanks for the reply, but I don't think so. A ByteBuf can be heap-based (I think the heap means the JVM GC heap), a direct buffer (not heap-based, allocated via native calls), or a composite buffer. – smwikipedia Feb 22 '15 at 13:27
Thanks. That link is helpful. I need some time to understand it. Anyway, I don't think it's a good idea to expose the reference count details to Netty users. – smwikipedia Feb 22 '15 at 14:19
According to the wiki page you posted: *"Since Netty version 4, the life cycle of certain objects are managed by their reference counts rather than the garbage collector. "* How could Netty override the JVM GC mechanism? – smwikipedia Feb 22 '15 at 14:31
Hello. It can use native libraries, which in turn use malloc and similar functions to get direct access to memory. Since the JVM cannot control that, this memory is allocated off-heap. Best regards – Yulian Oifa Feb 22 '15 at 14:40
I think I am starting to follow you. I read this: "Because we cannot rely on GC to put the unused buffers into the pool, we have to be very careful about leaks. Even a single handler that forgets to release a buffer can make our server’s memory usage grow boundlessly." from https://blog.twitter.com/2013/netty-4-at-twitter-reduced-gc-overhead – smwikipedia Feb 22 '15 at 14:47
It seems `ByteBufAllocator` implements its own memory allocation mechanism, so the memory it allocates should ALL be outside the control of the JVM GC. But I read this in the Netty book: *"The most frequently used ByteBuf pattern stores the data in the heap space of the JVM"*. How should I understand this? Does it mean `ByteBufAllocator` can also allocate memory from the JVM heap? – smwikipedia Feb 22 '15 at 15:01
Hello. Yes, it can use both off-heap and on-heap memory. On-heap memory does not cause any problem since it is controlled by the GC, while off-heap memory is not controlled by anything, so if you forget to release it you will run out of memory at some point. Best regards – Yulian Oifa Feb 22 '15 at 16:10
