11

All!

I found strange code in LinkedBlockingQueue:

private E dequeue() {
        // assert takeLock.isHeldByCurrentThread();
        Node<E> h = head;
        Node<E> first = h.next;
        h.next = h; // help GC
        head = first;
        E x = first.item;
        first.item = null;
        return x;
}

Can anyone explain why we need the local variable h? How does it help the GC?

Vadzim
Alex Guzanov
  • maybe the comment is outdated (can you find the version control and track down the comment for the commit?) :) – milan Jan 11 '12 at 13:00
  • 2
    It's interesting that this temp var and comment were added only in Java 7, so it was definitely intentional. See http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/concurrent/LinkedBlockingQueue.java#LinkedBlockingQueue.extract%28%29 and http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7-b147/java/util/concurrent/LinkedBlockingQueue.java#LinkedBlockingQueue.dequeue%28%29 – Vadzim Jan 11 '12 at 13:34
  • That's OpenJDK; LinkedBlockingQueue in Oracle's JDK 6 on Mac (1.6.0_29-b11-402.jdk) has the temp variable. – milan Jan 11 '12 at 15:52
  • At least it certainly wasn't there in java 5. – Vadzim Jan 12 '12 at 18:00

4 Answers

7

If you look at the jsr166 source, you will find the offending commit; scroll down to v1.51.

The commit shows that the answer is in this bug report.

The full discussion can be found in a jsr166 mailing list thread.

The "helping GC" bit is about avoiding objects bleeding into the tenured generation.

Matt
5

Maybe a bit late, but the current explanation is completely unsatisfactory to me and I think I've got a more sensible explanation.

First of all, every Java GC does some kind of tracing from a root set, one way or another. This means that if the old head is collected, the GC never reads its next field; there's no reason to do so. Hence, IF head is collected in the next iteration, the assignment doesn't matter.

The IF in the above sentence is the important part here. Whether next is set to something else makes no difference for collecting head itself, but it may make a difference for other objects.

Let's assume a simple generational GC: If head is in the young set, it will be collected in the next GC anyhow. But if it's in the old set it will only be collected when we do a full GC which happens rarely.

So what happens if head is in the old set and we do a young GC? In this case the JVM assumes that every object in the old heap is still alive and adds every reference from old to young objects to the root set for the young GC. And that's exactly what the assignment avoids here: writes into the old heap are generally protected by a write barrier or something similar, so that the JVM can catch such assignments and handle them correctly. In our case the assignment removes the object that next pointed to from the root set, which does have consequences.

Short example:

Assume we have 1 (old) -> 2 (young) -> 3 (xx). If we now remove 1 and 2 from our list, we may expect that both elements would be collected by the next GC. But if only a young GC occurs and we have NOT removed the next pointer in the old object, neither 1 nor 2 will be collected. In contrast, if we have removed the pointer in 1, 2 will be collected by the young GC.
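The 1 -> 2 -> 3 example can be modelled in plain Java without touching the real collector. This is only a sketch: `Node`, `unlink` and `retainedVia` are hypothetical names, and `retainedVia` merely walks `next` pointers to show what a young collection would have to keep alive if it treated node 1 as a root (as it would for an object in the old generation). It does not measure actual GC behaviour.

```java
import java.util.ArrayList;
import java.util.List;

final class OldToYoungSketch {
    static final class Node {
        final int id;
        Node next;
        Node(int id) { this.id = id; }
    }

    /** Builds the chain 1 -> 2 -> 3 and returns node 1. */
    static Node buildChain() {
        Node n1 = new Node(1);
        Node n2 = new Node(2);
        Node n3 = new Node(3);
        n1.next = n2;
        n2.next = n3;
        return n1;
    }

    /** Removes the head node; optionally self-links it. Returns the new head. */
    static Node unlink(Node head, boolean selfLink) {
        Node first = head.next;
        if (selfLink) {
            head.next = head; // the "help GC" assignment
        }
        return first;
    }

    /** Ids of the nodes still reachable through start.next, excluding start itself. */
    static List<Integer> retainedVia(Node start) {
        List<Integer> ids = new ArrayList<>();
        Node prev = start;
        Node p = start.next;
        while (p != null && p != prev) { // stop at the end or at a self-link
            ids.add(p.id);
            prev = p;
            p = p.next;
        }
        return ids;
    }

    public static void main(String[] args) {
        Node a = buildChain();
        unlink(unlink(a, false), false);    // remove 1 and 2, keep the links
        System.out.println(retainedVia(a)); // [2, 3]

        Node b = buildChain();
        unlink(unlink(b, true), true);      // remove 1 and 2, self-link each
        System.out.println(retainedVia(b)); // []
    }
}
```

In the first case the dead node 1 still drags node 2 along; in the second it references nothing but itself.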

Voo
  • I really don't see any difference between what you are saying and what I said. I've also said that not removing the next may cause the GC to collect the head faster. – Tudor Jan 14 '12 at 08:08
  • 1
    @Tudor Well I explained which mechanism would cause the change in behavior and under which circumstances it would happen, which certainly seems a bit more substantial to me than "something could cause the GC to act differently". Also it's not about head being collected earlier (as that's obviously nonsense), but about subsequent elements - different objects those are. – Voo Jan 14 '12 at 13:00
  • I've stated in my comment that in the absence of the actual implementation, I can only speculate about what that assignment is supposed to do. You, on the other hand, have made quite a few absolute statements without actually providing any source and used terms like "young GC" without actually introducing them. – Tudor Jan 14 '12 at 13:05
  • 1
    @Tudor Yeah I'm assuming some basic knowledge about how Java garbage collectors (well actually any generational GC) work. In a question that's about some fine-tuned optimization for GCs that seems not unreasonable - otherwise any post about GC specifics would get pages long. Is it limiting to assume a generational GC? To some degree sure, but every production GC I know of (azul, the usual ones in hotspot and ibm - MS does so for .NET as well) has that notion of generations in one way or the other for obvious reasons. – Voo Jan 14 '12 at 13:21
0

To better understand what happens, let's see what the list looks like after executing the code. First consider an initial list:

1 -> 2 -> 3

Then h points to head and first to h.next:

1 -> 2 -> 3
|    |
h    first

Then h.next points to h and head points to first:

1 -> 2 -> 3
|   / \
h head first

Now, practically, we know that there is only one active reference pointing to the first element, namely its own (h.next = h). We also know that the GC collects objects that have no more active references, so when the method ends, the (old) head of the list can be safely collected by the GC, since h exists only within the scope of the method.

Having said this, it was pointed out, and I agree with this, that even with the classic dequeue method (i.e. just make first point to head.next and head point to first) there are no more references pointing towards the old head. However, in this scenario, the old head is left dangling in memory and still has its next field pointing towards first, whereas in the code you posted, the only thing left is an isolated object pointing to itself. This may be triggering the GC to act faster.
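The two variants can be put side by side in a small sketch. It is a single-threaded toy queue, not the JDK class: `DequeueSketch`, `dequeueClassic` and `dequeueSelfLinking` are hypothetical names, there is no locking, and the caller is assumed to dequeue only from a non-empty queue. It only shows what the detached old head points to afterwards.

```java
final class DequeueSketch<E> {
    static final class Node<E> {
        E item;
        Node<E> next;
        Node(E item) { this.item = item; }
    }

    // head is a dummy node: head.item is null and head.next holds the first element
    Node<E> head = new Node<>(null);
    Node<E> last = head;

    void enqueue(E e) {
        Node<E> n = new Node<>(e);
        last.next = n;
        last = n;
    }

    /** Classic variant: the old head keeps pointing into the live list. */
    E dequeueClassic() {
        Node<E> first = head.next;
        head = first;
        E x = first.item;
        first.item = null; // first becomes the new dummy head
        return x;
    }

    /** Variant from the question: the old head is made to point to itself. */
    E dequeueSelfLinking() {
        Node<E> h = head;
        Node<E> first = h.next;
        h.next = h; // help GC
        head = first;
        E x = first.item;
        first.item = null;
        return x;
    }

    public static void main(String[] args) {
        DequeueSketch<String> q1 = new DequeueSketch<>();
        q1.enqueue("a");
        q1.enqueue("b");
        Node<String> old1 = q1.head;
        String x1 = q1.dequeueClassic();
        System.out.println(x1 + " " + (old1.next == q1.head)); // a true

        DequeueSketch<String> q2 = new DequeueSketch<>();
        q2.enqueue("a");
        q2.enqueue("b");
        Node<String> old2 = q2.head;
        String x2 = q2.dequeueSelfLinking();
        System.out.println(x2 + " " + (old2.next == old2)); // a true
    }
}
```

Both variants return the same element; the only observable difference is where the unreachable old head's next field points.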

Bohemian
Tudor
  • Downvoting cause h would have no more references anyway even without `h.next = h`. – Vadzim Jan 11 '12 at 12:56
  • 1
    @Vadzim: I was not stating that, I was just explaining what happens in the code he posted. – Tudor Jan 11 '12 at 12:57
  • The question is not about that. It's about why not just `first = head = head.next`. But thanks for the graphics anyway. Note that head itself never contains the first item. – Vadzim Jan 11 '12 at 13:03
  • @Vadzim: I'm thinking it may be an optimization in the sense that if you use h like this, then by the end of the method scope you definitely have no more links from and to the head, whereas with the classic dequeue it's just left dangling in memory with a reference towards reachable objects. – Tudor Jan 11 '12 at 13:15
  • 2
    Why would having a reference towards reachable objects bother the GC? – Vadzim Jan 11 '12 at 13:21
  • 1
    @Vadzim: theoretically it doesn't, but until I see the actual implementation of the GC, I think I can assume that any kind of subtle difference may affect the speed at which it decides to collect an object. – Tudor Jan 11 '12 at 13:24
  • @Tudor you may be interested in the explanation I came up with - it doesn't have anything to do with the head variable itself, because that will be collected completely independent of what its `next` is pointing to - basic tracing GC 101. – Voo Jan 14 '12 at 03:32
0

Here's a code sample that illustrates the question: http://pastebin.com/zTsLpUpq. Performing a GC after runWith() and taking a heap dump for both versions says there's only one Item instance.

milan