
I decided to continue https://stackoverflow.com/a/41998907/2674303 in a separate question.

Consider the following example:

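import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class SimpleGCExample {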
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue=new ReferenceQueue<>();
        SimpleGCExample e = new SimpleGCExample();
        Reference<Object> pRef=new PhantomReference<>(e, queue),
                wRef=new WeakReference<>(e, queue);
        e = null;
        for(int count=0, collected=0; collected<2; ) {
            Reference<?> ref=queue.remove(100);
            if(ref==null) {
                System.gc();
                count++;
            }
            else {
                collected++;
                System.out.println((ref==wRef? "weak": "phantom")
                        +" reference enqueued after "+count+" gc polls");
            }
        }
    }

    @Override
    protected void finalize() throws Throwable {
        System.out.println("finalizing the object in "+Thread.currentThread());
        Thread.sleep(100);
        System.out.println("done finalizing.");
    }
}

Java 11 prints the following:

finalizing the object in Thread[Finalizer,8,system]
weak reference enqueued after 1 gc polls
done finalizing.
phantom reference enqueued after 3 gc polls

The first two lines can appear in either order; it looks like they are produced in parallel.

The last line sometimes reports 2 gc polls and sometimes 3.

So I see that enqueuing the PhantomReference takes more GC cycles. How can this be explained? Is it mentioned somewhere in the documentation (I can't find it)?

P.S.

WeakReference Javadoc:

Suppose that the garbage collector determines at a certain point in time that an object is weakly reachable. At that time it will atomically clear all weak references to that object and all weak references to any other weakly-reachable objects from which that object is reachable through a chain of strong and soft references. At the same time it will declare all of the formerly weakly-reachable objects to be finalizable. At the same time or at some later time it will enqueue those newly-cleared weak references that are registered with reference queues

PhantomReference Javadoc:

Suppose the garbage collector determines at a certain point in time that an object is phantom reachable. At that time it will atomically clear all phantom references to that object and all phantom references to any other phantom-reachable objects from which that object is reachable. At the same time or at some later time it will enqueue those newly-cleared phantom references that are registered with reference queues

The difference is not clear to me.

P.S. (We are talking about an object with a non-trivial finalize method.)

I got an answer to my question from @Holger:

He (no sexism, but I suppose so) pointed me to the Javadoc and noted that the description of phantom reachability contains an extra phrase compared with Soft and Weak references:

An object is weakly reachable if it is neither strongly nor softly reachable but can be reached by traversing a weak reference. When the weak references to a weakly-reachable object are cleared, the object becomes eligible for finalization.
An object is phantom reachable if it is neither strongly, softly, nor weakly reachable, it has been finalized, and some phantom reference refers to it

My next question was what "it has been finalized" means. I expected it to mean that the finalize method had finished.

To check this, I modified the application like this:

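import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class SimpleGCExample {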
    static SimpleGCExample object;

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        SimpleGCExample e = new SimpleGCExample();
        Reference<Object> pRef = new PhantomReference<>(e, queue),
                wRef = new WeakReference<>(e, queue);
        e = null;
        for (int count = 0, collected = 0; collected < 2; ) {
            Reference<?> ref = queue.remove(100);
            if (ref == null) {
                System.gc();
                count++;
            } else {
                collected++;
                System.out.println((ref == wRef ? "weak" : "phantom")
                        + " reference enqueued after " + count + " gc polls");
            }
        }
    }

    @Override
    protected void finalize() throws Throwable {
        System.out.println("finalizing the object in " + Thread.currentThread());
        Thread.sleep(10000);
        System.out.println("done finalizing.");
        object = this;
    }
}

I see the following output:

weak reference enqueued after 1 gc polls
finalizing the object in Thread[Finalizer,8,system]
done finalizing.

And the application hangs. I think this is because, for Weak/Soft references, the GC works as follows: as soon as the GC detects that the object is weakly/softly reachable, it performs two actions in parallel:

  • enqueue the Weak/Soft reference into the registered ReferenceQueue instance
  • run the finalize method

So for adding to the ReferenceQueue, it doesn't matter whether the object was resurrected or not.

But for PhantomReference the actions are different. As soon as the GC detects that the object is phantom reachable, it performs the following actions sequentially:

  • run the finalize method
  • check that the object is still only phantom reachable (i.e. that it was not resurrected during the execution of the finalize method); only if it is does the GC add the phantom reference to the ReferenceQueue

But @Holger said that "it has been finalized" means that the JVM has initiated the finalize() invocation, and that for adding the PhantomReference to the ReferenceQueue it doesn't matter whether it has finished or not. But my example seems to show that it really does matter.
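One way to probe the hang is the following minimal sketch. It assumes a HotSpot JVM (Java 9 or later, for `Reference.reachabilityFence`) with finalization enabled and `System.gc()` honoured; the class name `ResurrectionDemo`, the helper `pollWithGc` and the poll limits are made up for illustration. The finalizer resurrects the object through a static field, the program checks that nothing is enqueued while that field is set, then clears the field and polls again.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class ResurrectionDemo {
    // Plays the role of the static field `object` in the example above.
    static volatile ResurrectionDemo resurrected;
    static final CountDownLatch finalizeRan = new CountDownLatch(1);

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        resurrected = this;          // the object escapes: strongly reachable again
        finalizeRan.countDown();
    }

    // Requests a GC up to maxPolls times; returns true as soon as a reference is enqueued.
    static boolean pollWithGc(ReferenceQueue<Object> queue, int maxPolls)
            throws InterruptedException {
        for (int i = 0; i < maxPolls; i++) {
            System.gc();
            if (queue.remove(100) != null) return true;
        }
        return false;
    }

    // Returns {enqueued while resurrected, enqueued after the static field was cleared}.
    static boolean[] run() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Reference<Object> pRef = new PhantomReference<>(new ResurrectionDemo(), queue);
        try {
            // Keep requesting GCs until the finalizer thread has run finalize().
            for (int i = 0; !finalizeRan.await(100, TimeUnit.MILLISECONDS); i++) {
                if (i == 100) throw new IllegalStateException("finalize() never ran");
                System.gc();
            }
            boolean whileResurrected = pollWithGc(queue, 5);
            resurrected = null;      // drop the last strong reference
            boolean afterRelease = pollWithGc(queue, 50);
            return new boolean[] {whileResurrected, afterRelease};
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        } finally {
            Reference.reachabilityFence(pRef);   // keep the reference object itself alive
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("enqueued while resurrected: " + r[0]);
        System.out.println("enqueued after clearing the field: " + r[1]);
    }
}
```

Under these assumptions one would expect the first value to be false and the second to be true, which suggests the hang comes from the static field keeping the object strongly reachable rather than from the GC waiting on the finalizer as such.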

Frankly speaking, I don't understand why adding to the ReferenceQueue works differently for Weak and Soft references. What was the idea?

gstackoverflow
  • You should not let questions evolve in such a way. Besides that, what you’ve inserted, is horribly wrong. When the finalize method has not been invoked yet, the object *is not phantom reachable*. Whether there are phantom references or not, is entirely irrelevant for an object whose (nontrivial) finalize method has not been executed. And the garbage collector does not perform those named actions sequentially. It is *your code* that invokes `System.gc()` periodically. Under normal circumstances, nothing will happen after `finalize()` completed. – Holger Jun 24 '19 at 15:08
  • @Holger The code is awful, I agree, but I want to know why it behaves the way I see. The code invokes (or tries to invoke) the GC just to detect what will happen if the GC starts right now. – gstackoverflow Jun 24 '19 at 15:17
  • @Holger I see that in practice the GC adds the phantom reference to the ReferenceQueue only after the finalize method has terminated, even if it takes very long. You disagree. Let's discuss that point first. Why do you think so? – gstackoverflow Jun 24 '19 at 15:19
  • I did not disagree. I said, the object is considered finalized when the `finalize()` method has been invoked, but to be *phantom reachable*, it must also be “*neither strongly, softly, nor weakly reachable*”. In practice, this means, the method must be completed and *especially*, it means the object must not escape. I never said something different. But you keep ignoring that point. – Holger Jun 24 '19 at 15:24
  • But the WeakReference doc contains the phrase “neither strongly, softly” too. But my example shows that escaping does not prevent adding to the ReferenceQueue. Why? – gstackoverflow Jun 24 '19 at 15:43
  • It seems, you are thinking that it was enough if these conditions were fulfilled at one point in the object’s history. But this is not the case. *All these conditions must be fulfilled at the same time*. A weak reference may get cleared because its conditions are all fulfilled at a particular point of time. But it is not phantom reachable when the finalize method has not been invoked. Then, invoking the `finalize()` method is already a resurrection, at least for some time. Now, it’s finalized, but the other conditions are not fulfilled for phantom reachability. They have to be fulfilled again. – Holger Jun 24 '19 at 15:57
  • To illustrate the matter, look at [this example program on ideone](https://ideone.com/LosVAh). It creates a new `WeakReference` after the `finalize()` method has been invoked, showing that after the `finalize()` method has been invoked, i.e. the condition making the difference has been fulfilled, the remaining identical condition will be fulfilled at the same time. We could be nitpicking here, as the spec says that the object *must not* be weakly reachable to be phantom reachable, but since the weak reference gets cleared anyway, that doesn’t matter and doesn’t deserve an additional gc cycle. – Holger Jun 24 '19 at 16:59
  • **But it is not phantom reachable when the finalize method has not been invoked.** - I agree. But in practice I see that the PhantomReference might be enqueued only AFTER finalize() method TERMINATION!!! Is it an implementation detail or is it designed so according to the documentation? – gstackoverflow Jun 25 '19 at 10:12
  • At one point of time, the condition “neither strongly nor softly reachable” is fulfilled, so weak references are cleared and handed over for enqueuing, but phantom references are not, because the object is not finalized. Then, the `finalize()` method is invoked, now the object is finalized, but also strongly reachable, so the first condition is not fulfilled, so the object still is not phantom reachable. The object has to become “neither strongly nor softly reachable” *again*. As demonstrated in my previous comment, this also applies to newly created weak references. – Holger Jun 25 '19 at 10:12
  • Give me some time to reread, please) – gstackoverflow Jun 25 '19 at 10:14
  • So you mean that during finalize() method invocation the object becomes strongly reachable and if finalize() doesn't resurrect the object it means that after finalize method termination the object becomes phantom reachable(or even unreachable at all). Is it correct ? – gstackoverflow Jun 25 '19 at 10:19
  • If so, I would say that I understand HOW it works and that it corresponds to the doc, but I don't understand WHY it was designed so. – gstackoverflow Jun 25 '19 at 10:28
  • Exactly. That’s the usual pattern. In theory, the object may become unreachable earlier than the method completion, just like with ordinary method executions, where the optimizer may drop the reference after the last actual use of the object. That still implies that the object did not escape and the difference is hardly ever noticeable. There are several technical subtleties giving reasons for never experiencing this in practice for finalizers. – Holger Jun 25 '19 at 10:29
  • The *intent* is a different thing. As said in [this comment](https://stackoverflow.com/questions/56705169/why-enqueuing-of-phantomreference-takes-more-gc-cycles-than-weakreference-or-sof?noredirect=1#comment100036362_56735234), the difference is the reason why these two reference types exist at all. In a perfect world, there were only soft and weak references and finalization did not exist. Since we have finalization, we have a need for different semantics. But as the weird old “phantom references are not cleared” rule indicates, the semantics might not have been fully understood at that time. – Holger Jun 25 '19 at 10:35
  • @Holger Thanks for your effort) it was a hard but fruitful discussion . Now it is a good time to cool a brain after it) – gstackoverflow Jun 25 '19 at 11:17

2 Answers


The key point is the definition of “phantom reachable” in the package documentation:

  • An object is phantom reachable if it is neither strongly, softly, nor weakly reachable, **it has been finalized**, and some phantom reference refers to it.

bold emphasis mine

Note that when we remove the finalize() method, the phantom reference gets collected immediately, together with the weak reference.

This is the consequence of JLS §12.6:

For efficiency, an implementation may keep track of classes that do not override the finalize method of class Object, or override it in a trivial way.

We encourage implementations to treat such objects as having a finalizer that is not overridden, and to finalize them more efficiently, as described in §12.6.1.

Unfortunately, §12.6.1 does not go into the consequences of “having a finalizer that is not overridden”, but it’s easy to see, that the implementation just treats those objects like being already finalized, never enqueuing them for finalization and hence, being able to reclaim them immediately, which affects the majority of all objects in typical Java applications.

Another point of view is that the necessary steps for ensuring that the finalize() method will eventually get invoked, i.e. the creation and linking of a Finalizer instance, will be omitted for objects with a trivial finalizer. Also, eliminating the creation of purely local objects after Escape Analysis, only works for those objects.
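The claim about objects without a nontrivial finalizer can be checked with a small variation of the question's program. This is only a sketch, assuming a HotSpot JVM (Java 9 or later) that honours `System.gc()`; the class name `NoFinalizerDemo` and the limit of 100 polls are made up for illustration. The referent is a plain `Object`, so no `Finalizer` instance is involved.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class NoFinalizerDemo {
    // Returns {gc polls until the weak reference, gc polls until the phantom reference};
    // an entry stays -1 if that reference was never seen.
    static int[] run() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();              // no finalize() override
        Reference<Object> pRef = new PhantomReference<>(referent, queue);
        Reference<Object> wRef = new WeakReference<>(referent, queue);
        referent = null;
        int[] polls = {-1, -1};
        try {
            for (int count = 0, collected = 0; collected < 2 && count < 100; ) {
                Reference<?> ref = queue.remove(100);
                if (ref == null) {
                    System.gc();                     // nothing arrived: request another GC
                    count++;
                } else {
                    polls[ref == wRef ? 0 : 1] = count;
                    collected++;
                }
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        Reference.reachabilityFence(pRef);           // keep the phantom reference object alive
        return polls;
    }

    public static void main(String[] args) {
        int[] polls = run();
        System.out.println("weak reference enqueued after " + polls[0] + " gc polls");
        System.out.println("phantom reference enqueued after " + polls[1] + " gc polls");
    }
}
```

If the description above holds on the JVM at hand, both lines should report the same poll count, though the specification does not guarantee exact timing.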

Since there is no behavioral difference between weak references and phantom references for objects without a finalizer, we can say that the presence of finalization, and its possibility to resurrect objects, is the only reason for the existence of phantom references, to be able to perform an object’s cleanup only when it is safe to assume that it can’t get resurrected anymore¹.

¹ Though, before Java 9, this safety was not bullet-proof, as phantom references were not automatically cleared and deep reflection made it possible to pervert the whole concept.

Holger
  • **it has been finalized** means that finalize method invocation was finished? – gstackoverflow Jun 24 '19 at 12:21
  • For a nontrivial `finalize()` method, it means, the method has been invoked (and the “is neither strongly, … reachable” implies that its execution also has been completed, usually). For the objects with a trivial `finalize()` method, it may mean, the method will never be invoked, so being finalized would be their initial status. – Holger Jun 24 '19 at 12:25
  • **the implementation just treats those objects like being already finalized, never enqueuing them** Here enqueuing you mean queue for finalize method invocation? – gstackoverflow Jun 24 '19 at 12:26
  • Yes, I inserted a clarification, as “enqueuing” without attribution could indeed be misleading. – Holger Jun 24 '19 at 12:28
  • so for the non-trivial case **to be finalized** means that we started the invocation of the finalize method, and even if it takes 5 min we consider the object finalized just after we started the invocation? – gstackoverflow Jun 24 '19 at 12:44
  • Yes, the JVM’s duty on finalization ends when the `finalize()` method is invoked, but keep in mind that the object has to become “neither strongly, softly, nor weakly reachable”, to be considered phantom reachable after that. – Holger Jun 24 '19 at 13:20
  • So to be 100% sure that we haven't resurrected the object, the JVM has to await *finalize()* method termination? – gstackoverflow Jun 24 '19 at 13:22
  • But we can resurrect a WeakReference and even a strong reference – gstackoverflow Jun 24 '19 at 13:24
  • But anyway I don't see an answer to my question: **Why does Phantom Reference reclaiming require more GC cycles than Weak and SoftReference?** Your answer is that the package Javadoc states that **it has been finalized** and we don't have this phrase for Soft and Weak. But according to the discussion in the comments, you've said that if the JVM started the finalize method invocation, it means that the JVM considers the object finalized. – gstackoverflow Jun 24 '19 at 13:26
  • In practice I see that if I increase the finalize method sleep to 10 sec, I get the following output: **finalizing the object in Thread[Finalizer,8,system] weak reference enqueued after 1 gc polls done finalizing. phantom reference enqueued after 90 gc polls** – gstackoverflow Jun 24 '19 at 13:35
  • So enqueuing of the phantom reference happens just after the *finalize()* method finishes – gstackoverflow Jun 24 '19 at 13:36
  • You are notoriously ignoring the point, “*neither strongly, softly, nor weakly reachable*”. When the execution of the `finalize()` method starts, the object is *strongly reachable*, as the finalizer thread is a live thread whose stack contains a reference to the object whose `finalize()` method is executed. So at this time, the weak and soft references which existed prior to the finalization have been cleared, but it requires at least an additional garbage collection cycle to detect that it is not strongly reachable anymore. – Holger Jun 24 '19 at 13:37
  • So an object is only *phantom reachable*, if its finalization started *and* no other references exist. In most practical cases, this implies that the execution of the `finalize()` method has been finished. Only in rare cases, optimized code execution could allow the object to be considered unreachable or phantom reachable while the method is still running, when it has been proven that it won’t touch the object again. This implies that resurrection is impossible at this point, even in that theoretical case. – Holger Jun 24 '19 at 13:41
  • If we resurrect PhantomReference during finalize method execution we **DON'T EXPECT** to see the reference in a linked ReferenceQueue. – gstackoverflow Jun 24 '19 at 13:51
  • If we resurrect WeakReference during finalize method execution we **EXPECT** to see the reference in a linked ReferenceQueue – gstackoverflow Jun 24 '19 at 13:52
  • What do you mean with “resurrect PhantomReference” or “resurrect WeakReference”? You are confusing the terms. You can resurrect the object, whose `finalize()` method is currently executed, but that won’t change the fact that weak references existing prior to finalization are already cleared and enqueued (or in the process of enqueuing). Whereas the phantom references are unaffected, as the object never was phantom reachable in that case. – Holger Jun 24 '19 at 13:58
  • I ignore the phrase “neither strongly, softly, nor weakly reachable” because that phrase is (almost) repeated for Weak and Soft references, but we are talking about the differences between Weak/Soft and Phantom – gstackoverflow Jun 24 '19 at 13:58
  • Then, you will never understand the topic. You can not ignore this point, as already told you multiple times. An object whose `finalize()` method is executed is *strongly reachable* and can only become *phantom reachable* when it is *not* strongly reachable anymore. You may even create new soft or weak references in the `finalize()` method but an object can only become *phantom reachable* when none of them persists. – Holger Jun 24 '19 at 14:02
  • Regarding **resurrect anyRef**: I mean that the finalize method of the object saves a strong reference to this in another object reachable from GC roots – gstackoverflow Jun 24 '19 at 14:03
  • I already gave you the answer to that scenario. If a strong reference exists, the object can’t be phantom reachable. That’s how it has been defined and that’s how it works. You can not ignore that phrase from the definition. And that’s the answer to why it needs at least two gc cycles to prove that. The proof that no contradicting references exist after finalization, is not different to the proof that was made prior to finalization. – Holger Jun 24 '19 at 14:05
  • Looks like we are talking about different things. Let me explain. I have object A. For now I have 2 references to object A: strong and phantom. The finalize method of A resurrects A. In this case the PhantomReference **WON'T** be added to the ReferenceQueue after losing the strong reference. Another example is exactly the same, but instead of a PhantomReference I have a WeakReference. In this case the WeakReference will be added to the ReferenceQueue, because finalize method invocation and adding to the queue are 2 independent parallel operations. Why was it designed so? – gstackoverflow Jun 24 '19 at 14:15
  • I don’t get you. That’s the one difference between weak and phantom references. It’s why these two different reference types exist in the first place. The reason for all that mess is that in the very first Java version, the designers made the mistake of adding the concept of finalization. But this has nothing to do with the original question. The original question was why it takes more gc cycles and that has been answered. It is necessary to fulfill the contract. The relevant part of the contract has been named. – Holger Jun 24 '19 at 14:35
  • It was hard to explain what I mean in the comments so I updated the topic and provided detailed information about my question(s) – gstackoverflow Jun 24 '19 at 15:01

PhantomReferences will only be enqueued after any associated finalizer has finished execution. Note a finalizer can resurrect an object (used to good effect by Princeton's former Secure Internet Project).

Exact behaviour beyond the spec is not specified. Here be implementation dependent stuff.

So what seems to be happening? Once an object is weakly collectable, it is also finalisable. So the WeakReferences can be enqueued and the objects queued for finalisation in the same stop-the-world event. The finalisation thread(s) is (are) running in parallel with your ReferenceQueue thread (main). Hence you may see the first two lines of your output in either order, always (unless wildly delayed) followed by the third.

Only some time after your finalizer has exited is the PhantomReference enqueueable. Hence the gc count is strictly greater. The code looks like a reasonably fair race. Perhaps changing the millisecond timeouts would change things. Most things GC don't have exact guarantees.
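The ordering described here can be made explicit with a flag instead of reading interleaved output. This is a sketch only, assuming a HotSpot JVM (Java 9 or later) with finalization enabled; the class name `FinalizerTimingDemo`, the 300 ms sleep and the limit of 200 polls are made up for illustration. It records whether `finalize()` had already returned at the moment the phantom reference was taken from the queue.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class FinalizerTimingDemo {
    static volatile boolean finalizerDone;

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws InterruptedException {
        Thread.sleep(300);           // simulate a slow finalizer
        finalizerDone = true;        // last action before finalize() returns
    }

    // Returns {polls until weak, polls until phantom, 1 if finalize() was done by then else 0}.
    static int[] run() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object e = new FinalizerTimingDemo();
        Reference<Object> pRef = new PhantomReference<>(e, queue);
        Reference<Object> wRef = new WeakReference<>(e, queue);
        e = null;
        int[] result = {-1, -1, 0};
        try {
            for (int count = 0, collected = 0; collected < 2 && count < 200; ) {
                Reference<?> ref = queue.remove(100);
                if (ref == null) {
                    System.gc();
                    count++;
                } else if (ref == wRef) {
                    result[0] = count;
                    collected++;
                } else {
                    result[1] = count;
                    result[2] = finalizerDone ? 1 : 0;   // sampled when the phantom ref arrives
                    collected++;
                }
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        Reference.reachabilityFence(pRef);   // keep the phantom reference object alive
        return result;
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("weak after " + r[0] + " gc polls, phantom after " + r[1]
                + " gc polls, finalizer finished first: " + (r[2] == 1));
    }
}
```

Exact poll counts vary from run to run; the point of the sketch is only the relative order, which in practice follows from the finalizer thread holding a strong reference to the object while `finalize()` executes.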

Tom Hawtin - tackline
  • So PhantomReference can be enqueued only after finalize method execution in contrast to Weak/Soft references? Is this behaviour specified? If yes, where? – gstackoverflow Jun 21 '19 at 15:42
  • "Phantom reference objects, which are enqueued after the collector determines that their referents may otherwise be reclaimed. " Objects can't be reclaimed if a `finalizer` could use them. – Tom Hawtin - tackline Jun 21 '19 at 20:49
  • I am not sure about my English, so I may misunderstand something, but I can imagine the following order according to that phrase from the Javadoc: 1. The GC detected that the referent is only phantom reachable and started 2 actions in parallel: a) enqueuing into the ReferenceQueue b) finalize method invocation – gstackoverflow Jun 22 '19 at 22:48
  • @gstackoverflow The `finalizer` has access to `this` so the object is strongly reachable. In theory if the `finalizer` doesn't do something involving `happens-before` on `this`, the JVM, when it gets around to optimising the code hard, could reorder the reads and reclaim the object under it. In practice `finalizers` are not written to assume this may happen. The fun required to debug that happening during finalisation, after hard JITing and dependent on a race condition, does not bear thinking about. – Tom Hawtin - tackline Jun 23 '19 at 12:47
  • Anyway, I don't see a direct answer to the initial question. Why can a WeakReference be enqueued during finalize method invocation but a PhantomReference cannot? – gstackoverflow Jun 23 '19 at 19:54