
I'm trying to understand a few things about the G1 garbage collector and hope you guys can help me.

  1. What exactly is the role of the whole Concurrent Marking phase in G1? I mean all of its parts (initial marking, root region scan, ..., cleanup).

As far as I understand, its role is to mark all live objects reachable from the root regions (which, in the concurrent marking phase, are the survivor regions selected in the 'initial mark' part) and to estimate liveness for the old regions (based on which they will be selected into the collection set for a mixed evacuation pause). Am I right?

  2. Which objects are marked as garbage in old regions during a mixed evacuation pause?

If I understand correctly, a mixed evacuation pause marks and removes objects that are unreachable from GC roots and remembered sets. That is a different set of objects from the one concurrent marking marked, so the two sets may overlap but don't have to. Am I right?

  3. What exactly are GC roots? Are they the same for young and mixed collections (except for references from remembered sets in a mixed collection)?

  4. Is my summary below correct?

A fully young collection marks all live objects (that is, objects reachable from GC roots) in eden and survivor regions and evacuates (copies) them to new survivor (or old) regions. The remaining objects are considered garbage and removed. The freed regions are reclaimed.

At its beginning, the concurrent marking phase marks all survivor regions that may hold references into old regions (based on remembered sets?). Survivor regions are treated as GC roots here because this part takes place during a fully young collection, so we are sure that all objects in them are live. Then the GC walks the object graph (starting from the objects in the selected survivor regions) and marks all live objects.

In a mixed evacuation pause, some old regions are selected into the collection set based on the liveness statistics (computed in the previous phase). Then all live objects from the selected regions are evacuated (copied) to new regions (live means: a. marked during concurrent marking, and b. as in a standard evacuation pause, reachable from GC roots and remembered sets). The remaining objects are removed as garbage and the regions are reclaimed.
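
The selection step described in that last paragraph can be pictured with a toy model. This is only a sketch of the idea, not G1's actual policy: the `Region` class, the `maxLiveRatio` cutoff, and the `maxRegions` cap are all hypothetical names and values made up for illustration, and the real collector also weighs things such as pause-time goals that are not modelled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class CollectionSetSketch {
    /** Hypothetical old region with liveness data gathered by marking. */
    static final class Region {
        final String id;
        final int liveBytes;
        final int capacityBytes;

        Region(String id, int liveBytes, int capacityBytes) {
            this.id = id;
            this.liveBytes = liveBytes;
            this.capacityBytes = capacityBytes;
        }

        double liveRatio() {
            return (double) liveBytes / capacityBytes;
        }
    }

    /**
     * Picks up to maxRegions old regions whose live ratio is at most
     * maxLiveRatio, emptiest first (assumed policy, for illustration only).
     */
    static List<String> selectCandidates(List<Region> oldRegions,
                                         double maxLiveRatio,
                                         int maxRegions) {
        List<Region> candidates = new ArrayList<>();
        for (Region r : oldRegions) {
            if (r.liveRatio() <= maxLiveRatio) {
                candidates.add(r);
            }
        }
        // Regions with the least live data are the cheapest to evacuate.
        candidates.sort(Comparator.comparingDouble(Region::liveRatio));

        List<String> ids = new ArrayList<>();
        for (Region r : candidates) {
            if (ids.size() == maxRegions) {
                break;
            }
            ids.add(r.id);
        }
        return ids;
    }

    static List<String> demo() {
        List<Region> old = Arrays.asList(
                new Region("r1", 900, 1000),   // mostly live: skipped
                new Region("r2", 100, 1000),
                new Region("r3", 500, 1000),
                new Region("r4", 300, 1000));
        return selectCandidates(old, 0.85, 2);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [r2, r4]
    }
}
```

In this model the live objects of `r2` and `r4` would be the ones copied out during the pause, after which both regions could be reclaimed whole.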

Nikita Tkachenko
BartekN

1 Answer

  1. What exactly is the role of the whole Concurrent Marking phase in G1? I mean all of its parts (initial marking, root region scan, ..., cleanup).

The role of the marking phase is to start from the roots, traverse everything reachable from them, and mark those objects as live.

"Concurrent" means that most of this work runs alongside the application threads rather than inside a stop-the-world pause.


  2. Which objects are marked as garbage in old regions during a mixed evacuation pause?

None.

The goal is to mark only live objects. Anything not marked live is automatically (and correctly) assumed dead.
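
To make "mark the live ones, everything else is dead" concrete, here is a minimal sketch of tracing from roots over a toy object graph. It is not how G1 is implemented (no regions, no bitmaps, no concurrency); the `Node` class and `markLive` method are hypothetical names used only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class MarkSketch {
    /** Hypothetical stand-in for a heap object: a name plus outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Returns the names of all nodes reachable from the given roots. */
    static Set<String> markLive(Collection<Node> roots) {
        // Identity-based "mark bits": a node is marked once it is in this set.
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (marked.add(n)) {          // first visit: mark and follow references
                for (Node r : n.refs) {
                    stack.push(r);
                }
            }
        }
        Set<String> names = new TreeSet<>();
        for (Node n : marked) {
            names.add(n.name);
        }
        return names;
    }

    static Set<String> demo() {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        Node d = new Node("d");
        a.refs.add(b);
        b.refs.add(a);   // cycle between live objects
        c.refs.add(d);
        d.refs.add(c);   // cycle that no root reaches
        c.refs.add(a);   // garbage pointing at a live object is still garbage
        return markLive(Collections.singletonList(a));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a, b]
    }
}
```

Nothing ever marks `c` or `d` as dead; they are simply never reached, which is the point made above.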


  3. What exactly are GC roots? Are they the same for young and mixed collections (except for references from remembered sets in a mixed collection)?

An article linked from another SO question lists the following four types of GC roots:

  1. Local variables;
  2. Active threads;
  3. Static variables; and,
  4. JNI references.

The definition of a root is not tied to the collection set. A root is a root in any GC generation or collection.
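
As a rough illustration of two of the root kinds in that list, the sketch below keeps one object reachable through a local variable and another through a static field, then checks via `WeakReference` that neither is cleared. The class and field names are hypothetical, `System.gc()` is only a hint to the JVM, and whether a given JVM treats a static field literally as a root is implementation-dependent; the only thing relied on here is that strongly reachable objects are never collected.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

public class RootsSketch {
    // Static field: keeps its referent strongly reachable while the class is loaded.
    static Object staticRoot;

    /** Returns {localStillAlive, staticStillAlive}. */
    static boolean[] demo() {
        Object local = new Object();     // held by a local variable on this thread's stack
        staticRoot = new Object();

        WeakReference<Object> viaLocal = new WeakReference<>(local);
        WeakReference<Object> viaStatic = new WeakReference<>(staticRoot);

        System.gc();                     // a request only; the JVM may ignore it

        boolean localAlive = viaLocal.get() != null;
        boolean staticAlive = viaStatic.get() != null;

        // Make sure 'local' counts as in use up to this point (Java 9+).
        Reference.reachabilityFence(local);
        return new boolean[] { localAlive, staticAlive };
    }

    public static void main(String[] args) {
        boolean[] alive = demo();
        System.out.println(alive[0] + " " + alive[1]); // prints true true
    }
}
```

The reverse experiment (dropping the strong references and expecting the weak ones to be cleared) is deliberately left out, because the timing of collection is not guaranteed.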


  4. Is my summary below correct?

The toughest question of the four. Allow me to skip this one. :)

displayName
  • I think the listed GC roots are a bit inaccurate. With regard to point 1: whatever happens to be on live threads' stacks (be it local variables or method parameters) is considered a GC root. Point 3 is only true for static fields of classes loaded by the system classloader (as the system CL is a GC root, and as a consequence all the classes it loads are as well). – Nikita Tkachenko Apr 19 '20 at 19:06
  • @NikitaTkachenko: It is surprisingly difficult to find an authoritative answer about what concrete objects are GC roots. I spent a good ~15 minutes searching for something that I thought would be so basic. However, the best answer I found is in the book "Java Performance, 2nd edition"... The answer doesn't list the roots, but tells what they semantically are. And that is... (1/2) – displayName Apr 19 '20 at 23:21
  • _"objects that are accessible from outside the heap. That primarily includes thread stacks and system classes. Those objects are always reachable, so then the GC algorithm scans all objects that are reachable via one of the root objects. Objects that are reachable via a GC root are live objects; the remaining unreachable objects are garbage (even if they maintain references to live objects or to each other)."_ (2/2) – displayName Apr 19 '20 at 23:21
  • 3
    @NikitaTkachenko this is implementation dependent. In principle, a garbage collector could be implemented by considering `static` variables only when encountering their declaring class, so there is no need to consider any static variable as GC root. However, some implementations take shortcuts, either for bootstrap loader, extension/platform loader and application aka system loader only, as they never disappear, *or* for all static variables if they don’t support class unloading. Regarding point 1, method parameters *are* local variables, the same applies to `this` on the bytecode level. – Holger Apr 20 '20 at 12:17