I know that in both CMS and G1 the heap is divided into Eden, Survivor spaces and the Old Generation. The only difference is that in CMS this division is physical (the spaces are contiguous and located in different parts of memory), while in G1 it is logical (the heap is split into roughly 2000 dynamically assigned regions of 1 to 32 MB each).

In both cases, when the Eden space fills up to its threshold, Young evacuation starts, and the steps are as follows:

1. Initial marking (STW). The roots of the application are marked.
2. Concurrent marking. Starting from the roots, references are followed from object to object, and every object reached this way is marked live.
3. Remark (STW). Objects created during concurrent marking are also marked live (floating garbage).
4. Cleanup. This final phase prepares the ground for the upcoming evacuation phase, counting all the live objects in the heap regions and sorting these regions by expected GC efficiency.

If heap occupancy reaches the threshold, mixed evacuation (Young + Old) starts in G1. Finding dead objects in the Old Generation is based on 'remembered sets'. Each region has a remembered set that lists the references pointing into this region from outside. These references are treated as additional GC roots.

After that, G1 decides which regions from the old and young generations to add to the collection set.

Why do we need remembered sets and a separate search for unreachable objects in the Old Generation if we already have the entire object graph and know all live and dead objects after marking them during Young evacuation?

user207421

1 Answer

The idea behind the Remembered Set (or the Card Table in CMS) is not to "search for unreachable objects in Old Generation", but to quickly identify the references in the Old Generation that need to be updated when live objects are moved during a Young collection.

When an object is evacuated, its address changes, so all incoming references to this object must be updated. Without Remembered Sets, finding all incoming references would require scanning the entire heap (which, of course, may take too long).
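To make this concrete, here is a minimal toy sketch in Java. It is not how HotSpot implements remembered sets; the names `heap`, `Slot`, `writeField` and `evacuate` are all hypothetical, addresses are plain integers, and the "write barrier" is just a method call. It only illustrates that after moving one young object, the collector visits the recorded slots rather than every object in the old generation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RememberedSetSketch {
    // Toy heap (assumption): address -> object, where an object is an array
    // of fields that hold addresses of other objects (-1 stands for null).
    static final Map<Integer, int[]> heap = new HashMap<>();

    // A slot is one field of one object, i.e. a place where a reference lives.
    static final class Slot {
        final int holderAddress;
        final int fieldIndex;
        Slot(int holderAddress, int fieldIndex) {
            this.holderAddress = holderAddress;
            this.fieldIndex = fieldIndex;
        }
    }

    // Remembered set of the young area: slots in old objects that point into it.
    static final List<Slot> rememberedSet = new ArrayList<>();

    // Stand-in for a write barrier: store the reference and, if it goes
    // from old to young, record the slot in the remembered set.
    static void writeField(int holder, int field, int target,
                           boolean holderIsOld, boolean targetIsYoung) {
        heap.get(holder)[field] = target;
        if (holderIsOld && targetIsYoung) {
            rememberedSet.add(new Slot(holder, field));
        }
    }

    // Move the object from one address to another, then fix incoming
    // references by visiting only the remembered slots.
    // Returns how many slots had to be visited.
    static int evacuate(int from, int to) {
        heap.put(to, heap.remove(from));
        int visited = 0;
        for (Slot s : rememberedSet) {
            visited++;
            int[] fields = heap.get(s.holderAddress);
            if (fields[s.fieldIndex] == from) {
                fields[s.fieldIndex] = to;
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // 1000 "old" objects at addresses 0..999, one "young" object at 5000.
        for (int a = 0; a < 1000; a++) {
            heap.put(a, new int[] {-1});
        }
        heap.put(5000, new int[] {-1});

        writeField(42, 0, 5000, true, true);   // old object 42 -> young object
        int visited = evacuate(5000, 9000);    // young object moves to 9000

        // prints "slots visited: 1, field now: 9000"
        System.out.println("slots visited: " + visited
                + ", field now: " + heap.get(42)[0]);
    }
}
```

In this sketch only one slot is visited even though the toy old generation holds 1000 objects; without the recorded slot, all 1000 would have to be checked for a reference to the moved object.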

apangin
  • Before operating on objects from the Young generation, the GC scans the entire heap to find unreachable objects, so references can be updated using the result of that scan, am I right? –  Nov 29 '17 at 12:32
  • @PavelPavel No, the Mark phase is required to find **reachable** objects, not unreachable ones. Furthermore, a Young GC does not need to walk through the whole heap, only the root set plus the regions being collected. – apangin Nov 29 '17 at 13:23
  • But if we know the reachable objects, then we also know the unreachable ones, don't we? –  Nov 29 '17 at 18:12
  • @PavelPavel Not really. By definition, we cannot reach unreachable objects by traversing the object graph. A full heap scan is required instead. – apangin Nov 30 '17 at 00:41
  • But I assumed that the necessary information about objects is kept in Metaspace, and that after scanning we just subtract one set from the other. –  Nov 30 '17 at 13:54
  • @PavelPavel Metaspace is a completely different concept; it has nothing to do with the object graph, heap scanning, etc. – apangin Nov 30 '17 at 18:17
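
The point made in the comments about reachable versus unreachable objects can be illustrated with a small Java sketch. This is a toy model, not JVM code: the object graph is a hypothetical `Map` from object id to outgoing references, and `mark` and `sweep` are made-up names. Marking only ever touches objects reachable from the roots; producing the list of unreachable objects requires enumerating every object in the heap.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ReachabilitySketch {
    // Traverse the object graph from the roots and return every object reached.
    // Objects that nothing points to are never visited here.
    static Set<Integer> mark(Map<Integer, List<Integer>> graph, List<Integer> roots) {
        Set<Integer> marked = new HashSet<>();
        Deque<Integer> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            int obj = stack.pop();
            if (marked.add(obj)) {
                for (int ref : graph.getOrDefault(obj, List.of())) {
                    stack.push(ref);
                }
            }
        }
        return marked;
    }

    // To name the unreachable objects, every object in the heap must be
    // enumerated (graph.keySet()), which is the "full heap scan".
    static Set<Integer> sweep(Map<Integer, List<Integer>> graph, Set<Integer> marked) {
        Set<Integer> dead = new TreeSet<>(graph.keySet());
        dead.removeAll(marked);
        return dead;
    }

    public static void main(String[] args) {
        // 1 -> 2 is reachable from the root; 3 <-> 4 is an unreachable cycle.
        Map<Integer, List<Integer>> graph = Map.of(
                1, List.of(2),
                2, List.of(),
                3, List.of(4),
                4, List.of(3));

        Set<Integer> marked = mark(graph, List.of(1));
        System.out.println("reachable:   " + new TreeSet<>(marked));   // [1, 2]
        System.out.println("unreachable: " + sweep(graph, marked));    // [3, 4]
    }
}
```

Note that `mark` never sees objects 3 and 4; they only show up once `sweep` walks the complete set of objects.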