I am reading about the GC, Chapter 21 in Real World OCaml, and have a few questions about the minor heap.


So it says:

The minor heap is a contiguous chunk of virtual memory that is usually a few megabytes in size so that it can be scanned quickly.

The runtime stores the boundaries of the minor heap in two pointers that delimit the start and end of the heap region (caml_young_start and caml_young_end, but we will drop the caml_young prefix for brevity). The base is the memory address returned by the system malloc, and start is aligned against the next nearest word boundary from base to make it easier to store OCaml values.

In a fresh minor heap, the limit equals the start, and the current ptr will equal the end. ptr decreases as blocks are allocated until it reaches limit, at which point a minor garbage collection is triggered.

You may wonder why limit is required at all, since it always seems to equal start. It's because the easiest way for the runtime to schedule a minor heap collection is by setting limit to equal end. The next allocation will never have enough space after this is done and will always trigger a garbage collection. There are various internal reasons for such early collections, such as handling pending UNIX signals, and they don't ordinarily matter for application code.


Q1: The relationship between minor heap and base

So, basically, the runtime allocates a chunk of memory (say, 8 MB) for the minor heap using the system malloc?

Then the address returned by malloc is base?

Q2: The relationship between base and start

What does it mean that start is aligned against the next nearest word boundary from base?

Does "aligned" here mean that start is moved forward to the next memory address that is 0 mod 4?

I thought malloc forces alignment anyway, i.e., it always returns addresses that are 0 mod 4.

Q3: Why do we need start when we have limit?

The text explains that limit can be used to schedule a GC.

But then what is start used for?

Jackson Tale

1 Answer

You can find the code you're asking about in minor_gc.c. (I'm not an OCaml GC whiz--I had to look around for the code.)

Here's a stripped down version:

```c
void caml_set_minor_heap_size (asize_t size)
{
  char *new_heap;
  void *new_heap_base;

  if (caml_young_ptr != caml_young_end) caml_minor_collection ();
  new_heap = caml_aligned_malloc(size, 0, &new_heap_base);
  if (new_heap == NULL) caml_raise_out_of_memory();
  if (caml_page_table_add(In_young, new_heap, new_heap + size) != 0)
    caml_raise_out_of_memory();

  if (caml_young_start != NULL){
    caml_page_table_remove(In_young, caml_young_start, caml_young_end);
    free (caml_young_base);
  }
  caml_young_base = new_heap_base;
  caml_young_start = new_heap;
  caml_young_end = new_heap + size;
  caml_young_limit = caml_young_start;
  caml_young_ptr = caml_young_end;
  caml_minor_heap_size = size;

  reset_table (&caml_ref_table);
  reset_table (&caml_weak_ref_table);
}
```

The base is the actual beginning address of the block returned by malloc. You need this later when you want to free the block (which you need to do if you want to change the size of the minor heap). As far as I can understand, the base has no other use.

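The start is an aligned version of base. The code I'm looking at (OCaml 4.01.0) aligns to a page boundary (4 KB), so start is an address at or after base that is a multiple of 4096. malloc itself only guarantees alignment suitable for primitive data types (typically 8 or 16 bytes), which is why the extra step is needed.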
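
To make the base/start distinction concrete, here is a minimal sketch of page alignment layered on top of malloc. It is not the runtime's caml_aligned_malloc, whose details differ; the helper name aligned_malloc_sketch and the PAGE_SIZE constant are made up for this example, and the 4096 value is an assumption taken from the 4 KB figure above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE 4096u  /* assumed page size (the 4 KB mentioned above) */

/* Hypothetical helper, not the OCaml runtime's function.
   Returns a page-aligned pointer to at least `size` usable bytes and
   stores the raw malloc result in *base so the caller can free() it. */
static char *aligned_malloc_sketch(size_t size, void **base)
{
    /* Refuse sizes that would overflow once the padding is added. */
    if (size > SIZE_MAX - PAGE_SIZE) return NULL;

    /* Over-allocate by one page so an aligned address always fits. */
    char *raw = malloc(size + PAGE_SIZE);
    if (raw == NULL) return NULL;
    *base = raw;

    /* Round the address up to the next multiple of PAGE_SIZE.
       The mask trick works because PAGE_SIZE is a power of two. */
    uintptr_t addr = (uintptr_t)raw;
    uintptr_t aligned = (addr + PAGE_SIZE - 1) & ~(uintptr_t)(PAGE_SIZE - 1);
    return (char *)aligned;
}
```

The caller uses the returned pointer as start and keeps *base only so it can pass it to free() later, which mirrors how caml_young_base is used in the function above.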

You need start for resetting limit after you've modified it artificially as described in the last paragraph of your extract.
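
Here is a toy model of that interplay, as I understand it from the book extract. It is only a sketch: the struct and the functions heap_init, request_collection, collect and heap_alloc are hypothetical names for illustration, and the "collection" simply empties the heap instead of promoting live values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the minor heap state; field names mirror
   the runtime's variables without the caml_young_ prefix. */
struct minor_heap {
    char *start;      /* lowest usable address */
    char *end;        /* one past the highest usable address */
    char *limit;      /* allocation fails once ptr would drop below this */
    char *ptr;        /* most recent allocation, moves downward */
    int collections;  /* counter kept for illustration only */
};

static void heap_init(struct minor_heap *h, char *mem, size_t size)
{
    h->start = mem;
    h->end = mem + size;
    h->limit = h->start;  /* fresh heap: limit equals start */
    h->ptr = h->end;      /* fresh heap: ptr equals end */
    h->collections = 0;
}

/* Schedule a collection: with limit == end no allocation can fit. */
static void request_collection(struct minor_heap *h)
{
    h->limit = h->end;
}

/* Stand-in for a minor collection: empty the heap, then restore
   limit from start. This is the step that needs start. */
static void collect(struct minor_heap *h)
{
    h->ptr = h->end;
    h->limit = h->start;
    h->collections++;
}

/* Bump allocation from end toward limit; assumes bytes > 0. */
static void *heap_alloc(struct minor_heap *h, size_t bytes)
{
    if (h->ptr < h->limit || (size_t)(h->ptr - h->limit) < bytes) {
        collect(h);
        if ((size_t)(h->ptr - h->limit) < bytes)
            return NULL;  /* request larger than the whole heap */
    }
    h->ptr -= bytes;
    return h->ptr;
}
```

Once request_collection has overwritten limit, its original value is gone, so a second, untouched copy of the lower boundary (start) is the only way to put it back.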

Jeffrey Scofield
  • `aligns to a page boundary (4 KB). Malloc only aligns for primitive data types (8 bytes or so).` Do you mean that OCaml 4.01.0 aligns memory to 4 KB, i.e. `addr mod (4*1024) = 0`? – Jackson Tale Jun 16 '14 at 12:00
  • Yes, 4 KB alignment means that the address mod 4096 = 0. Or you can say that the address is a multiple of 4096. – Jeffrey Scofield Jun 16 '14 at 14:04