Given that the C# GC can move memory around, how can ref returns even be implemented? Would the code below cause undefined behaviour?

public struct Record
{
    public int Hash;
    public VeryLargeStruct Data;
}

public class SomeClass
{
    private Record[] _records = new Record[16];
    public ref VeryLargeStruct GetDataAt(int index) =>
                    ref _records[index].Data;
}

I would assume that if the memory backing the _records array were moved, it would invalidate local references such as:

ref var data = ref someClassInstance.GetDataAt(0);
AbsZero

1 Answer

When GetDataAt returns by ref, what is actually being used is a so-called managed pointer. Managed pointers can point inside objects, such as a field of a struct stored inside an array element, as in your case. That's why they are also called interior pointers.
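
To make the aliasing concrete, here is a minimal sketch built on the types from the question. The body of VeryLargeStruct, the ReadA helper, and the AliasDemo class are assumptions added for illustration only; they are not part of the original code.

```csharp
using System;

// Stand-in payload; the question does not show the real VeryLargeStruct.
public struct VeryLargeStruct
{
    public long A, B, C, D;
}

public struct Record
{
    public int Hash;
    public VeryLargeStruct Data;
}

public class SomeClass
{
    private readonly Record[] _records = new Record[16];

    // Returns a managed (interior) pointer into the array element.
    public ref VeryLargeStruct GetDataAt(int index) => ref _records[index].Data;

    // Hypothetical helper so the demo can read the stored value back.
    public long ReadA(int index) => _records[index].Data.A;
}

public static class AliasDemo
{
    // Writing through a ref local modifies the element inside the array.
    public static long WriteThroughRef()
    {
        var instance = new SomeClass();
        ref VeryLargeStruct data = ref instance.GetDataAt(0);
        data.A = 42;
        return instance.ReadA(0);
    }

    // Without 'ref' on both sides, the struct is copied and the array is untouched.
    public static long WriteThroughCopy()
    {
        var instance = new SomeClass();
        VeryLargeStruct copy = instance.GetDataAt(0);
        copy.A = 42;
        return instance.ReadA(0);
    }
}
```

WriteThroughRef sees the write in the array because the ref local is an alias for the element, while WriteThroughCopy only changes a local copy on the stack.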

The GC is able to handle them properly while marking and relocating. In other words:

  • during the Mark phase, such an interior pointer is recognized as a root of the object it points into, so your _records array won't be treated as unreachable. The GC basically scans the surrounding memory region to find the object that contains the address held by the interior pointer.
  • during the Relocate phase (if it happens), such an interior pointer is updated so that it continues to point to the same place in the same object after the object has been moved.
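
The two points above can be observed from user code with a small sketch like the following. It assumes .NET 5 or later (for the built-in System.Runtime.CompilerServices.Unsafe class); the Item struct and RelocationDemo class are hypothetical names used only for this illustration. Whether the array is really moved by this particular collection is up to the runtime and is not guaranteed, but the ref local has to stay valid either way.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical element type, small enough to keep the array on the small object heap.
public struct Item
{
    public int Hash;
    public long Payload;
}

public static class RelocationDemo
{
    public static bool RefSurvivesCompaction()
    {
        // Allocate short-lived neighbours first, so that a compacting GC
        // has gaps to close and a reason to slide the array down.
        var garbage = new byte[1000][];
        for (int i = 0; i < garbage.Length; i++)
        {
            garbage[i] = new byte[128];
        }

        var items = new Item[16];
        garbage = null; // the neighbours are now unreachable

        // Interior pointer held in a local: it points into the middle of 'items'.
        ref long payload = ref items[3].Payload;

        // Request a blocking, compacting collection of all generations.
        GC.Collect(2, GCCollectionMode.Forced, blocking: true, compacting: true);

        // The ref is still usable: the write must land in the same element,
        // and both refs must still denote the same location.
        payload = 123;
        return items[3].Payload == 123
            && Unsafe.AreSame(ref payload, ref items[3].Payload);
    }
}
```

If the array was relocated, the GC rewrote the address held in payload along with the items reference; if it was not, nothing needed updating. In both cases the method returns true.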

In the current implementation, all of this relies on the bricks and plug trees mechanism. If you are interested, I refer you to my own article about it.

Konrad Kokosa
  • Thank you! Especially for the link. I was just reading an article by Vladimir Sadov on the topic, which does answer the question somewhat, but does not actually reveal any implementation details. – AbsZero Jul 24 '19 at 16:16
  • I also recall having read that interior pointers are more costly for the GC to manage, which is one of the reasons you're not allowed to store such values in a data structure, and are only allowed to store them in local variables and pass them to methods, i.e. up and down the call stack. This limitation manages the cost by reducing the number of such live pointers. – Lasse V. Karlsen Jul 24 '19 at 16:22
  • Yes, they have non-trivial overhead during the Mark phase, so having too many of them on the heap could have an impact. – Konrad Kokosa Jul 24 '19 at 16:31