To clarify what actually happens when you cast from one type to another, it may be helpful to mention some information about how instances of value types and reference types are stored in the CLR.
First of all, there are value types (`struct`s).
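- they are stored on the stack when used as local variables (strictly speaking this is an "implementation detail", and a struct that is a field of a class lives inside that object on the heap, but IMHO we can safely assume the stack for this discussion),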
- they don't support inheritance (no virtual methods),
- instances of value types contain only the values of their fields.
This means all methods and properties in a `struct` are basically static methods, with a reference to the struct (`this`) being passed implicitly as a parameter (again, there are one or two exceptions, like `ToString`, but they are mostly irrelevant here).
So, when you do this:
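struct SomeStruct
{
    public int Value;
    public void DoSomething()
    {
        Console.WriteLine(this.Value);
    }
}
// the struct must be definitely assigned before a method can be called on it
SomeStruct c = new SomeStruct(); // this is placed on the stack
c.DoSomething();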
It will be logically the same as having a `static` method and passing a reference to the `SomeStruct` instance (the reference part is important because it allows the method to mutate the struct contents by writing to that stack memory area directly, without the need to box it):
struct SomeStruct
{
    public int Value;
    public static void DoSomething(ref SomeStruct instance)
    {
        Console.WriteLine(instance.Value);
    }
}
// a variable passed by ref must be definitely assigned first
SomeStruct c = new SomeStruct(); // this is placed on the stack
SomeStruct.DoSomething(ref c); // this passes a pointer to the stack and jumps to the method call
If you call `DoSomething` on a struct, there is no different (overridden) method which might have to be invoked instead, so the compiler knows the actual function statically.
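As a rough sketch of the point about mutation, the following compares an instance method with a hand-written static method taking a `ref` parameter. The names `Counter`, `Increment`, `IncrementStatic` and `StructDemo` are made up for this illustration and are not from the question.

```csharp
using System;

struct Counter
{
    public int Value;

    // Instance method: `this` is implicitly passed by reference,
    // so the write goes straight to the caller's variable.
    public void Increment()
    {
        Value++;
    }

    // Hand-written equivalent with the reference made explicit.
    public static void IncrementStatic(ref Counter instance)
    {
        instance.Value++;
    }
}

static class StructDemo
{
    public static int Run()
    {
        Counter c = new Counter();      // Value starts at 0
        c.Increment();                  // mutates c in place
        Counter.IncrementStatic(ref c); // same effect, explicit ref
        return c.Value;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints 2
    }
}
```

Both calls modify the same variable `c`, with no copy and no boxing involved.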
Reference types (`class`es) are a bit more complex.
- instances of reference types are stored on the heap, and all variables or fields of a certain reference type merely hold a reference to the object on the heap. Assigning one variable to another, as well as casting, simply copies the reference around, leaving the instance unchanged,
- they support inheritance (virtual methods),
- instances of reference types contain the values of their fields, plus some additional baggage related to GC, synchronization, AppDomain identity and type information.
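To illustrate the difference in copy behaviour, here is a small sketch. The types `Box` and `Pair` and the class `CopyDemo` are hypothetical names used only for this example.

```csharp
using System;

class Box { public int Value; }   // reference type
struct Pair { public int Value; } // value type

static class CopyDemo
{
    public static int ThroughClass()
    {
        Box b1 = new Box { Value = 1 };
        Box b2 = b1;     // copies the reference, not the object
        b2.Value = 42;   // writes to the one shared instance
        return b1.Value;
    }

    public static int ThroughStruct()
    {
        Pair p1 = new Pair { Value = 1 };
        Pair p2 = p1;    // copies the field values
        p2.Value = 42;   // only the copy changes
        return p1.Value;
    }

    static void Main()
    {
        Console.WriteLine(ThroughClass());  // prints 42
        Console.WriteLine(ThroughStruct()); // prints 1
    }
}
```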
If a class method is non-virtual, then it basically behaves like a `struct` method: it's known at compile time and it's not going to change, so the compiler can emit a direct function call, passing the object reference just like it did with a struct.
So, what happens when you cast to a different type? As far as the memory layout is concerned, nothing much.
If you have your object defined like you mentioned:
public class Base
{
    public int a;
}
public class Inh : Base
{
    public int b;
}
And you instantiate an `Inh`, and then cast it to a `Base`:
Inh i1 = new Inh() { a = 2, b = 5 };
Base b2 = i1;
The heap memory will contain a single object instance (at, say, address `0x20000000`):
// simplified memory layout of an `Inh` instance
[0x20000000]: Some synchronization stuff
[0x20000004]: Pointer to RTTI (runtime type info) for Inh
[0x20000008]: Int32 field (a = 2)
[0x2000000C]: Int32 field (b = 5)
Now, all variables of a reference type point to the location of the RTTI pointer (the actual object's memory area starts 4 bytes earlier, but that's not so important).
Both `i1` and `b2` contain a single pointer (`0x20000004` in this example), and the only difference is that the compiler will allow a `Base` variable to reference only the first field in that memory area (the `a` field), with no way to go further through the instance.
For the `Inh` variable `i1`, that same field is located at exactly the same offset, but the variable also has access to the next field, `b`, located 4 bytes after the first one (at an 8-byte offset from the RTTI pointer).
So if you write this:
Console.WriteLine(i1.a);
Console.WriteLine(b2.a);
The compiled code will be the same in both cases (simplified, no type checks, just addressing):
For `i1`:
a. Get the address stored in `i1` (`0x20000004`)
b. Add an offset of 4 bytes to get the address of `a` (`0x20000008`)
c. Fetch the value at that address (`2`)
For `b2`:
a. Get the address stored in `b2` (`0x20000004`)
b. Add an offset of 4 bytes to get the address of `a` (`0x20000008`)
c. Fetch the value at that address (`2`)
So, the one and only instance of `Inh` is in memory, unmodified, and by doing a cast you are simply telling the compiler how to interpret the data found at that memory location. Unlike plain C, C# will fail at runtime (with an `InvalidCastException`) if you try to cast an object to a type which is not in its inheritance hierarchy, whereas a plain C program would happily return whatever is at the known fixed offset of a certain field in your instance. The only difference is that C# checks whether what you are doing makes sense; otherwise, the type of the variable only determines which parts of the same object instance you are allowed to access.
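A minimal sketch of both claims, using the `Base` and `Inh` classes from above. The extra class `Other` and the helper class `CastDemo` are assumptions added only for this demonstration.

```csharp
using System;

public class Base { public int a; }
public class Inh : Base { public int b; }
public class Other : Base { } // hypothetical sibling type, not in the question

static class CastDemo
{
    // Upcasting leaves a single instance; a write through b2 is seen through i1.
    public static bool SameInstance()
    {
        Inh i1 = new Inh { a = 2, b = 5 };
        Base b2 = i1;
        b2.a = 7;
        return object.ReferenceEquals(i1, b2) && i1.a == 7;
    }

    // Downcasting to a type the object does not have is rejected at runtime.
    public static bool BadDowncastThrows()
    {
        Base notInh = new Other();
        try
        {
            Inh bad = (Inh)notInh; // runtime type check fails here
            return false;          // not reached
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(SameInstance());      // prints True
        Console.WriteLine(BadDowncastThrows()); // prints True
    }
}
```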
You can even cast it to an `Object`:
Object o1 = i1; // <-- this still points to `0x20000004`
// Hm. Ok, that worked, but now what?
Again, the instance in memory is unmodified, but there is not much you can do with a variable of type `Object`, except downcast it again.
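As an illustration, downcasting from `Object` gives back the very same instance, and the runtime still knows the actual type. The helper class `ObjectDemo` is a made-up name for this sketch.

```csharp
using System;

public class Base { public int a; }
public class Inh : Base { public int b; }

static class ObjectDemo
{
    // Upcast to object, then downcast again: still the same instance.
    public static bool RoundTrip()
    {
        Inh i1 = new Inh { a = 2, b = 5 };
        object o1 = i1;
        Inh back = (Inh)o1;
        return object.ReferenceEquals(i1, back) && back.b == 5;
    }

    // The static type is object, but the runtime type info still says Inh.
    public static string RuntimeTypeName()
    {
        object o1 = new Inh();
        return o1.GetType().Name;
    }

    static void Main()
    {
        Console.WriteLine(RoundTrip());       // prints True
        Console.WriteLine(RuntimeTypeName()); // prints Inh
    }
}
```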
Virtual methods are even more interesting, because they involve the compiled code going through the mentioned RTTI pointer to get to the virtual method table for that type (which is what allows a type to override methods of a base type). This again means that the compiler will simply use a fixed offset for a particular method, but the table of the actual derived type will hold the appropriate method implementation at that location.
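The observable effect can be sketched as follows. The methods `Describe` and `Name` and the class `VirtualDemo` are hypothetical additions to the `Base` and `Inh` classes, made up for this example.

```csharp
using System;

public class Base
{
    public virtual string Describe() { return "Base.Describe"; }
    public string Name() { return "Base.Name"; } // non-virtual
}

public class Inh : Base
{
    public override string Describe() { return "Inh.Describe"; }
    public new string Name() { return "Inh.Name"; } // hides, does not override
}

static class VirtualDemo
{
    // Virtual call: resolved through the actual object's type at runtime.
    public static string CallVirtual()
    {
        Base b2 = new Inh();
        return b2.Describe();
    }

    // Non-virtual call: bound to the variable's static type at compile time.
    public static string CallNonVirtual()
    {
        Base b2 = new Inh();
        return b2.Name();
    }

    static void Main()
    {
        Console.WriteLine(CallVirtual());    // prints Inh.Describe
        Console.WriteLine(CallNonVirtual()); // prints Base.Name
    }
}
```

The variable's type is `Base` in both cases; only the virtual call ends up in the derived implementation.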