Efficiency of load-value instructions versus load-address instructions for fields of structs

Question

Consider the following C# struct definitions:

public struct A
{
    public B B;
}

public struct B
{
    public int C;
}

Also consider the following static method:

public static int Method(A a) => a.B.C;

Calling this method will result in a copy of the struct type A. For example, in the following code:

A a = default;
Method(a);

the call to Method will compile to IL that looks something like this:

IL_0008: ldloc.0      // V_0
IL_0009: call         int32 Class::Method(valuetype A)

ldloc copies the value of local variable a (V_0) onto the evaluation stack, and that value is used in Method. If A (or B) were a large struct, this copy could presumably be expensive. The IL for Method also uses load-value instructions:

IL_0000: ldarg.0      // a
IL_0001: ldfld        valuetype B A::B
IL_0006: ldfld        int32 B::C
IL_000b: ret

Recent versions of C# include features that can help make working with structs more efficient. C# 7.2 introduced the in modifier on parameters, which passes a value type by read-only reference: the compiler enforces that the called method does not modify the argument. For example, applying the in modifier to parameter a:

public static int Method(in A a) => a.B.C;

will result in the following compiled IL at the call site:

IL_0008: ldloca.s     a
IL_000a: call         int32 Class::Method(valuetype A&)

and in the implementation of Method:

IL_0000: ldarg.0      // a
IL_0001: ldflda       valuetype B A::B
IL_0006: ldfld        int32 B::C
IL_000b: ret

Note the load-address instructions. My assumption (please correct me if I am wrong) is that for deep field reads (such as reading C that's inside of B that's inside of A), load-address instructions are more efficient than load-value instructions.

With that in mind, consider changing the example code:

A a = default;
var c = a.B.C;

The second line then compiles to:

IL_0008: ldloc.1      // V_1
IL_0009: ldfld        valuetype B A::B
IL_000e: ldfld        int32 B::C
IL_0013: stloc.0      // c

Why wouldn't the compiler prefer to use load-address instructions in this case too? Is there an efficiency difference simply because a is a local variable versus a method parameter, or is there something else I'm missing here?

score 1 · Accepted Answer · edited Jun 20 '20 at 09:12

It's definitely not related to a being a local variable versus a method argument. Not from an efficiency point of view, at least.

The first thing to understand is that structs in C# are stored inline, directly where they are declared, so a struct local variable typically lives directly on the stack. More importantly, nested structs behave the same way. At run time the JIT can know exactly where B is inside of A, where C is inside of B, and therefore where B.C lies inside of a. (This is not always known at compile time; read more about StructLayoutAttribute.)

When you look at the machine code the JIT produces for the method, you'll see that no matter where you write a.B.C, it is always a direct read from memory at a fixed offset from where a is stored. (It's important to compile in Release, because Debug builds are not optimized the same way. Also make sure the compiler doesn't optimize the variables away.)
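To look at the JIT output for both calling conventions from the question, a small harness along these lines may help. It is only a sketch: the class name JitHarness and the method names are made up for illustration, and the NoInlining attribute is used on the assumption that you want each method to survive as a separate body that a disassembly view can show.

```csharp
using System;
using System.Runtime.CompilerServices;

public struct B { public int C; }
public struct A { public B B; }

public static class JitHarness // hypothetical name, not from the question
{
    // NoInlining keeps each method as its own body so it can be inspected.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int ReadByValue(A a) => a.B.C;

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int ReadByReference(in A a) => a.B.C;

    public static void Main()
    {
        A a = default;
        a.B.C = 42;
        // Using the results prevents the calls from being treated as dead code.
        Console.WriteLine(ReadByValue(a) + ReadByReference(in a));
    }
}
```

Build it in Release and compare the two method bodies in a debugger's disassembly window or a similar tool.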

In my case, I added another variable int a1 inside A to move the memory a bit - this is the resulting code:

A a = default;

xor         ecx,ecx  
mov         qword ptr [rbp-30h],rcx

var c = a.B.C;

mov         esi,dword ptr [rbp-2Ch]

where esi is a temporary register for var c and [rbp-30h] is where a sits on the stack. B has an integer at offset 0; A has an integer at offset 0 and B at offset 4, so the final address of a.B.C is always a+4 ([rbp-2Ch]).
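The fixed offset can also be observed from C# itself. The following is a minimal sketch, assuming a runtime where System.Runtime.CompilerServices.Unsafe is available and the default sequential layout for these structs; LayoutDemo and OffsetOfC are hypothetical names, and a1 is the extra field described above.

```csharp
using System;
using System.Runtime.CompilerServices;

public struct B { public int C; }
public struct A { public int a1; public B B; } // a1 is the extra field mentioned above

public static class LayoutDemo // hypothetical name
{
    public static long OffsetOfC()
    {
        A a = default;
        // Reinterpret both locations as byte references so they can be compared.
        ref byte start = ref Unsafe.As<A, byte>(ref a);
        ref byte field = ref Unsafe.As<int, byte>(ref a.B.C);
        // Distance in bytes from the start of a to a.B.C.
        return (long)Unsafe.ByteOffset(ref start, ref field);
    }

    public static void Main() => Console.WriteLine(OffsetOfC());
}
```

With the layout described above, the reported offset should be 4, which corresponds to the a+4 address in the disassembly.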

answered Jun 16 '20 at 08:49 by Svirin

Thank you very much for looking into this for me. I have a couple follow-up questions. You mentioned that structs sit directly on the stack for local variables. Is this also the case for method parameters passed by value? And second, at a higher level, if the JIT is going to essentially load the address of the field anyway, why wouldn't the compiler just use load-address instructions? In other words, why would the compiler leave "room for interpretation" for the JIT implementation to do something inefficient in the first place? – Wizard Brony Jun 16 '20 at 19:28
@WizardBrony If you pass a struct to a method by value, the struct is copied onto the stack for the callee. If you want to prevent the copy, you can use the `out` or `ref` keywords; they tell the compiler to use by-address instructions (like ldloca) so that only the address is copied onto the stack. For the second part of the previous comment I don't have an answer. – Svirin Jun 17 '20 at 09:09
@WizardBrony if this or any answer has solved your question please consider [accepting it](https://meta.stackexchange.com/q/5234/179419) by clicking the check-mark. This indicates to the wider community that you've found a solution and gives some reputation to both the answerer and yourself. There is no obligation to do this. – Svirin Jun 17 '20 at 09:10
About the second question: The JIT won't always "essentially load the address of the field". Even in this particular case, it won't load the address of the field. Instead, it knows exactly where a sits and uses this to know where the rest of the information lies. In a nested struct, for example, we don't "load the address" of the second struct at all. It's important to understand what exactly you are asking. Is it "Why does the JIT even copy the struct rather than use the address of the original?" Or is it "Why does accessing A.B.C inside a method use load-by-value rather than load-by-address?" – Egozy Jun 17 '20 at 09:24
Think of the C# compiler (Roslyn) as the one that says "what" it expects to happen, and the JIT as the one that decides "how" to do it. The JIT has a lot more information and can make better decisions: it can cache locals or arguments in registers, inline methods, and make many more complicated decisions at run time. – Egozy Jun 17 '20 at 09:38
Value types don't necessarily sit on the stack if they are used in local variables. If you use them in an anonymous method/lambda, they will get moved to a closure which is a normal object. – IS4 Jul 06 '20 at 14:01
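
The closure case from the last comment can be illustrated with a short sketch (ClosureDemo and CaptureAndMutate are hypothetical names). Once a lambda captures the local, the C# compiler hoists it into a field of a generated closure object, so the method and the lambda share one storage location that is no longer a plain stack slot.

```csharp
using System;

public struct B { public int C; }
public struct A { public B B; }

public static class ClosureDemo // hypothetical name
{
    public static int CaptureAndMutate()
    {
        A a = default;
        // Capturing 'a' makes the compiler hoist it into a closure object.
        Action bump = () => a.B.C++;
        bump();
        // Reads the same hoisted variable that the lambda just modified.
        return a.B.C;
    }

    public static void Main() => Console.WriteLine(CaptureAndMutate()); // prints 1
}
```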
