Why does the C# compiler in some cases emit newobj/stobj rather than 'call instance .ctor' for struct initialization?

Question

Here is a test program in C#:

using System;


struct Foo {
    int x;
    public Foo(int x) {
        this.x = x;
    }
    public override string ToString() {
        return x.ToString();
    }
}

class Program {
    static void PrintFoo(ref Foo foo) {
        Console.WriteLine(foo);
    }
    
    static void Main(string[] args) {
        Foo foo1 = new Foo(10);
        Foo foo2 = new Foo(20);
        
        Console.WriteLine(foo1);
        PrintFoo(ref foo2);
    }
}

And here is the disassembled IL of the compiled Main method:

.method private hidebysig static void Main (string[] args) cil managed {
    // Method begins at RVA 0x2078
    // Code size 42 (0x2a)
    .maxstack 2
    .entrypoint
    .locals init (
        [0] valuetype Foo foo1,
        [1] valuetype Foo foo2
    )

    IL_0000: ldloca.s foo1
    IL_0002: ldc.i4.s 10
    IL_0004: call instance void Foo::.ctor(int32)
    IL_0009: ldloca.s foo2
    IL_000b: ldc.i4.s 20
    IL_000d: newobj instance void Foo::.ctor(int32)
    IL_0012: stobj Foo
    IL_0017: ldloc.0
    IL_0018: box Foo
    IL_001d: call void [mscorlib]System.Console::WriteLine(object)
    IL_0022: ldloca.s foo2
    IL_0024: call void Program::PrintFoo(valuetype Foo&)
    IL_0029: ret
} // end of method Program::Main

I don't understand why newobj/stobj was emitted for foo2 instead of a simple call to .ctor. To make it more mysterious, the JIT compiler optimizes newobj+stobj into a single ctor call in 32-bit mode, but it does not in 64-bit mode...

UPDATE:

To clarify my confusion, here are my expectations.

A value-type declaration expression like

Foo foo = new Foo(10)

should be compiled via

call instance void Foo::.ctor(int32)

A value-type declaration expression like

Foo foo = default(Foo)

should be compiled via

initobj Foo

In my opinion, the temporary variable of a construction expression, or the instance produced by a default expression, should be treated as the target variable itself, since this cannot lead to any dangerous behaviour:

try{
    //foo invisible here
    ...
    Foo foo = new Foo(10);
    //we never get here, if something goes wrong
}catch(...){
    //foo invisible here
}finally{
    //foo invisible here
}

An assignment expression like

foo = new Foo(10); // foo declared somewhere before

should be compiled to something like this:

.locals init (
    ...
    valuetype Foo __temp,
    ...
)

...
ldloca __temp
ldc.i4 10
call instance void Foo::.ctor(int32)
ldloc __temp
stloc foo
...

This is how I understand what the C# specification says:

7.6.10.1 Object creation expressions

...

The run-time processing of an object-creation-expression of the form new T(A), where T is class-type or a struct-type and A is an optional argument-list, consists of the following steps:

...

If T is a struct-type:

An instance of type T is created by allocating a temporary local variable. Since an instance constructor of a struct-type is required to definitely assign a value to each field of the instance being created, no initialization of the temporary variable is necessary.

The instance constructor is invoked according to the rules of function member invocation (§7.5.4). A reference to the newly allocated instance is automatically passed to the instance constructor and the instance can be accessed from within that constructor as this.

I want to emphasize "allocating a temporary local variable". In my understanding, the newobj instruction implies creating the object on the heap...

What confuses me in this case is that the way the object is created depends on how the variable is used later, since foo1 and foo2 look identical to me.

possible duplicate of [Difference between call instance vs newobj instance in IL](http://stackoverflow.com/questions/11966930/difference-between-call-instance-vs-newobj-instance-in-il) — Justin Pihony, Mar 04 '13 at 18:03
Curiosity is okayish, but this one requires significant digging. Download SSCLI20 and look at the C# compiler source code, csharp/sscomp/ilgen.cpp, ILGENREC::genCall() method. Something to do with possible aliasing, I think. — Hans Passant, Mar 04 '13 at 18:24
@HansPassant: You are correct; the C# optimizer is skipping the copy elision because it is concerned about possible aliasing. In this particular code that is an overly conservative assumption; there is actually no aliasing problem here. But rather than doing that analysis, the C# compiler detects the ref and gives up on the elision optimization early. — Eric Lippert, Mar 04 '13 at 19:02
Re: your update: why do you believe that the `newobj` instruction allocates memory on the heap for value types? It is not *documented* as doing so; I am confused as to why you would believe something that is directly contradicted by the documentation. — Eric Lippert, Mar 05 '13 at 05:04
@Eric Lippert: yes indeed, CLI specification asserts that newobj for value type should allocate memory on stack, thanks for pointed it out — andrey.ko, Mar 05 '13 at 08:11

Eric Lippert · Accepted Answer · 2021-01-13T20:13:52.910

First off, you should read my article on this subject. It does not address your specific scenario, but it has some good background information:

https://ericlippert.com/2010/10/11/debunking-another-myth-about-value-types/

OK, so now that you've read that you know that the C# specification states that constructing an instance of a struct has these semantics:

Create a temporary variable to store the struct value, initialized to the default value of the struct.
Pass a reference to that temporary variable as the "this" of the constructor

So when you say:

Foo foo = new Foo(123);

That is equivalent to:

Foo foo;
Foo temp = default(Foo);
Foo.ctor(ref temp, 123); // "this" is a ref to a variable in a struct.
foo = temp;

Now, you might ask why go through all the trouble of allocating a temporary when we already have a variable foo right there that could be this:

Foo foo = default(Foo);
Foo.ctor(ref foo, 123);

That optimization is called copy elision. The C# compiler and/or the jitter are permitted to perform a copy elision when they determine using their heuristics that doing so is always invisible. There are rare circumstances in which a copy elision can cause an observable change in the program, and in those cases the optimization must not be used. For example, suppose we have a pair-of-ints struct:

Pair p = default(Pair);
try { p = new Pair(10, 20); } catch {}
Console.WriteLine(p.First);
Console.WriteLine(p.Second);

We expect that p here is either (0, 0) or (10, 20), never (10, 0) or (0, 20), even if the ctor throws halfway through. That is, either the assignment to p was of the completely constructed value, or no modification was made to p at all. The copy elision cannot be performed here; we have to make a temporary, pass the temporary to the ctor, and then copy the temporary to p.
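The all-or-nothing behaviour described above can be tried with a small program. This is only a sketch: the Pair struct and its FailHalfway switch are hypothetical, invented here for illustration (the answer does not define Pair). The constructor writes First and then throws before writing Second.

```csharp
using System;

// Hypothetical Pair type; not defined in the original answer.
struct Pair
{
    // Illustrative switch that makes the ctor fail after writing one field.
    public static bool FailHalfway = true;

    public int First;
    public int Second;

    public Pair(int first, int second)
    {
        First = first;
        if (FailHalfway)
        {
            // Only First has been written to "this" at this point.
            throw new InvalidOperationException("ctor failed halfway");
        }
        Second = second;
    }
}

class Program
{
    static void Main()
    {
        Pair p = default(Pair);
        try { p = new Pair(10, 20); } catch (InvalidOperationException) { }

        // "this" was a temporary, so the half-built value never reached p.
        Console.WriteLine(p.First);  // 0
        Console.WriteLine(p.Second); // 0
    }
}
```

If the compiler had elided the copy and passed p itself as this, the first line printed would be 10 instead of 0, which is exactly the observable difference the elision must avoid.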

Similarly, suppose we had this insanity:

Pair p = default(Pair);
p = new Pair(10, 20, ref p);
Console.WriteLine(p.First);
Console.WriteLine(p.Second);

If the C# compiler performs the copy elision then this and ref p are both aliases to p, which is observably different than if this is an alias to a temporary! The ctor could observe that changes to this cause changes to ref p if they alias the same variable, but would not observe that if they aliased different variables.
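The aliasing case can be made concrete in the same way. Again a sketch under assumptions: this three-argument Pair constructor is hypothetical, and it simply reports what it sees through the ref parameter after it has written to this.

```csharp
using System;

// Hypothetical Pair type with a ctor that also receives a ref to a Pair.
struct Pair
{
    public int First;
    public int Second;

    public Pair(int first, int second, ref Pair other)
    {
        First = first;
        Second = second;
        // If "this" and "other" aliased the same variable, this would show
        // the value just written (10). With a temporary as "this", it shows
        // the caller's old value.
        Console.WriteLine(other.First); // 0
    }
}

class Program
{
    static void Main()
    {
        Pair p = default(Pair);
        p = new Pair(10, 20, ref p);
        Console.WriteLine(p.First);  // 10
        Console.WriteLine(p.Second); // 20
    }
}
```

The first printed line is the one that would change from 0 to 10 under copy elision, so the compiler has to construct into a temporary whenever the constructor could observe the target through a ref argument.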

The C# compiler heuristic is deciding to do the copy elision on foo1 but not foo2 in your program. It is seeing that there is a ref foo2 in your method and deciding right there to give up. It could do a more sophisticated analysis to determine that it is not in one of these crazy aliasing situations, but it doesn't. The cheap and easy thing to do is to just skip the optimization if there is any chance, however remote, that there could be an aliasing situation that makes the elision visible. It generates the newobj code and lets the jitter decide whether it wants to make the elision.

As for the jitter: the 64 bit and 32 bit jitters have completely different optimizers. Apparently one of them is deciding that it can introduce the copy elision that the C# compiler did not, and the other one is not.

thanks, i found your article very cognitional, but i think that there should be distinction between "assignment expression" and "declaration expression" and in last case i don't see any dangerous cases, i write update to my question as it seems to me — andrey.ko, Mar 05 '13 at 01:56
There aren't any dangerous cases. The compiler is simply not detecting that there's an optimization that it could perform. The compiler is not required to generate optimal code, it's only required to generate correct code. — Eric Lippert, Mar 05 '13 at 05:05

score 0 · Answer 2 · answered Mar 04 '13 at 18:28


That's because the variables foo1 and foo2 are different.

The foo1 variable is just a value, but the foo2 variable is both a value and a pointer as it's used in a call with the ref keyword.

When the foo2 variable is initialised, the pointer is set up to point to the value, and the constructor is called with the value of the pointer rather than the address of the value.

If you set up two PrintFoo methods with the only difference that one has the ref keyword, and call them with one variable each:

Foo a = new Foo(10);
Foo b = new Foo(20);
PrintFoo(ref a);
PrintFoo(b);

If you decompile the generated code, the difference between the variables is visible:

&Foo a = new Foo(10);
Foo b = new Foo(20);
Program.PrintFoo(ref a);
Program.PrintFoo(b);


Guffa


I'm not sure I'm following your train of thought here. In the generated IL in the original question the types of local slots zero and one are both Foo; neither is managed-pointer-to-Foo. The managed pointer is created by the load-local-address opcode. – Eric Lippert Mar 04 '13 at 18:37
@EricLippert: Yes, but it clearly shows up as a different data type in the decompiled code, and if you look at the generated code there are two slots allocated on the stack for `foo2` and only one for `foo1`. – Guffa Mar 04 '13 at 18:49

I guess what I'm getting at here is: the question is about why the C# compiler produces a particular sequence of IL. Appealing to what some third-party decompiler produces when given that IL doesn't explain why the C# compiler is choosing to generate that IL in the first place. The point of the question is that there is no need to initialize foo1 and foo2 differently. – Eric Lippert Mar 04 '13 at 18:52
