
Consider the following:

interface ISomething
{
    void Call(string arg);
}

sealed class A : ISomething
{
    public void Call(string arg) => Console.WriteLine($"A, {arg}");
}

sealed class Caller<T> where T : ISomething
{
    private readonly T _something;
    public Caller(T something) => _something = something;
    public void Call() => _something.Call("test");
}

new Caller<A>(new A()).Call();

Both the call to Caller<A>.Call and its nested call to A.Call are made through the callvirt instruction.

But why? Both types are exactly known. Unless I'm misunderstanding something, shouldn't it be possible to use call rather than callvirt here?

If so, why is this not done? Is it merely an optimisation the compiler doesn't perform, or is there a specific reason behind it?

Bogey

1 Answer


You're missing two things.

The first is that callvirt does a null-check on the receiver, whereas call does not. This means that using callvirt on a null receiver raises a NullReferenceException at the call site, whereas call will happily invoke the method and pass null as the receiver, so this is null inside the method.
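
That null-check is easy to see from C#, reusing the question's A (a minimal sketch; the behaviour is exactly the callvirt rule above):

A a = null;

// A.Call is non-virtual and never touches instance state, yet this still
// throws: the compiler emits callvirt, so the receiver is null-checked at
// the call site rather than somewhere inside Call.
a.Call("test");   // NullReferenceException here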

Sound surprising? It is. IIRC in very early .NET versions call was used in the way you suggest, and people got very confused about how this could be null inside a method. The compiler switched to callvirt to force the runtime to do a null-check upfront.

There are only a handful of places where the compiler will emit a call:

  1. Static methods.
  2. Non-virtual struct methods.
  3. Calling a base method or base constructor (where we know the receiver is not null, and we also explicitly do not want to make a virtual call).
  4. Where the compiler is certain that the receiver is not null, e.g. foo?.Method() where Method is non-virtual.

That last point in particular means that making a method virtual is a binary-breaking change.
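
As a rough illustration of that last case, reusing the question's A (the args-based null is only there to make the ?. meaningful):

A foo = args.Length > 0 ? new A() : null;

// The ?. itself establishes that the receiver is not null, and A.Call is
// non-virtual, so the compiler can emit call rather than callvirt for this
// invocation.
foo?.Call("test");

The binary break mentioned above falls out of exactly this: an assembly compiled while a method was non-virtual keeps its call instruction, so if the method later becomes virtual, that old call site silently bypasses any override.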

Just for fun, see this check for this == null in String.Equals.


The second thing is that _something.Call("test"); is not a plain virtual call, it's a constrained virtual call: a constrained. prefix opcode appears before the callvirt.

Constrained virtual calls were introduced with generics. The problem is that method calls on classes and on structs are a bit different:

  1. For classes, you load the class reference (e.g. with ldloc), then use call / callvirt.
  2. For structs, you load the address of the struct (e.g. with ldloca), then use call.
  3. To call an interface method on a struct, or a method defined on object, you need to load the struct value (e.g. with ldloc), box it, then use call / callvirt (the boxing is sketched right after this list).
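
The boxing in that third case is visible from C# as well; a small sketch with a hypothetical struct S implementing the question's ISomething:

struct S : ISomething
{
    public void Call(string arg) => Console.WriteLine($"S, {arg}");
}

S s = new S();

// Calling directly on the struct variable: its address is loaded (ldloca)
// and S.Call is invoked with a plain call; no boxing.
s.Call("direct");

// Assigning to the interface type boxes the value; the interface call then
// goes through callvirt on that boxed copy, not on s itself.
ISomething boxed = s;
boxed.Call("boxed");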

If a generic type is unconstrained (i.e. it could be a class or a struct), the compiler doesn't know what to do: should it use ldloc or ldloca? Should it box or not? call or callvirt?

Constrained virtual calls move this responsibility to the runtime. To quote the documentation for the constrained. prefix:

When a callvirt method instruction has been prefixed by constrained thisType, the instruction is executed as follows:

  • If thisType is a reference type (as opposed to a value type) then ptr is dereferenced and passed as the 'this' pointer to the callvirt of method.
  • If thisType is a value type and thisType implements method then ptr is passed unmodified as the 'this' pointer to a call method instruction, for the implementation of method by thisType.
  • If thisType is a value type and thisType does not implement method then ptr is dereferenced, boxed, and passed as the 'this' pointer to the callvirt method instruction.

This last case can occur only when method was defined on System.Object, System.ValueType, or System.Enum and not overridden by thisType. In this case, the boxing causes a copy of the original object to be made. However, because none of the methods of System.Object, System.ValueType, and System.Enum modify the state of the object, this fact cannot be detected.
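
Tying this back to the question's Caller<T>: the IL for _something.Call("test") is roughly constrained. !T followed by callvirt ISomething.Call, and the runtime applies the rules above per instantiation. A sketch, reusing the hypothetical struct S from earlier:

// T is a struct that implements Call: the struct's address is passed straight
// to a direct call on S.Call, with no boxing.
new Caller<S>(new S()).Call();

// T is a class: the reference is dereferenced and the call is dispatched with
// callvirt, exactly as in the question.
new Caller<A>(new A()).Call();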

canton7
  • Clearly the top level call to `.Call()` is not on `null`. And it can be devirtualized. There's just no way to call a different method in OP's setup. – freakish Mar 29 '22 at 09:55
  • @freakish The compiler simply doesn't optimize this case. Yes this very specific case is clear, but it's not particularly common in practice. It's more common to construct a type and store the result in a variable, and then call a method on the variable, and suddenly tracking whether the variable can become null in the meantime becomes harder. Also, Microsoft try to put these sorts of optimizations, such as devirtualisation, into the runtime rather than the compiler, so all CLR languages benefit. And indeed, in this case the runtime inlines the whole call – canton7 Mar 29 '22 at 09:59
  • Yes, tracking is harder, but as you said: JIT already does that (or so they say). I don't see a benefit of moving that optimization to JIT and thus increasing JIT compilation time. It's a weird decision to move all optimizations to JIT to be honest. Especially optimizations that are platform independent. – freakish Mar 29 '22 at 10:01
  • It means that they don't need to implement the same optimizations 3 times for C#, VB.NET, and F#, and 3rd-party languages also benefit. The JIT also has a lot more knowledge than the compiler: it knows what concrete types are in use at runtime, which means it can devirtualise/inline significantly more than the compiler would be able to – canton7 Mar 29 '22 at 10:03
  • ECMA-335 Section III.2.1 might be a better place to link https://www.ecma-international.org/wp-content/uploads/ECMA-335_5th_edition_december_2010.pdf – Charlieface Mar 29 '22 at 10:12
  • @Charlieface Yeah, but I can't link to an exact page there, and the MSDN docs are copied straight from the relevant sections in ECMA-335 (albeit from an old edition, but I don't think the docs on `callvirt` have changed) – canton7 Mar 29 '22 at 10:16
  • Highly interesting, thanks @canton7. Sounds like IF we had a way to constraint T to be a sealed class, and ensured non-nullability, the compiler should be able to emit call instead of callvirt? (Purely theoretical due to the lack of any sealed constraint) – Bogey Mar 29 '22 at 10:28
  • @Bogey I guess. I'm not sure whether it would bother: compiler-driven devirtualisation isn't really a big win (it's done for obviously correct cases like `?.`). The wins come when the runtime manages to do it – canton7 Mar 29 '22 at 10:39
  • @Bogey No, the compiler does not make any optimization for nullability. And even if you were *not* using generics, it does not optimize for sealed classes – Charlieface Mar 29 '22 at 10:39
  • Yeah, it doesn't even bother in [this case](https://sharplab.io/#v2:C4LglgNgPgAgTARgLAChUDcCGAnABAE1wF5cA7AUwHdcARc7Mdc/ACgEoBuVfAfgDoAYgHsh7LmhQwAzLni4AQpgDO5VAG9UuLVum50YbMACumCLIAsuYaLa41uAL7bUD1Kl0rTzWXFr1G3iAKyqooGijasjJCTNgM+OQWViLsdo4uQA). – canton7 Mar 29 '22 at 10:44
  • 1
    Fair - let's rather phrase this as, it "could" in theory legitimately optimise these cases then (even if it doesn't actually bother to do so) – Bogey Mar 29 '22 at 10:50
  • @Bogey - if you're constraining the type down to a single sealed type, why are you using generics? – Damien_The_Unbeliever Mar 29 '22 at 16:20
  • @Damien_The_Unbeliever That was more out of academic curiosity (but you could probably run into scenarios like that in a typical DI setup, where you tend to inject a small subset - if not even just one - implementation of some interface into anything that isn't a unit test) – Bogey Mar 29 '22 at 19:06