Why is null check performed after argument list evaluation?

Question

According to section 7.4.3 (Function member invocation) of the C# Language Specification, the run-time processing of a function member invocation consists of the following steps, where M is an instance function member declared in a reference-type and E is the instance expression:

E is evaluated. If this evaluation causes an exception, then no further steps are executed.
The argument list is evaluated.
If the type of E is a value-type, a boxing conversion is performed to convert E to type object, and E is considered to be of type object in the following steps. In this case, M could only be a member of System.Object.
The value of E is checked to be valid. If the value of E is null, a System.NullReferenceException is thrown and no further steps are executed.
The function member implementation to invoke is determined... etc

I'm wondering why the null check is not the second step. Why evaluate the argument list if E is null?
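
The order in question can be observed directly. The following is a minimal sketch, assuming a hypothetical class `Widget` and a hypothetical helper `Arg()` (neither comes from the spec): the argument's side effect is recorded before the exception surfaces.

```csharp
using System;
using System.Collections.Generic;

class Widget
{
    // Hypothetical instance method; it is never entered when the receiver is null.
    public void Use(int value) => Console.WriteLine("Use(" + value + ")");
}

static class EvaluationOrderDemo
{
    public static readonly List<string> Log = new List<string>();

    // Hypothetical helper whose side effect shows that the argument was evaluated.
    static int Arg()
    {
        Log.Add("argument evaluated");
        return 42;
    }

    public static void Run()
    {
        Widget e = null;      // step 1: E evaluates (to null) without an exception
        try
        {
            e.Use(Arg());     // step 2 runs Arg(); the null check on E comes afterwards
        }
        catch (NullReferenceException)
        {
            Log.Add("NullReferenceException");
        }
    }

    static void Main()
    {
        Run();
        // prints "argument evaluated, NullReferenceException"
        Console.WriteLine(string.Join(", ", Log));
    }
}
```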

It feels like a natural order to me, in terms of the argument evaluation being sort of "preparation". I don't think I've seen anything about this in the annotated spec, which means this could really only be definitively answered by the language designers. — Jon Skeet, Sep 06 '15 at 19:49
I guess this order is affected by IL generation. We evaluate all arguments, and then `callvirt` performs the null check and calls the function member. — Andrew Karpov, Sep 06 '15 at 20:14
@AndrewKarpov Your comment seems like the most likely answer to me: it is the way it is because it turned out to be easiest to implement. — , Sep 06 '15 at 20:42

score 2 · Answer 1 · answered Sep 06 '15 at 21:37

If you were to do a null check at step 2, you would have to add a null-check to every method call.

As it is now, the vast majority of methods don't need to check that the instance isn't null. Instead they try to call the method, and if the instance is null then the attempt to get the method table to do this results in an invalid memory access, which is then trapped and turned into the NullReferenceException by the framework. There's no more work for the running code here than there would be if the instance was known a priori to not be null.

The only time the instance has to be explicitly checked for non-nullity is when an optimisation means:

The call was removed by inlining.
The inlined call doesn't involve a field access (that would lead to the null reference exception anyway).
The inlined call doesn't involve another call on the same object (ditto).
The instance can't be shown to definitely not be null (or there'd be no worries).

In this case a field-access is added in to trigger the NullReferenceException in the same way as a call would.

If however the rules required a null check before the arguments were evaluated, there'd need to be an explicit check added for every single call. And all it would mean in practice is that you had a NullReferenceException thrown prior to attempting something that would have resulted in a NullReferenceException being thrown. (They couldn't remove the logic that turns a low-address memory access violation into NullReferenceException because it still comes up in other ways).

So the rules you suggest would in practice require a lot more work.

It is after all perfectly legal to call a non-virtual method on a null instance in .NET generally, by compiling to the CIL instruction call rather than callvirt. (For that matter, you can call a virtual method non-virtually the same way, which is how a call to base works). This will work as long as there are no field accesses or virtual method calls on the instance (which in practice is rare, but can happen).

Before C# banned calls on null instances, the rule had been that a null check was required only if the method was virtual.

This was done in the way described above: call the method with callvirt and catch the memory access violation if it's called on a null reference.

When the rule was changed to (unfortunately, IMO) ban any call on null objects, this was done by changing the compilation to use callvirt even when the method isn't virtual, so the memory access violation happens if the instance is null, and the resulting NullReferenceException ensues.

Abel · Answer 2 · 2015-09-06T21:12:39.867

I think by and large this is a matter of definition, though I can think of a few reasons why this is a convenient order:

Essentially, methods of objects are functions in a function table for which the object reference (or pointer) is yet another argument. I am not sure what the calling convention is, but the rule for argument evaluation is left-to-right. If the convention is to put the pointer on the right side, it makes sense that it is checked last, because its value is already known.
Likewise, if you consider the this-reference to be an (implicit) argument of the method call, it is actually the first argument to be checked, before any argument-checks you have written yourself inside the body of the method. Which makes sense: all arguments are evaluated, and after evaluation, they are all checked, the this-reference first.
If an argument is itself an expression with a side-effect, it makes sense to be very clear on the order of evaluation. In this case, the side effect will always fire (unless an earlier argument raises an exception).
If you have a library for pre- and post-conditions, these need to know the values of the arguments prior to calling the method. You would want to know if the arguments are valid regardless of whether the object they are called on is null or not.
Before this check, boxing may need to be done*, which is a potentially relatively expensive step. You would want to do that as late as possible.
Extension methods can apply to a null object. This order makes sure that extension methods and instance methods behave the same (in an extension method, you can only check for null inside the method body, i.e. after evaluation of the arguments).
An object does not have to be on the present system, or in the current processor's memory. In OO parlance (think Bertrand Meyer), a method invocation is essentially sending a message to the object. Obviously, from this viewpoint, the message must first be constructed before it can be sent. In classic COM and DCOM, this is similar: the message is constructed (i.e., arguments evaluated) and then sent. If the target then turns out not to exist (gone or destroyed), an error is raised. But the order in this scenario cannot possibly be different. I am not sure this was an argument here (dealing with out-of-process objects), but it may have been with respect to COM interoperability.

I realize that each of these arguments could have a counter-argument (except perhaps the last), but on balance, I think it favors checking the object for null as late as possible.

* boxing typically is not expensive at all, but if your argument list is small (zero or one) and requires no further evaluation, boxing is relatively expensive compared to the virtual no-op of argument evaluation (updated because of hvd's comment).
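
The extension-method point in the list above can be illustrated with a small sketch. The names `Node`, `DescribeOrDefault` and `Suffix()` are hypothetical and exist only for this example. In both calls the argument is evaluated first; only the place where null is detected differs.

```csharp
using System;
using System.Collections.Generic;

class Node
{
    public string Describe(string suffix) => "node" + suffix;
}

static class NodeExtensions
{
    // Hypothetical extension method: 'self' is an ordinary parameter, so null gets in.
    public static string DescribeOrDefault(this Node self, string suffix)
    {
        // Earliest point at which an extension method can check for null:
        // the arguments have already been evaluated by the caller.
        if (self == null) return "<null>" + suffix;
        return self.Describe(suffix);
    }
}

static class ExtensionDemo
{
    public static readonly List<string> Log = new List<string>();

    // Hypothetical helper with a visible side effect.
    static string Suffix()
    {
        Log.Add("suffix evaluated");
        return "!";
    }

    public static void Run()
    {
        Node n = null;

        // Extension call on null: the argument is evaluated, then the body checks for null.
        string viaExtension = n.DescribeOrDefault(Suffix());
        Log.Add(viaExtension);

        // Instance call on null: the argument is evaluated, then the call itself throws.
        try { n.Describe(Suffix()); }
        catch (NullReferenceException) { Log.Add("NullReferenceException"); }
    }

    static void Main()
    {
        Run();
        foreach (var entry in Log) Console.WriteLine(entry);
    }
}
```

After `Run()`, the log holds `suffix evaluated`, `<null>!`, `suffix evaluated`, `NullReferenceException`, in that order.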

"Before this check, boxing may need to be done, which is a potentially relatively expensive step. You would want to do that latest possible." -- Seriously? Argument evaluation is very often a far more expensive step than boxing, especially when the arguments involve more method calls. — , Sep 06 '15 at 21:06
@hvd, yes, that is probably the _weakest link_ in the list;). The thing is, you can't check for `null` before boxing, and it makes no sense to do boxing prior to evaluation of the arguments, which in themselves can be just values, so no additional evaluation required. In such cases, yes, it can be (albeit slightly) detrimental to performance. — Abel, Sep 06 '15 at 21:09
The object reference is pushed before any of the arguments if I'm not mistaken. — Asad Saeeduddin, Sep 07 '15 at 11:17

score 0 · Answer 3 · answered Sep 06 '15 at 19:52

Maybe because during step 3 (boxing), the value of E can become null. With a null check at step 2, values could pass the check and still turn into null at step 3, so a second check would be needed.

iarba

Well, it could only be boxed to null if it were a nullable value type - which didn't exist in C# 1, and the same rules applied there. – Jon Skeet Sep 06 '15 at 20:20

score 0 · Answer 4 · answered Sep 06 '15 at 20:00

Argument evaluation in fact works differently in some other programming paradigms, such as functional programming with LISP or logic programming with Prolog.

But in procedural and object-oriented programming languages, it is common to evaluate function parameters before performing the actual call. I do not know if it is a must, but it is used like that in C, C++, Java, C#, Pascal, etc. They follow the same principles.

However, do not confuse this with evaluating conditions, where the short-circuit rule applies.
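
The contrast with short-circuiting can be sketched as follows. `Gate` and `Arg()` are hypothetical names used only for illustration. With `&&`, and with the null-conditional operator available from C# 6, the argument expression is skipped entirely when the receiver is null.

```csharp
using System;
using System.Collections.Generic;

class Gate
{
    public bool Accepts(int value) => value > 0;
}

static class ShortCircuitDemo
{
    public static readonly List<string> Log = new List<string>();

    // Hypothetical helper; it would log if it were ever called.
    static int Arg()
    {
        Log.Add("argument evaluated");
        return 1;
    }

    public static void Run()
    {
        Gate g = null;

        // && short-circuits: the right-hand side, including Arg(), is never evaluated.
        bool viaAnd = g != null && g.Accepts(Arg());
        Log.Add("&& result: " + viaAnd);

        // The null-conditional operator (C# 6) also skips the argument list when g is null.
        bool? viaConditional = g?.Accepts(Arg());
        Log.Add("?. has value: " + viaConditional.HasValue);
    }

    static void Main()
    {
        Run();
        foreach (var entry in Log) Console.WriteLine(entry);
    }
}
```

Here `Arg()` never runs, so the log contains only the two result lines and no `argument evaluated` entry.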
