26

I asked this question and got this interesting (and a little disconcerting) answer.

Daniel states in his answer (unless I'm reading it incorrectly) that the ECMA-335 CLI specification could allow a compiler to generate code that throws a NullReferenceException from the following DoCallback method.

class MyClass {
    private Action _Callback;
    public Action Callback { 
        get { return _Callback; }
        set { _Callback = value; }
    }
    public void DoCallback() {
        Action local;
        local = Callback;
        if (local == null)
            local = new Action(() => { });
        local();
    }
}

He says that, in order to guarantee a NullReferenceException is not thrown, the volatile keyword should be used on _Callback or a lock should be used around the line local = Callback;.

Can anyone corroborate that? And, if it's true, is there a difference in behavior between Mono and .NET compilers regarding this issue?

Edit
Here is a link to the standard.

Update
I think this is the pertinent part of the spec (12.6.4):

Conforming implementations of the CLI are free to execute programs using any technology that guarantees, within a single thread of execution, that side-effects and exceptions generated by a thread are visible in the order specified by the CIL. For this purpose only volatile operations (including volatile reads) constitute visible side-effects. (Note that while only volatile operations constitute visible side-effects, volatile operations also affect the visibility of non-volatile references.) Volatile operations are specified in §12.6.7. There are no ordering guarantees relative to exceptions injected into a thread by another thread (such exceptions are sometimes called "asynchronous exceptions" (e.g., System.Threading.ThreadAbortException).

[Rationale: An optimizing compiler is free to reorder side-effects and synchronous exceptions to the extent that this reordering does not change any observable program behavior. end rationale]

[Note: An implementation of the CLI is permitted to use an optimizing compiler, for example, to convert CIL to native machine code provided the compiler maintains (within each single thread of execution) the same order of side-effects and synchronous exceptions.

So... I'm curious as to whether or not this statement allows a compiler to optimize the Callback property (which accesses a simple field) and the local variable to produce the following, which has the same behavior in a single thread of execution:

if (_Callback != null) _Callback();
else new Action(() => { })();

The 12.6.7 section on the volatile keyword seems to offer a solution for programmers wishing to avoid the optimization:

A volatile read has "acquire semantics" meaning that the read is guaranteed to occur prior to any references to memory that occur after the read instruction in the CIL instruction sequence. A volatile write has "release semantics" meaning that the write is guaranteed to happen after any memory references prior to the write instruction in the CIL instruction sequence. A conforming implementation of the CLI shall guarantee this semantics of volatile operations. This ensures that all threads will observe volatile writes performed by any other thread in the order they were performed. But a conforming implementation is not required to provide a single total ordering of volatile writes as seen from all threads of execution. An optimizing compiler that converts CIL to native code shall not remove any volatile operation, nor shall it coalesce multiple volatile operations into a single operation.

Community
  • 1
  • 1
qxn
  • 17,162
  • 3
  • 49
  • 72
  • 1
    @LukeH -- I'm inclined to believe you, but I'm just trying to better my understanding of the spec. What prevents an optimizing compiler from performing the optimization I described in my update? – qxn May 15 '12 at 12:58
  • @ken: Hmm...I didn't think about that instruction sequence mostly because a compiler almost certainly wouldn't choose it, but it looks valid to me and thus theorectically possible. I'm going to have to rethink my answer now. – Brian Gideon May 15 '12 at 14:28
  • @ken: Imagine the situation where your `Callback` property doesn't just return the value of the backing field but also mutates state too. For example, `int _count; Action _callback; public Action Callback { get { _count++; if (_count > 5) throw new Exception(); return _callback; } }`. The "optimised" version has different semantics to the non-optimised version even when running in a single thread -- ie, it'll throw sooner -- which means that the optimisation would be out-of-spec. – LukeH May 15 '12 at 15:52
  • 1
    @ken: ...Admittedly, there's nothing (theoretically) stopping the compiler/runtime performing some kind of deep inspection of all the child calls to determine which of them have visible side-effects etc and optimise accordingly. But why would the compiler/runtime do all that extra work just to figure out whether it's allowed to make a dubious "optimisation" that provides little-or-no benefit and might cause subtle bugs? – LukeH May 15 '12 at 15:57
  • 1
    @LukeH: I agree. It is very unlikely that the compiler is going to remove a local variable. In fact, they normally insert them in the form of a register reference when optimizing instead of removing them. Still, I cannot find a reasoning that would preclude that from happening at least on a purely theorectical basis. – Brian Gideon May 15 '12 at 17:14
  • 2
    @ken: ...And I suppose also that the jitter/runtime would already know when it's dealing with a simple field access and wouldn't need to do any deep inspection to determine whether there are any side-effects. In that situation -- a simple load from memory -- then I guess ECMA-335 does allow the location to be read several times rather then being cached in a local. (Having said that, it still doesn't feel like a *sensible* optimisation, even if it is permitted: loading from memory multiple times would surely be slower than loading into a local once and re-using.) – LukeH May 15 '12 at 22:19

2 Answers2

13

In CLR via C# (pp. 264–265), Jeffrey Richter discusses this specific problem, and acknowledges that it is possible for the local variable to be swapped out:

[T]his code could be optimized by the compiler to remove the local […] variable entirely. If this happens, this version of the code is identical to the [version that references the event/callback directly twice], so a NullReferenceException is still possible.

Richter suggests the use of Interlocked.CompareExchange<T> to definitively resolve this issue:

public void DoCallback() 
{
    Action local = Interlocked.CompareExchange(ref _Callback, null, null);
    if (local != null)
        local();
}

However, Richter acknowledges that Microsoft’s just-in-time (JIT) compiler does not optimize away the local variable; and, although this could, in theory, change, it almost certainly never will because it would cause too many applications to break as a result.

This question has already been asked and answered at length in “Allowed C# Compiler optimization on local variables and refetching value from memory”. Make sure to read the answer by xanatox and the “Understand the Impact of Low-Lock Techniques in Multithreaded Apps” article it cites. Since you asked specifically about Mono, you should pay attention to referenced “[Mono-dev] Memory Model?” mailing list message:

Right now we provide loose semantics close to ecma backed by the architecture you're running.

Community
  • 1
  • 1
Douglas
  • 53,759
  • 13
  • 140
  • 188
  • 1
    +1 for the detailed answer backed with solid references. Very interesting read, thanks! – dash May 15 '12 at 21:43
  • Is there a good reason the Interlocked.Read(Int64) method doesn't have overloads for other types, or is it just an oversight? – Max Barraclough May 21 '18 at 10:14
  • 1
    @MaxBarraclough: `Interlocked.Read` is meant to ensure atomic reads, which is a different goal. I suspect you could use [`Volatile.Read`](https://msdn.microsoft.com/en-us/library/gg712828(v=vs.110).aspx) for this problem. – Douglas Jun 10 '18 at 15:29
3

This code will not throw a null reference exception. This one is thread safe:

public void DoCallback() {
    Action local;
    local = Callback;
    if (local == null)
        local = new Action(() => { });
    local();
}

The reason this one is thread-safe, and can-not throw a NullReferenceException on Callback, is it's copying to a local variable before doing it's null check / call. Even if the original Callback was set to null after the null check, the local variable will still be valid.

However the following is a different story:

public void DoCallbackIfElse() {
    if (null != Callback) Callback();
    else new Action(() => { })();
}

In this one it's looking at a public variable, Callback can be changed to null AFTER the if (null != Callback) which would throw an exception on Callback();

caesay
  • 16,932
  • 15
  • 95
  • 160
  • Thanks. I'm aware that the first example you gave will not throw an exception when compiled using the Microsoft .NET compiler. But are you certain that it will never throw an exception regardless of the compiler used so long as the compiler conforms to the ECMA CLI standard? – qxn May 14 '12 at 20:05
  • 1
    Does not matter how weak the memory model is - there is no way that the first code sample in this answer can throw a NullReferenceException, unless the local variable is optimized out - and I don't expect that will happen. – caesay May 15 '12 at 03:51
  • 1
    @caesay, the question is not what can be expected to happen, but what is allowed by the spec. – svick May 15 '12 at 09:53
  • 1
    @svick: Optimisations are only allowed if they don't change the (single-threaded) semantics of the program. Since evaluating `Callback` twice does change the semantics then that wouldn't be allowed. – LukeH May 15 '12 at 10:45
  • 1
    @LukeH The JITter might realize that the getter only reads `_Callback`, and then produce code that reads `_Callback` twice (this has no observable effect in single threaded programs). I don't know for sure if the .net memory model(Or the EMCA335 model, which is weaker) allows this, but the thread safety of this code certainly relies on the memory model. That Microsoft recommends this pattern suggests that the model is strong enough, but there are certainly memory models weak enough to allow the throwing of a `NullReferenceException`, in the first piece of code. – CodesInChaos May 15 '12 at 21:33
  • @CodeInChaos: Agreed. It does appear that ECMA-335 would allow this, although it seems an unlikely "optimisation" for a compiler/runtime to make, even if it is allowed. – LukeH May 15 '12 at 22:27