Possible bug in C# JIT optimizer?

Question

Working on a SQLHelper class to automate stored procedures calls in a similar way to what is done in the XmlRpc.Net library, I have hit a very strange problem when running a method generated manually from IL code.

I've narrowed it down to a simple generated method (probably it could be simplified even more). I create a new assembly and type, containing two methods to comply with

public interface iTestDecimal
{
    void TestOk(ref decimal value);
    void TestWrong(ref decimal value);
}

The test methods are just loading the decimal argument into the stack, boxing it, checking if it's NULL, and if it is not, unboxing it.

The generation of TestOk() method is as follows:

static void BuildMethodOk(TypeBuilder tb)
{
    /* Create a method builder */
    MethodBuilder mthdBldr = tb.DefineMethod( "TestOk", MethodAttributes.Public | MethodAttributes.Virtual,
      typeof(void), new Type[] {typeof(decimal).MakeByRefType() });

    ParameterBuilder paramBldr = mthdBldr.DefineParameter(1,  ParameterAttributes.In | ParameterAttributes.Out, "value");
    // generate IL
    ILGenerator ilgen = mthdBldr.GetILGenerator();

    /* Load argument to stack, and box the decimal value */
    ilgen.Emit(OpCodes.Ldarg, 1);

    ilgen.Emit(OpCodes.Dup);
    ilgen.Emit(OpCodes.Ldobj, typeof(decimal));
    ilgen.Emit(OpCodes.Box, typeof(decimal));

    /* Some things were done in here, invoking other method, etc */
    /* At the top of the stack we should have a boxed T or null */

    /* Copy reference values out */

    /* Skip unboxing if value in the stack is null */
    Label valIsNotNull = ilgen.DefineLabel();
    ilgen.Emit(OpCodes.Dup);

    /* This block works */
    ilgen.Emit(OpCodes.Brtrue, valIsNotNull);
    ilgen.Emit(OpCodes.Pop);
    ilgen.Emit(OpCodes.Pop);
    ilgen.Emit(OpCodes.Ret);
    /* End block */

    ilgen.MarkLabel(valIsNotNull);
    ilgen.Emit(OpCodes.Unbox_Any, typeof(decimal));

    /* Just clean the stack */
    ilgen.Emit(OpCodes.Pop);
    ilgen.Emit(OpCodes.Pop);
    ilgen.Emit(OpCodes.Ret);
}

The building for TestWrong() is nearly identical:

static void BuildMethodWrong(TypeBuilder tb)
{
    /* Create a method builder */
    MethodBuilder mthdBldr = tb.DefineMethod("TestWrong", MethodAttributes.Public | MethodAttributes.Virtual,
    typeof(void), new Type[] { typeof(decimal).MakeByRefType() });

    ParameterBuilder paramBldr = mthdBldr.DefineParameter(1,  ParameterAttributes.In | ParameterAttributes.Out, "value");

    // generate IL
    ILGenerator ilgen = mthdBldr.GetILGenerator();

    /* Load argument to stack, and box the decimal value */
    ilgen.Emit(OpCodes.Ldarg, 1);
    ilgen.Emit(OpCodes.Dup);
    ilgen.Emit(OpCodes.Ldobj, typeof(decimal));
    ilgen.Emit(OpCodes.Box, typeof(decimal));

    /* Some things were done in here, invoking other method, etc */
    /* At the top of the stack we should have a boxed decimal or null */

    /* Copy reference values out */

    /* Skip unboxing if value in the stack is null */
    Label valIsNull = ilgen.DefineLabel();
    ilgen.Emit(OpCodes.Dup);

    /* This block fails */
    ilgen.Emit(OpCodes.Brfalse, valIsNull);
    /* End block */

    ilgen.Emit(OpCodes.Unbox_Any, typeof(decimal));
    ilgen.MarkLabel(valIsNull);

    /* Just clean the stack */
    ilgen.Emit(OpCodes.Pop);
    ilgen.Emit(OpCodes.Pop);
    ilgen.Emit(OpCodes.Ret);
}

The only difference is I'm using BrFalse instead of BrTrue to check if the value in the stack is null.

Now, running the following code:

iTestDecimal testiface = (iTestDecimal)SimpleCodeGen.Create();

decimal dectest = 1;
testiface.TestOk(ref dectest);
Console.WriteLine(" Dectest: " + dectest.ToString());

The SimpleCodeGen.Create() is creating a new assembly and type, and calling the BuildMethodXX above to generate the code for TestOk and TestWrong. This works as expected: does nothing, value of dectest is not changed. However, running:

iTestDecimal testiface = (iTestDecimal)SimpleCodeGen.Create();

decimal dectest = 1;
testiface.TestWrong(ref dectest);
Console.WriteLine(" Dectest: " + dectest.ToString());

the value of dectest is corrupted (sometimes it gets a big value, sometimes it says "invalid decimal value", ...) , and the program crashes.

May this be a bug in the JIT, or am I doing something wrong?

Some hints:

In debugger, it happens only when "Suppress JIT optimizations" is disabled. If "Suppress JIT optimizations" is enabled, it works. This makes me think the problem must be in the JIT optimized code.
Running the same test on Mono 2.4.6 it works as expected, so this is something specific for Microsoft .NET.
Problem appears when using datetime or decimal types. Apparently, it works for int, or for reference types (for reference types, the generated code is not identical, but I'm omiting that case as it works).
I think this link, reported long time ago, might be related.
I've tried .NET framework v2.0, v3.0, v3.5 and v4, and behavior is exactly the same.

I'm omitting the rest of the code, creating the assembly and type. If you want the full code, just ask me.

Thanks very much!

Edit: I'm including the rest of the assembly and type creation code, for completion:

class SimpleCodeGen
{
    public static object Create()
    {
        Type proxyType;

        Guid guid = Guid.NewGuid();
        string assemblyName = "TestType" + guid.ToString();
        string moduleName = "TestType" + guid.ToString() + ".dll";
        string typeName = "TestType" + guid.ToString();

        /* Build the new type */
        AssemblyBuilder assBldr = BuildAssembly(typeof(iTestDecimal), assemblyName, moduleName, typeName);
        proxyType = assBldr.GetType(typeName);
        /* Create an instance */
        return Activator.CreateInstance(proxyType);
    }

    static AssemblyBuilder BuildAssembly(Type itf, string assemblyName, string moduleName, string typeName)
    {
        /* Create a new type */
        AssemblyName assName = new AssemblyName();
        assName.Name = assemblyName;
        assName.Version = itf.Assembly.GetName().Version;
        AssemblyBuilder assBldr = AppDomain.CurrentDomain.DefineDynamicAssembly(assName, AssemblyBuilderAccess.RunAndSave);
        ModuleBuilder modBldr = assBldr.DefineDynamicModule(assName.Name, moduleName);
        TypeBuilder typeBldr = modBldr.DefineType(typeName,
          TypeAttributes.Class | TypeAttributes.Sealed | TypeAttributes.Public, 
          typeof(object), new Type[] { itf });

        BuildConstructor(typeBldr, typeof(object));
        BuildMethodOk(typeBldr);
        BuildMethodWrong(typeBldr);
        typeBldr.CreateType();
        return assBldr;
    }

    private static void BuildConstructor(TypeBuilder typeBldr, Type baseType)
    {
        ConstructorBuilder ctorBldr = typeBldr.DefineConstructor(
          MethodAttributes.Public | MethodAttributes.SpecialName |
          MethodAttributes.RTSpecialName | MethodAttributes.HideBySig,
          CallingConventions.Standard,
          Type.EmptyTypes);

        ILGenerator ilgen = ctorBldr.GetILGenerator();
        //  Call the base constructor.
        ilgen.Emit(OpCodes.Ldarg_0);
        ConstructorInfo ctorInfo = baseType.GetConstructor(System.Type.EmptyTypes);
        ilgen.Emit(OpCodes.Call, ctorInfo);
        ilgen.Emit(OpCodes.Ret);
    }

    static void BuildMethodOk(TypeBuilder tb)
    {
        /* Code included in examples above */
    }

    static void BuildMethodWrong(TypeBuilder tb)
    {
        /* Code included in examples above */           
    }
}

Is it the intention to fall through to label valIsNull even if the vaue is not null? This is different in the other code. — Ingo, May 11 '11 at 08:39
Yes, it shouldn't matter. In the first code, if value is **not** null, I branch to valIsNotNull, so the unboxing is done. Finally, clean stack, and return (in the original code there was an stobj instead of a cleaning up, but it's not relevant here). If the branch is not taken, just clean the stack and return (code is duplicated) In the second code, BuildMethodTestWrong, the cleanup and return code is only once. If value is null, the branch is taken to the label valIsNull, and the unboxing is skipped. As far as I know it is not incorrect to fall through the label, right? Thanks very much. — Álvaro Iradier, May 11 '11 at 08:42
Have you run PEVerify on the generated assembly? Normally that will point out an error very quickly. — leppie, May 11 '11 at 10:42
Also, where are setting up the interface implementation when creating the method? — leppie, May 11 '11 at 10:43
@Alvaro - it is ok to join paths through the program, provided the stack looks the same. I was just mentioning it because you said the brNull instaed of brTrue was the only difference. In the first code, there is 1 pop and a ret, whereas in the 2nd you fall through to popping twice. — Ingo, May 11 '11 at 10:50
@leppie Writing the assembly to disk and running PEVerify the only error is "missing an assembly manifest". I've ommited the assembly and type creation, and adding it now. — Álvaro Iradier, May 11 '11 at 11:56
I can almost guarentee you that it is not a bug in the JIT; somewhere in your generated IL it is not doing something safe or verifiable. — thecoop, May 11 '11 at 12:04
Shouldn't the parameter be declared `ref` rather than `out`, since you're reading it & boxing the result at the start of the method? — thecoop, May 11 '11 at 12:18
@thecoop Thanks, I've tried using ref instead of out, it makes sense, but the result is the same again. By the way, what is the difference exactly between? As far as I know they are treated exactly the same in the generated IL code. The difference, I think, is the compiler forces you to initializa a ref parameter before calling (not necessary for an out parameter), and forces you to assign a value before leaving the method if it is an out parameter. Am I wrong? Thanks. — Álvaro Iradier, May 11 '11 at 12:25
@thecoop, @Álvaro Iradier: On CLR level there is no difference between C#'s out and ref. — leppie, May 11 '11 at 12:37
This indeed looks weird. Note that if you use Unbox instead of Unbox_Any, it will work. — Jb Evain, May 11 '11 at 13:44
@Álvaro: You wrote: "In debugger, it happens only when 'Suppress JIT optimizations' is disabled. If 'Suppress JIT optimizations' is enabled, it works. This makes me think the problem must be in the JIT optimized code." ... You mean the bug happens when JIT optimisations are off - is that the right way round? — codeulike, May 11 '11 at 14:02
Thanks @Jb Evain. With Unbox it just won't crash, but it won't work as expected in the full code (this is a minimalistic example). Unbox_Any will Unbox + LdObj (Just unboxing puts the address of the value in the stack, then LdObj takes the address and puts the value in the stack). It looks to me the stack is being corrupted or something with the BrFalse, so the Unbox_Any (equal to Unbox+LdObj) is breaking something. Just removing the Unbox_any will make this example not crash, but it won't work as expected in the full context. — Álvaro Iradier, May 11 '11 at 14:03
@codeulike the bug happens when JIT optimizations are **enabled**. If optimizations are suppressed (that's the option I check), then it works perfectly. It works on Mono too. So, I think it's something with the JIT optimization. — Álvaro Iradier, May 11 '11 at 14:04
@alvaro: Thanks, Ok yeah, the double negative was messing with my brain. — codeulike, May 11 '11 at 14:10
@ingo: "it is ok to join paths through the program, provided the stack looks the same." - you had the answer right there, it turns out : ) — codeulike, May 11 '11 at 14:25
@codeulike - Wel, if I ask a question I do it for a reason ... — Ingo, May 11 '11 at 14:29

kvb · Accepted Answer · 2011-05-11T14:17:10.503

7

Look at this part of your code:

ilgen.Emit(OpCodes.Dup);
ilgen.Emit(OpCodes.Brfalse, valIsNull);
ilgen.Emit(OpCodes.Unbox_Any, typeof(decimal));
ilgen.MarkLabel(valIsNull);

After the first line, the top of the stack will contain two object references. You then conditionally branch, removing one of the references. The next line unboxes the reference to a decimal value. So where you mark your label, the top of the stack is either an object reference (if the branch was taken) or a decimal value (if it wasn't). These stack states are not compatible.

EDIT

As you point out in your comment, your IL code following this would work if the stack state has a decimal on top or if it has an object reference on top, since it just pops the value off of the stack either way. However, what you're trying to do still won't work (by design): there needs to be a single stack state at each instruction. See section 1.8.1.3 (Merging stack states) of the ECMA CLI spec for more details.

edited May 11 '11 at 14:17

answered May 11 '11 at 14:06

kvb

54,864
2
91
133

1

What do you mean they are not compatible? In this minimalistic code, I just do 2 POPs from the stack to remove both references (if the branch was taken), or to remove the Value and the original reference if the branch was not taken. In the *real* code, I would use the reference and the value to do a STOBJ. Is the POP capable of popping either a decimal value or an object reference? – Álvaro Iradier May 11 '11 at 14:10
Looks like you're right on the track: [See this](http://drdobbs.com/184416570). "Correct IL must never allow a branch instruction to result in two or more incompatible stack-states, at any single instruction point. By "incompatible stack-state" we refer not only to the depth of the stack, but also to the types of the items on it." But shouldn't the JIT raise an InvalidProgramException in this case? – Álvaro Iradier May 11 '11 at 14:16
Adding a POP and a DUP just after the UNBOX_ANY and before the MarkLabel makes the example work, so you are right! Questiona are, first, shouldn't this complain instead of running and corrupting the data? Second, why does it work with optimization disabled? – Álvaro Iradier May 11 '11 at 14:20
1

I guess the JIT optimisations assume the stack is compatible in both cases and take some sort of shortcut, whereas the unoptimised version doesn't. – codeulike May 11 '11 at 14:21
2

@Álvaro - it would be nice if the CLR gave you better feedback indicating that your IL was invalid. However, in almost all cases the CLR will be running valid IL which was generated by a compiler, so verifying the IL would be a performance hit for little benefit (and you can always run PEVerify on a generated assembly if you want to be safe). If things are dramatically broken you might get an `InvalidProgramException`, but you can't rely on one being generated. – kvb May 11 '11 at 14:38
Is it even possible for a machine to verify that the stack is compatible for all paths? That sounds a bit like the 'halting problem'. – codeulike May 11 '11 at 14:40
1

Thanks @kvb, but in this case, writing the generated assembly to disk and running PEVerify gave no error (just missing manifest). – Álvaro Iradier May 11 '11 at 14:41

Possible bug in C# JIT optimizer?

1 Answers1