Semantics of bool fields in explicit layout types (ECMA-334)

Question

I'm trying to find where in ECMA-334 (C# language specification) the following behavior is defined. The source program is as follows.

using System;
using System.Runtime.InteropServices;

class Program
{
    static void Main(string[] args)
    {
        TestStruct a = new TestStruct();
        a.byteValue = 1;
        TestStruct b = new TestStruct();
        b.byteValue = 2;

        Console.WriteLine(string.Format("Result of {0}=={1} is {2}.",
            a.boolValue, b.boolValue, a.boolValue == b.boolValue));
        Console.WriteLine(string.Format("Result of {0}!={1} is {2}.",
            a.boolValue, b.boolValue, a.boolValue != b.boolValue));
        Console.WriteLine(string.Format("Result of {0}^{1} is {2}.",
            a.boolValue, b.boolValue, a.boolValue ^ b.boolValue));
    }
}

[StructLayout(LayoutKind.Explicit, Pack = 1)]
struct TestStruct
{
    // Both fields start at offset 0, so they share the same byte.
    [FieldOffset(0)]
    public bool boolValue;
    [FieldOffset(0)]
    public byte byteValue;
}

The result of execution is the following.

Result of True==True is False.
Result of True!=True is True.
Result of True^True is True.

This violates both §14.9.4 and §14.10.3, so I assume an exception stated elsewhere covers these cases. Note that this does not affect code using AND, OR, NAND, or NOR operations, but it can affect code using XOR or logical biconditional operations.

My guess would be that while the bit values for the two fields are "true enough" that ToString() would consider them "True", the actual == and != comparisons operate bitwise (since they're value types that should never have been mangled this way), so the comparisons are wonky. Why are you doing this? :) – dlev Jun 29 '12 at 18:52

score 1 · Accepted Answer · answered Jun 29 '12 at 18:50

I doubt that this is specified at all. By the time you're explicitly laying out your structs, you're very likely to go into architecture-specific and implementation-specific behaviour.

I strongly suspect that the behaviour you're seeing can all be explained by imagining that all bool operations are effectively converted into integer operations, and then (where necessary) converting the result by checking whether it's non-zero. Normally that's fine, so long as all bool values use the same value in memory (1 or 0), but in your case you're giving it an unexpected value (2). So although a.boolValue and b.boolValue are both true, a.boolValue ^ b.boolValue has the effect of XORing the two bytes involved, giving 3, which is still converted to true where necessary.
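
A minimal sketch of that mental model, using plain bytes rather than the overlaid struct. The names here (`BoolModel`, `IsTrue`, `EqualsAsBytes`, `XorAsBytes`) are made up for illustration, and the whole thing is only an assumption about how the operations behave, not a description of what the compiler or CLR actually emits:

```csharp
using System;

static class BoolModel
{
    // Hypothetical model: a bool reads as "true" whenever its byte is non-zero.
    public static bool IsTrue(byte raw) => raw != 0;

    // == modelled as a comparison of the raw bytes.
    public static bool EqualsAsBytes(byte a, byte b) => a == b;

    // ^ modelled as a bitwise XOR of the raw bytes, followed by a non-zero check.
    public static bool XorAsBytes(byte a, byte b) => IsTrue((byte)(a ^ b));

    static void Main()
    {
        byte a = 1, b = 2;   // the two byteValue assignments from the question

        Console.WriteLine(IsTrue(a));            // True
        Console.WriteLine(IsTrue(b));            // True
        Console.WriteLine(EqualsAsBytes(a, b));  // False: 1 and 2 are different bytes
        Console.WriteLine(XorAsBytes(a, b));     // True: 1 ^ 2 is 3, which is non-zero
    }
}
```

Under this model both operands print as True, yet == reports False and ^ reports True, which matches the output in the question.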

It's best to avoid this sort of code, IMO. Did you actually have a need for it, or were you just curious?

Jon Skeet

The language specification explicitly states (twice) that operators `!=` and `^` produce the same result for operands of type `bool`. Unless an exception to this rule is provided in the specification, the current implementation of the C# compiler is introducing implementation-specific behavior where the specification states clear semantics. Note that this does not involve unsafe code, so architecture and implementation should not be affecting the behavior. – Sam Harwell Jun 29 '12 at 19:22
@280Z28: If you made the values `int` and `float`, then set `float` to (say) 5.0 and printed out the `int` value, that would be architecture/implementation-specific, wouldn't it? Why do you think this is different? I think it's reasonable for there to be undefined behaviour when you're *deliberately* messing with memory like this. – Jon Skeet Jun 29 '12 at 19:25
@280Z28 This is almost certainly not the C# compiler's fault. The CLR itself handles those operations and, likely for the sake of efficiency, assumes you have not intentionally sabotaged the operands. It would be a complete waste of everybody's time to do anything else (at either the compiler or the CLR level). – dlev Jun 29 '12 at 19:34
@dlev C# is specified in ECMA-334. The program above is a conforming program (defined on pg. 3), as opposed to a program containing `__arglist` or other implementation-specific keywords. A conforming implementation of C# is not allowed to deviate from a strictly conforming implementation for this program. Until the C# compiler is changed or the situation is addressed in ECMA-334 Annex B (specifically B.2, B.3, or B.4), the Microsoft C# compiler is not a conforming implementation of the standardized C# language. – Sam Harwell Jun 29 '12 at 19:51
@280Z28: If this is going to bother you, I could come up with any number of other bits which aren't strictly conforming, e.g. where the CLR allows a reference conversion which the C# language doesn't... – Jon Skeet Jun 29 '12 at 19:53
@280Z28 Fair enough. This is already the case, though, in a number of places (intentionally so, I might add). See here, for an example: http://blogs.msdn.com/b/ericlippert/archive/2010/04/12/ignoring-parentheses.aspx In any case, given how you've corrupted the memory contents, it is almost impossible for the compiler to line up with the spec. To do so, it would need to emit CIL that said "compare these two variables as booleans, and don't do any of that sneaky bit comparison stuff, I need a semantic comparison." (cont'd) – dlev Jun 29 '12 at 19:55
At the moment, CIL doesn't contain such an instruction, so the compiler would need to emit instructions to normalize the booleans (possibly using an intermediate type), and then issue the comparison and use the result. But of course that would be stupid. I'm not sure why this bothers you so much; is this mostly for curiosity, or do you have an actual issue cropping up? – dlev Jun 29 '12 at 19:57
@dlev A program returning `(0)` (or more generally a constant expression which evaluates to 0) is not a valid program within a strictly conforming implementation. The Microsoft implementation can alter the behavior in this case while remaining a conforming implementation. – Sam Harwell Jun 29 '12 at 20:04
The issue appeared while working on an experimental AOT compiler+verifier for ECMA-335. – Sam Harwell Jun 29 '12 at 20:05
@280Z28 Interesting. I think it's just very hard to separate compiler behavior from CLR behavior, since managed code is run in the CLR. No matter what CIL the compiler generates, you can construct a CLR that runs the code in a non-conforming way. In this case, I might concede that this is a "bug" in the CLR which causes the compiler to be non-conforming. I just don't think there's a sensible way for *any* C# compiler to produce CIL that would make it conform in this case. – dlev Jun 29 '12 at 20:12
The easiest way to make everything conform is simply make a note of the issue in Annex B, B.4. – Sam Harwell Jun 29 '12 at 20:25
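
For reference, the normalization step dlev describes in the comments could look roughly like the sketch below, again on plain bytes. `NormalizedBool`, `Normalize`, `SemanticEquals` and `SemanticXor` are hypothetical names, and this only illustrates the idea of collapsing every non-zero value to 1 before operating; it is not what any C# compiler is known to emit:

```csharp
using System;

static class NormalizedBool
{
    // Hypothetical helper: collapse any non-zero byte to 1 and leave 0 as 0.
    public static byte Normalize(byte raw) => (byte)(raw != 0 ? 1 : 0);

    // Compare the normalized values instead of the raw bytes.
    public static bool SemanticEquals(byte a, byte b) => Normalize(a) == Normalize(b);

    // XOR the normalized values, then test for non-zero.
    public static bool SemanticXor(byte a, byte b) => (Normalize(a) ^ Normalize(b)) != 0;

    static void Main()
    {
        byte a = 1, b = 2;   // same raw values as in the question

        Console.WriteLine(SemanticEquals(a, b)); // True: both normalize to 1
        Console.WriteLine(SemanticXor(a, b));    // False: 1 ^ 1 is 0
    }
}
```

With the extra normalization, == and ^ give the results the specification describes for two true operands, at the cost of additional work on every bool operation.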