
When adding an int32 to a 64-bit native int, does the CLR sign-extend or zero-extend the 32-bit integer? And most importantly: based on what information does it make this choice?


I am writing a .NET compiler and have read the ECMA specification thoroughly, but could not find an answer.

The CLI supports only a subset of these types in its operations upon values stored on its evaluation stack: int32, int64, and native int.
-- ECMA 335, Section I 12.1: Supported data types

Since the values on the evaluation stack carry no information about their signedness, instructions for which the signedness of the operands matters have two variants: one for signed and one for unsigned integers. The add, sub and mul instructions (those that don't check for overflow) don't need to care about the signedness of the operands as long as the operands are the same size, and therefore have only a single variant. However, the operands are not always the same size...

ECMA 335, Section III 1.5: Operand type table states that an int32 and a native int can be added, subtracted, multiplied and divided. The result is again a native int. On a 64-bit system, a native int is 64 bits wide.

ldc.i4.0            // Load int32 0
conv.i              // Convert to (64-bit) native int
ldc.i4.m1           // Load int32 -1
add                 // Add native int 0 and int32 0xFFFFFFFF together

So what would be the result here? Note that, according to the specification, the runtime does not need to track the exact types or the signedness of the values on the stack: it knows only int32, int64 and native int (and some others that are not relevant here).


I would imagine that IntPtr and UIntPtr arithmetic, since these types are internally represented as native ints, would also use this kind of addition. However, ILSpy shows that adding an IntPtr and an Int32 in C# calls the overloaded + operator on the IntPtr struct, which accepts only a signed Int32 argument.

Doing it directly in CIL (using the add instruction) also indicates that the integer is interpreted as being signed. It should also have been implemented in Mono, but I could not find any references to back my findings up.

Daniel A.A. Pelsmaeker
  • The sign for the promoted value won't be caught. See this information for possible solutions (it's for Mac but that shouldn't matter in this case): https://developer.apple.com/library/mac/#documentation/Darwin/Conceptual/64bitPorting/MakingCode64-BitClean/MakingCode64-BitClean.html –  Jan 07 '13 at 08:50

2 Answers


The signedness does not matter when adding two values of the same bit size. For example, adding 32-bit -10 (0xfffffff6) to 32-bit 10 (0x0000000a) will correctly yield 0. Because of that, there is only one add instruction in the CIL (Common Intermediate Language).
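
As a minimal sketch of why a single instruction suffices for equal sizes, the following Python models 32-bit addition on raw bit patterns. The helper names `add32` and `as_signed32` are made up for this illustration and are not part of the CLI. The same bits come out whether the operands are read as signed or unsigned.

```python
MASK32 = 0xFFFFFFFF

def add32(a_bits: int, b_bits: int) -> int:
    """Add two 32-bit patterns, discarding the carry out of bit 31."""
    return (a_bits + b_bits) & MASK32

def as_signed32(bits: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    return bits - (1 << 32) if bits & 0x80000000 else bits

minus_ten = -10 & MASK32        # bit pattern 0xFFFFFFF6
result = add32(minus_ten, 10)   # bit pattern 0x00000000

# One bit pattern, two readings: -7 as signed, 4294967289 as unsigned.
other = add32(minus_ten, 3)
print(hex(result), as_signed32(other), other)
```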

However, when adding two values of differing bitsizes, then the signedness does matter. For example, adding 32-bit -10 to 64-bit 10 can result in 4294967296 (0x100000000) when done unsigned, and 0 when signed.
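
The two outcomes can be reproduced with a small Python model of widening a 32-bit pattern to 64 bits. This is only a sketch; `zero_extend_32_to_64`, `sign_extend_32_to_64` and `add64` are illustrative names, not runtime APIs.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def zero_extend_32_to_64(bits32: int) -> int:
    """Widen by filling the upper 32 bits with zeros (unsigned reading)."""
    return bits32 & MASK32

def sign_extend_32_to_64(bits32: int) -> int:
    """Widen by copying bit 31 into the upper 32 bits (signed reading)."""
    bits32 &= MASK32
    if bits32 & 0x80000000:
        return bits32 | 0xFFFFFFFF00000000
    return bits32

def add64(a_bits: int, b_bits: int) -> int:
    """Add two 64-bit patterns, discarding the carry out of bit 63."""
    return (a_bits + b_bits) & MASK64

minus_ten_32 = -10 & MASK32  # 0xFFFFFFF6

unsigned_result = add64(10, zero_extend_32_to_64(minus_ten_32))
signed_result = add64(10, sign_extend_32_to_64(minus_ten_32))

print(hex(unsigned_result))  # 0x100000000
print(hex(signed_result))    # 0x0
```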

The CIL add instruction allows adding a native integer and a 32-bit integer. The native integer may be 64-bit (on a 64-bit system). Testing reveals that add treats the 32-bit integer as a signed integer, and sign-extends it. This is not always correct and may be considered a bug. Microsoft is currently not going to fix it.

Because overflow checking depends on whether the operands are treated as being unsigned or signed, there are two variants of add.ovf: add.ovf (signed) and add.ovf.un (unsigned). However, these variants also correctly sign-extend or zero-extend the smaller operand when adding a 32-bit integer to a native integer.

So adding a native integer and an unsigned 32-bit integer may yield different results depending on the overflow checking setting of C#. Apparently, my inability to figure this out stems from a bug or oversight in the CIL language design.
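
The discrepancy can be illustrated with a rough Python model of the two checked variants as this answer describes them, assuming a 64-bit native int. The functions `add_ovf` and `add_ovf_un` below only imitate the described behaviour; they are not the CLR's implementation. The operands are those from the question: native int 0 plus the int32 bit pattern 0xFFFFFFFF.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed(bits: int, width: int) -> int:
    """Interpret the low `width` bits as a two's-complement signed integer."""
    bits &= (1 << width) - 1
    return bits - (1 << width) if bits >> (width - 1) else bits

def add_ovf(native_bits: int, int32_bits: int) -> int:
    """Model of the signed variant: sign-extend, then check the signed range."""
    total = to_signed(native_bits, 64) + to_signed(int32_bits, 32)
    if not -(1 << 63) <= total < (1 << 63):
        raise OverflowError("signed overflow")
    return total & MASK64

def add_ovf_un(native_bits: int, int32_bits: int) -> int:
    """Model of the unsigned variant: zero-extend, then check the unsigned range."""
    total = (native_bits & MASK64) + (int32_bits & MASK32)
    if total > MASK64:
        raise OverflowError("unsigned overflow")
    return total

signed_result = add_ovf(0, 0xFFFFFFFF)       # sign-extended operand
unsigned_result = add_ovf_un(0, 0xFFFFFFFF)  # zero-extended operand

print(hex(signed_result))    # 0xffffffffffffffff
print(hex(unsigned_result))  # 0xffffffff
```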

Daniel A.A. Pelsmaeker

You are in uncharted territory here; I don't know of any .NET language that actually permits this. Their syntax checkers reject any code that tries to do this. Even adding two native ints is rejected. Ultimately it is up to the jitter to generate the machine code for it. If you want to know what happens, just experiment. Be sure to test at least the x86 and x64 jitters.

Given the iffy semantics and the very real possibility that a future jitter change may break your assumptions, I would strongly recommend that you also reject this in your own language. It just isn't very useful, and a simple workaround that casts the operand to (long) and the result back to (IntPtr) has well-defined semantics, which in itself is a way to get predictable behavior in your own code generator.
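
One way a code generator might apply this workaround is to emit an explicit widening conversion for each operand before the add, so the mixed-size case never reaches the jitter. The Python sketch below is hypothetical. The function `emit_native_plus_int32` and its argument layout (native int in argument 0, int32 in argument 1) are assumptions made for illustration, and it treats the native int as signed, as IntPtr is. The CIL opcodes it emits (conv.i8, conv.u8, conv.i) are standard.

```python
from typing import List

def emit_native_plus_int32(int32_is_signed: bool) -> List[str]:
    """Return CIL for `native int + int32` with every widening made explicit.

    Hypothetical helper: assumes the native int is argument 0 and the
    int32 is argument 1 of the method being compiled.
    """
    # conv.i8 sign-extends an int32 to int64; conv.u8 zero-extends it.
    widen_int32 = "conv.i8" if int32_is_signed else "conv.u8"
    return [
        "ldarg.0",    # load the native int
        "conv.i8",    # widen it to int64 (treated as signed here)
        "ldarg.1",    # load the int32
        widen_int32,  # widen according to the source language's type
        "add",        # both operands are now int64
        "conv.i",     # narrow the result back to native int
    ]

for instruction in emit_native_plus_int32(int32_is_signed=False):
    print(instruction)
```

With this approach the source language's static type of the 32-bit operand, not the jitter, decides which extension is used.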

Hans Passant
  • I added some info to the post: C# uses the overloaded addition operator when adding `IntPtr` to `Int32`. In CIL, the `add` instruction seems to interpret the `Int32` as signed. However, I don't know why. – Daniel A.A. Pelsmaeker Jan 07 '13 at 12:54