I was exploring the IL code of a simple program:

long x = 0;
for (long i = 0; i < int.MaxValue * 2L; i++)
{
    x = i;
}

Console.WriteLine(x);

I built this code in Release mode, and the following IL was generated:

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       28 (0x1c)
  .maxstack  2
  .locals init ([0] int64 x,
           [1] int64 i)
  IL_0000:  ldc.i4.0
  IL_0001:  conv.i8
  IL_0002:  stloc.0
  IL_0003:  ldc.i4.0
  IL_0004:  conv.i8
  IL_0005:  stloc.1
  IL_0006:  br.s       IL_000f
  IL_0008:  ldloc.1
  IL_0009:  stloc.0
  IL_000a:  ldloc.1
  IL_000b:  ldc.i4.1
  IL_000c:  conv.i8
  IL_000d:  add
  IL_000e:  stloc.1
  IL_000f:  ldloc.1
  IL_0010:  ldc.i4.s   -2
  IL_0012:  conv.u8
  IL_0013:  blt.s      IL_0008
  IL_0015:  ldloc.0
  IL_0016:  call       void [mscorlib]System.Console::WriteLine(int64)
  IL_001b:  ret
} // end of method Program::Main

I figured out pretty much all of the instructions except this one:

 IL_0010:  ldc.i4.s   -2

This instruction should push int.MaxValue * 2L onto the stack, and then blt.s compares it with i: if i is less than that value, execution jumps back to IL_0008. What I can't figure out is why it loads -2. If I change the loop as below:

for (long i = 0; i < int.MaxValue * 3L; i++)
{
    x = i;
}

It loads the expected value:

IL_0010:  ldc.i8     0x17ffffffd

So what is the meaning of -2 in this code?

DaveShaw
Selman Genç
    It is an optimization, taking 3 bytes of MSIL instead of 9. The -2 constant is an optimization itself, taking 1 byte instead of 4. Note how the 0 constant takes no space at all, covered by a dedicated opcode. – Hans Passant Nov 07 '14 at 13:46

2 Answers

int.MaxValue * 2L is a 64-bit value, but it is small enough to fit in 32 bits as an unsigned number (4,294,967,294, or 0xFFFFFFFE). So the compiler loads the 32-bit pattern 0xFFFFFFFE (which equals -2 when interpreted as an Int32) and then zero-extends it to 64 bits with conv.u8.

The reason it uses the signed form is that -2 fits into a single signed byte (-128 to 127), so the compiler can emit the short-form ldc.i4.s opcode, which loads a 32-bit value from a one-byte operand. Loading the 32-bit integer takes only 2 bytes, and converting it to a 64-bit value takes 1 more byte, for 3 bytes in total. That is far more compact than ldc.i8, which needs a 1-byte opcode followed by a full 8-byte operand (9 bytes).
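The same reinterpretation can be reproduced in plain C#. The following is only a minimal sketch: the class and method names (`LdcDemo`, `ZeroExtend`, `SignExtend`) are made up for illustration, and the casts merely mimic what `conv.u8` and `conv.i8` do to a 32-bit value. It is not the compiler's actual logic.

```csharp
using System;

static class LdcDemo
{
    // Mimics "ldc.i4.s -2" followed by "conv.u8":
    // the 32-bit pattern is treated as unsigned, so the upper 32 bits become zero.
    public static long ZeroExtend(int value)
    {
        return (long)unchecked((uint)value);
    }

    // Mimics "conv.i8" instead: the sign bit is copied into the upper 32 bits.
    public static long SignExtend(int value)
    {
        return (long)value;
    }
}
```

With these helpers, `LdcDemo.ZeroExtend(-2)` gives 4294967294, which is exactly `int.MaxValue * 2L`, whereas `LdcDemo.SignExtend(-2)` stays -2. That difference is why the compiler has to emit `conv.u8` rather than `conv.i8` after the short load.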

vgru
  • It probably takes this 'shortcut' to make JIT compile go faster. Not so much emphasis on making IL readable ;) Upvoted this answer as it is correct, short and concise. – AlexanderBrevig Nov 07 '14 at 13:35
  • @AlexanderBrevig: yes, loading a 32-bit constant + conv.u8 opcode takes less space than loading a 64-bit constant, probably that's the rationale. As @Hans wrote above, the `ldc.i4.s` opcode only needs a single sbyte (signed 8-bit int) parameter and extends it into a signed 32-bit value. – vgru Nov 07 '14 at 13:54

It looks like the compiler is using bitwise arithmetic to its advantage. It just so happens that the 32-bit two's complement representation of -2 has the same bit pattern as the unsigned 32-bit value of int.MaxValue * 2L.

In the bitwise representation:

-                                          1111 1111 1111 1111 1111 1111 1111 1110 (int)
-                                          1111 1111 1111 1111 1111 1111 1111 1110 (uint)
-  0000 0000 0000 0000 0000 0000 0000 0000 1111 1111 1111 1111 1111 1111 1111 1110 (long)
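These bit patterns can be checked from C# itself. The helper below is a small sketch with hypothetical names (`BitPatternDemo`, `Bits32`, `Bits64`); it relies on `Convert.ToString(value, 2)`, which returns the two's complement binary digits of the value, and pads the result to the full width.

```csharp
using System;

static class BitPatternDemo
{
    // Binary digits of a 32-bit value, padded to 32 characters.
    public static string Bits32(int value)
    {
        return Convert.ToString(value, 2).PadLeft(32, '0');
    }

    // Binary digits of a 64-bit value, padded to 64 characters.
    public static string Bits64(long value)
    {
        return Convert.ToString(value, 2).PadLeft(64, '0');
    }
}
```

Comparing `BitPatternDemo.Bits32(-2)` with `BitPatternDemo.Bits64(int.MaxValue * 2L)` shows that the low 32 digits are identical and the upper 32 digits of the long are all zero.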
Matt Murrell