9

Now filed on Microsoft Connect; please upvote if you feel it needs fixing. I've also simplified the test case a lot:

byte* data = (byte*) 0x76543210;
uint offset = 0x80000000;
byte* wrong = data + offset;
byte* correct = data + (uint) 0x80000000;

// "wrong" is now 0xFFFFFFFFF6543210 (!)
// "correct" is 0xF6543210

Looking at the IL, as far as I can tell, the C# compiler did everything right, and the bug lies in the JITter.


Original question: What is going on here?

byte* data = (byte*)Marshal.AllocHGlobal(0x100);

uint uioffset = 0xFFFF0000;
byte* uiptr1 = data + uioffset;
byte* uiptr2 = data + (uint)0xFFFF0000;

ulong uloffset = 0xFFFF0000;
byte* ulptr1 = data + uloffset;
byte* ulptr2 = data + (ulong)0xFFFF0000;

Action<string, ulong> dumpValue =
    (name, value) => Console.WriteLine("{0,8}: {1:x16}", name, value);

dumpValue("data",     (ulong)data);
dumpValue("uiptr1",   (ulong)uiptr1);
dumpValue("uiptr2",   (ulong)uiptr2);
dumpValue("ulptr1",   (ulong)ulptr1);
dumpValue("ulptr2",   (ulong)ulptr2);

This test requires a 64-bit OS targeting the x64 platform.

Output:

  data: 000000001c00a720    (original pointer)
uiptr1: 000000001bffa720    (pointer with a failed carry into the higher dword)
uiptr2: 000000011bffa720    (pointer with a correct carry into the higher dword)
ulptr1: 000000011bffa720    (pointer with a correct carry into the higher dword)
ulptr2: 000000011bffa720    (pointer with a correct carry into the higher dword)
               ^
               look here

So is this a bug or did I mess something up?

Roman Starkov
  • 59,298
  • 38
  • 251
  • 324
  • Are you compiling for x86 or x64 or AnyCPU? – user541686 Jan 04 '12 at 00:32
  • My guess is that you are since you are adding a pointer to a `uint` variable in the first case, the compiler (for some reason) chooses to do 32-bit arithmetic which results in a 32-bit value which is then promoted back to the pointer (truncated). In the second, the 32-bit immediate value gets promoted to a 64-bit value and the compiler does 64-bit arithmetic preserving the higher dword. – Jeff Mercado Jan 04 '12 at 00:33
  • @Mehrdad Compiling for x64, else `(ulong) ptr2` couldn't possibly spill into the 33rd bit. – Roman Starkov Jan 04 '12 at 00:35
  • Looking at the end of section 18.5.6 of the spec: `If a pointer arithmetic operation overflows the domain of the pointer type, the result is truncated in an implementation-defined fashion, but no exceptions are produced.` I'm not sure what they mean by pointer type, a pointer is a pointer. Perhaps it is relevant here? – Jeff Mercado Jan 04 '12 at 00:42
  • @JeffMercado Maybe... to say for sure, we need to know what they mean by "overflows". I'm not convinced this counts as an overflow - after all, a `ulong` + `uint` does not result in an overflow unless the `ulong` overflows. – Roman Starkov Jan 04 '12 at 00:45
  • @romkyns: If you don't mind, I modified your test case a bit to include the explicit 64-bit arithmetic. I'm thinking bug in the compiler or the spec's wording. – Jeff Mercado Jan 04 '12 at 01:11
  • This is related to an architectural limitation of x64. Pointer offsets cannot be larger than 2^31-1. The Connect article's premise is weak, you can never have a chunk of memory larger than 2GB. Clearly not in .NET but not even malloc() in C lets you create a larger chunk. – Hans Passant Jan 04 '12 at 10:15

2 Answers2

4

I think you are encountering this C# compiler bug: https://connect.microsoft.com/VisualStudio/feedback/details/675205/c-compiler-performs-sign-extension-during-unsigned-pointer-arithmetic

Which was filed as a result of this question: 64-bit pointer arithmetic in C#, Check for arithmetic overflow changes behavior

Community
  • 1
  • 1
Ben Voigt
  • 277,958
  • 43
  • 419
  • 720
  • [Filed here](https://connect.microsoft.com/VisualStudio/feedback/details/716436/addition-of-a-byte-and-a-uint-results-in-the-wrong-pointer-value-on-x64) with a much simpler repro case. Curious, it looks very much related to what you link to, but as far as I can tell, the MSIL is _correct_ and so must be JITted wrongly. – Roman Starkov Jan 04 '12 at 02:32
  • @romkyns, did you read Grant's exposition on that other question? The MSIL's wrong, it causes sign-extension, since it doesn't use the `add.un` instruction. Also, please tell us whether you're using the `/checked` compile option or not. – Ben Voigt Jan 04 '12 at 02:43
  • This code was compiled without `/checked`, and there are no explicit directives in the code to change this. You're right, it's actually the same bug. Also, for the benefit of future readers of these comments: `add.un` is an imaginary instruction; the missing combination of (un)checked & (un)signed. – Roman Starkov Jan 04 '12 at 10:57
3

(Answer under construction)

I checked the emitted x64 asm and these are my observations:

Base pointer:

data:
00000000024539E0

Pointer with correct carry:

data + (uint)0xFFFF0000:
00000001024439E0

Disassembly of the instructions:

    byte* ptr2 = data + ((uint)0xFFFF0000); // redundant cast to be extra sure
00000084  mov         ecx,0FFFF0000h 
00000089  mov         rax,qword ptr [rsp+20h] 
0000008e  add         rax,rcx 
00000091  mov         qword ptr [rsp+38h],rax 

Pointer with incorrect carry:

data + offset:
00000000024439E0

Disassembly of the instructions:

    uint offset = 0xFFFF0000;
0000006a  mov         dword ptr [rsp+28h],0FFFF0000h 
    byte* ptr1 = data + offset;
00000072  movsxd      rcx,dword ptr [rsp+28h] ; (1)
00000077  mov         rax,qword ptr [rsp+20h] 
0000007c  add         rax,rcx 
0000007f  mov         qword ptr [rsp+30h],rax 

The instruction (1) converts an unsigned int32 into a signed long with sign extension (bug or feature?). Therefore rcx contains 0xFFFFFFFFFFFF0000, while it should contain 0x00000000FFFF0000 for the addition to work properly.

And according to 64 bit arithmetic:

0xFFFFFFFFFFFF0000 +
0x00000000024539E0 =
0x00000000024439E0

The add overflows indeed.

I don't know if this is a bug or intended behavior, I'm going to check SSCLI before trying to give any conclusion. EDIT: See Ben Voigt's answer.

user703016
  • 37,307
  • 8
  • 87
  • 112