I’m building a Windows 10 universal application (phone + tablets) plus libraries. The solution contains a C++ DLL project that builds an unmanaged my.dll, which is called from C#. The DLL has an export like this:

// === C++ ===
typedef struct { int f1; uint32_t f2; } R;
// A and B are also structures.
MY_EXPORT R the_function( A *a, const B *b, const uint8_t *c );

// === C# ===
[DllImport( "my.dll", ExactSpelling = true, CallingConvention = CallingConvention.Cdecl )]
extern static R the_function(A a, B b, byte[] c);

[StructLayout( LayoutKind.Sequential )]
internal struct R
{
    public int f1;  // Actually an enum but it shouldn’t matter.
    public uint f2_id;
} 

internal struct A
{
    IntPtr nativePtr;
}

internal struct B
{
    IntPtr nativePtr;
}

The test app works on ARM and X64 platforms. It works on X86 if "Compile with .NET Native tool chain" is unchecked.

The unmanaged DLL crashes on X86 with an access violation if "Compile with .NET Native tool chain" is checked. I can reproduce this in both Debug and Release builds.

When using the debugger, I see there’s an error in how the arguments are passed. On the C# side, in some compiler-generated C# code there’s a line like this:

unsafe___value = global::McgInterop.my_PInvokes.the_function( a, b, unsafe_c );

In the debugger, I confirm the arguments are OK.

On the C++ side, the values are wrong: b holds what was passed in a, and c holds what was passed in b.

I tried to create a minimal example but failed: it works OK. my.dll exports 100+ __cdecl functions; it's a large cross-platform C++ SDK I'm porting to the Windows 10 platform, and the rest of the functions seem to work OK.

Any ideas what's happening here? Or at least how do I isolate the issue? Thanks in advance.

Update: OK here's a minimal repro.

Unmanaged code:

typedef struct
{
    int f1;
    DWORD f2;
} R;

R __cdecl nativeBug( int a, int b )
{
    CStringA str;
    str.Format( "Unmanaged DLL: a = %i, b = %i\n", a, b );
    ::OutputDebugStringA( str );
    R res
    {
        11, 12
    };
    return res;
}

C# store app:

[StructLayout( LayoutKind.Sequential )]
struct R
{
    public int f1;
    public uint f2;
}

[DllImport( "NativeBugDll.dll", ExactSpelling = true, CallingConvention = CallingConvention.Cdecl )]
extern static R nativeBug( int a, int b );

private void Page_Loaded( object sender, RoutedEventArgs e )
{
    App.Current.UnhandledException += app_UnhandledException;
    R r = nativeBug( 1, 2 );
    Debug.WriteLine( "Result: f1={0}, f2={1}", r.f1, r.f2 );
}

private void app_UnhandledException( object sender, UnhandledExceptionEventArgs e )
{
    Debug.WriteLine( "Unhandled exception: " + e.Message );
}

Debug output without .NET Native is fine:

Unmanaged DLL: a = 1, b = 2
Result: f1=11, f2=12

And here's debug output with .NET Native build:

Unmanaged DLL: a = 91484652, b = 1
Unhandled exception: Object reference not set to an instance of an object.
STATUS_STACK_BUFFER_OVERRUN encountered

Then Visual Studio hangs completely.

The X64 build works fine even with .NET Native.

Soonts
  • Show the C++ code (at least the function signature) – Ben Voigt Dec 09 '15 at 20:18
  • IIRC, x86 is the only one of the three platforms you mention where the rules for stdcall and cdecl are different. – Ben Voigt Dec 09 '15 at 20:19
  • Please see the update, added the header. – Soonts Dec 09 '15 at 20:39
  • You’re right about calling conventions: on X86 there are a dozen, on ARM and X64 just one. What puzzles me is that my X86 version works great unless the higher-level C# part is compiled by the .NET Native toolchain. BTW I’m working on an SDK that’s sold commercially, and by default VS2015 compiles release builds with .NET Native. I don’t expect paying customers will be happy they can’t compile with .NET Native… – Soonts Dec 09 '15 at 20:47
  • Yes I completely understand the reason net native needs to work. – Ben Voigt Dec 09 '15 at 20:47
  • Does the problem go away if you return just Int32 instead of a structure? – Ben Voigt Dec 09 '15 at 20:49
  • Haven’t tried. Going to test tomorrow, will tell the result.. – Soonts Dec 09 '15 at 20:51
  • @BenVoigt Looks like the problem is indeed returning that structure from a DLL function. – Soonts Dec 09 '15 at 23:09
  • That makes a certain amount of sense: in some calling conventions, complex return values are placed on the stack adjacent to parameters, and a size disagreement would cause off-by errors accessing the parameters, even though the trouble isn't caused by parameters. In other calling conventions, an extra hidden parameter is passed with an address where the function should write the return value. Again, parameters aren't at fault, but the existence of an extra auto-generated parameter can affect positioning of all the rest. – Ben Voigt Dec 10 '15 at 19:45
  • @BenVoigt if I replace cdecl with stdcall the problem’s still here. If I replace the structure with int64 (which is also 8 bytes long) the problem goes away. Looks like the issue is how the .NET native compiler handles those complex return values of the [DllImport]’ed functions. – Soonts Dec 10 '15 at 19:51
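
The int64 observation in the last comment suggests a possible stopgap on the native side. Below is a minimal sketch, assuming R is exactly 8 bytes. The wrapper names `nativeBug_packed`, `nativeBug_out` and `unpack` are hypothetical and not part of the original DLL. The packed variant mirrors what the comment reports as working; the out-pointer variant is an untested assumption that simply avoids a struct return value altogether.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Same layout as the struct in the question: two 32-bit fields, 8 bytes total.
struct R { int32_t f1; uint32_t f2; };
static_assert(sizeof(R) == sizeof(uint64_t), "R must be exactly 8 bytes to pack");

// Stand-in for the original function that returns the struct by value.
R nativeBug(int a, int b) {
    (void)a; (void)b;
    return R{11, 12u};
}

// Hypothetical wrapper 1: return the struct's bytes as one 64-bit integer.
extern "C" uint64_t nativeBug_packed(int a, int b) {
    const R r = nativeBug(a, b);
    uint64_t bits = 0;
    std::memcpy(&bits, &r, sizeof bits);  // byte copy, no aliasing issues
    return bits;
}

// Hypothetical wrapper 2: the caller supplies the result buffer explicitly,
// so the function itself returns nothing.
extern "C" void nativeBug_out(int a, int b, R* result) {
    assert(result != nullptr);
    *result = nativeBug(a, b);
}

// What the caller would do with the packed value: copy the bytes back into R.
R unpack(uint64_t bits) {
    R r;
    std::memcpy(&r, &bits, sizeof r);
    return r;
}
```

On the C# side the packed wrapper would presumably be declared as returning `ulong` and the out-pointer wrapper as taking `out R`; both declarations are assumptions and would need checking against a .NET Native X86 build.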

2 Answers

Yikes! Looks like there may be a bug in .NET Native. I've asked someone here at Microsoft to take a look. If you want to get hooked up with us internally feel free to mail us at dotnetnative@microsoft.com.

I'll update this as we know more.

EDIT: So there is definitely a real bug if a native function returns structs like this. The optimizer has ended up in a state where it pushes one extra argument to the stack after the two parameters, and that’s what’s causing the bug.

I've opened a bug and we'll get this fixed for Update 2 of VS.

C#:

[StructLayout(LayoutKind.Sequential)]
struct R
{
    public int f1;
    public int f2;
}

[DllImport("DllImport_NativeDll.dll")]
extern static R nativeBug(int a, int b);

public static void Run()
{
    R r = nativeBug(1, 2);
}

Native:

typedef struct
{
    int f1;
    int f2;
} R;

extern "C" __declspec(dllexport) R nativeBug(int a, int b)
{
    R res
    {
        11, 12
    };
    return res;
}

Code generated:

00f1766b 8b55fc          mov     edx,dword ptr [ebp-4]
00f1766e 52              push    edx
00f1766f 8b45f8          mov     eax,dword ptr [ebp-8]
00f17672 50              push    eax
00f17673 8d4ddc          lea     ecx,[ebp-24h]
00f17676 51              push    ecx <-- Bonus and unfortunate push
00f176ab ff1524b4d200    call    dword ptr [PInvokeAndCom!_imp__nativeBug (00d2b424)]
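
To make the effect of that extra push concrete, here is a toy model of the X86 stack in plain C++. It is only a sketch: `pushArgs`, `calleeView` and `Observed` are hypothetical names, the vector stands in for the stack slots above the return address, and the buffer address 91484652 is simply the value from the question's debug output. MSVC on X86 returns an 8-byte POD struct like R in EDX:EAX, so the native function reads exactly two stack slots and knows nothing about a third.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// What the native callee believes its two int parameters are.
struct Observed { uint32_t a; uint32_t b; };

// cdecl pushes arguments right to left, so the value pushed last ends up
// on top of the stack. Input: values in push order. Output: slots listed
// top of stack first, as the callee will read them.
std::vector<uint32_t> pushArgs(const std::vector<uint32_t>& pushOrder) {
    return std::vector<uint32_t>(pushOrder.rbegin(), pushOrder.rend());
}

// The callee was compiled to take (int a, int b) and nothing else:
// a is the top slot, b is the slot right below it.
Observed calleeView(const std::vector<uint32_t>& slotsTopFirst) {
    assert(slotsTopFirst.size() >= 2);
    return Observed{slotsTopFirst.at(0), slotsTopFirst.at(1)};
}

// Correct call: push b = 2, then a = 1.
Observed correctCall() {
    return calleeView(pushArgs({2u, 1u}));
}

// Buggy call: push b = 2, then a = 1, then the address of a return buffer
// that the callee never asked for.
Observed buggyCall() {
    return calleeView(pushArgs({2u, 1u, 91484652u}));
}
```

In this model `correctCall()` yields a = 1, b = 2, while `buggyCall()` yields a = 91484652, b = 1, which is the same pattern as the "Unmanaged DLL: a = 91484652, b = 1" line in the question.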
MattWhilden

MY_EXPORT R the_function( A *a, const B *b, const uint8_t *c );

The first two arguments contain the addresses of the structs.

extern static R the_function(A a, B b, byte[] c);

Here you are passing the structs by value. Given the code in the question, that's the only difference that I can see. To pass addresses of the structs change the C# to:

extern static R the_function(ref A a, ref B b, byte[] c);
David Heffernan
  • Those structures represent reference-counted objects with a C-style API. I want C# code to treat those pointers as opaque. That’s why in C# code I store their raw pointers in other structures. I could have used just the IntPtr type, but with those C# structures the compiler ensures type safety, e.g. I can’t pass an instance of B into an a_release(A a) call because it won’t compile. Please note my code compiles and works on ARM (all configurations), X64 (all configurations), and X86 (when .NET Native is turned off), that’s why I think I did that part correctly. – Soonts Dec 09 '15 at 21:38
  • Well, I wasn't able to guess those details. Why are you concealing so much? Do you want help? – David Heffernan Dec 09 '15 at 21:42
  • Sorry it looks like I’m concealing. It’s just a complex SDK I’m working on. My question is already two pages long, if I would tell all those technical details it would be much longer. – Soonts Dec 09 '15 at 21:45
  • Hmya, I suppose you were expected to know that you are looking at bogus code, can't work when jitted either. God forbid that he'll work on a minimal repro. – Hans Passant Dec 09 '15 at 21:59
  • @HansPassant I’m storing an unmanaged memory pointer in an IntPtr data type. The documentation for IntPtr says “A platform-specific type that is used to represent a pointer or a handle.” What’s bogus about that? Pretty much any unmanaged interop code does something similar. – Soonts Dec 09 '15 at 22:13
  • @Soonts Please just make a [mcve] and you'll get help. – David Heffernan Dec 09 '15 at 22:14
  • @DavidHeffernan When I copy-pasted the same C++ function prototype (with all those structures) into a test DLL project with the same the_function in its *.def file, and created a Store app that calls it in the same way, the .NET Native version worked OK. The issue could be threading (the real app is heavily multithreaded), or some subtle bug in the C++ code (developed by other people, I only do the Windows wrapper for the pre-existing SDK), or something else. That’s why I’m asking “how do I isolate the issue?” – Soonts Dec 09 '15 at 22:21
  • I'm sure you'll find a way. Good luck. – David Heffernan Dec 09 '15 at 22:28
  • @DavidHeffernan You were right, thanks.. Will update in a minute. – Soonts Dec 09 '15 at 22:46