2

I'm writing a small zlib wrapper via P/Invoke calls. It runs perfectly on a 64-bit target (64-bit C# build, 64-bit DLL), but throws an AccessViolationException on a 32-bit target (32-bit C# build, 32-bit DLL).

Here's the C# signature and code which throws the exception:

[DllImport(Program.UnmanagedDll, CallingConvention = CallingConvention.Cdecl)]
private static extern ZLibResult ZLibDecompress(byte[] inStream, uint inLength, byte[] outStream, ref uint outLength);

internal enum ZLibResult : byte {
        Success = 0,
        Failure = 1,
        InvalidLevel = 2,
        InputTooShort = 3
}

internal static ZLibResult Decompress(byte[] compressed, out byte[] data, uint dataLength) {
    var len = (uint) compressed.Length;
    fixed (byte* c = compressed) {
        var buffer = new byte[dataLength];
        ZLibResult result;
        fixed (byte* b = buffer) {
            result = ZLibDecompress(c, len, b, &dataLength);
        }
        if(result == ZLibResult.Success) {
            data = buffer;
            return result;
        }
        data = null;
        return result;
    }
}

And here's the C code (compiled with MinGW-w64):

#include <stdint.h>
#include "zlib.h"

#define ZLibCompressSuccess         0
#define ZLibCompressFailure         1

__cdecl __declspec(dllexport) uint8_t ZLibDecompress(uint8_t* inStream, uint32_t inLength,
                                                     uint8_t* outStream, uint32_t* outLength)
{
    uLongf oL = (uLongf)*outLength;
    int result = uncompress(outStream, &oL, inStream, inLength);
    *outLength = (uint32_t)oL;
    if(result == Z_OK)
        return ZLibCompressSuccess;
    return ZLibCompressFailure;
}

I've looked over everything and can't figure out why an access violation would be happening on a 32-bit build and not on a 64-bit build. ZLibDecompress works fine decompressing the same stream when called from a C app, but throws an access violation when called from my C# app.

Does anyone know why this could be happening?

EDIT: Updated my code, still getting an access violation on 32-bit builds, but not 64-bit.

C# Code:

[DllImport(Program.UnmanagedDll, CallingConvention = CallingConvention.Cdecl)]
private static extern ZLibResult ZLibDecompress(
    [MarshalAs(UnmanagedType.LPArray)]byte[] inStream, uint inLength,
    [MarshalAs(UnmanagedType.LPArray)]byte[] outStream, ref uint outLength);

internal static ZLibResult Decompress(byte[] compressed, out byte[] data, uint dataLength) {
    var buffer = new byte[dataLength];
    var result = ZLibDecompress(compressed, (uint)compressed.Length, buffer, ref dataLength);
    if(result == ZLibResult.Success) {
        data = buffer;
        return result;
    }
    data = null;
    return result;
}

C Code:

__declspec(dllexport) uint8_t __cdecl ZLibDecompress(uint8_t* inStream, uint32_t inLength,
                                 uint8_t* outStream, uint32_t* outLength) {
    uLongf oL = (uLongf)*outLength;
    int result = uncompress(outStream, &oL, inStream, inLength);
    *outLength = (uint32_t)oL;
    if(result == Z_OK)
        return ZLibCompressSuccess;
    return ZLibCompressFailure;
}
Caleb Joseph
  • 418
  • 2
  • 9
  • Are you using 32-bit MinGW or are you trying to cross-compile? Have you tried using your library from C code? – Mario Jun 30 '12 at 08:04
  • I'm building a 32-bit DLL using MinGW-w64, and yes, the functions works fine without heap corruption when called in a C program. – Caleb Joseph Jun 30 '12 at 08:21
  • Can't see any issue right now. Have you tried using 32-bit MinGW? Although I wouldn't expect any difference. You're missing the `unsafe` keyword, but other than that... you might actually be able to skip the whole pointer stuff and just pass the arrays I think (making `unsafe` obsolete) - might exclude another possible error source. – Mario Jun 30 '12 at 08:40
  • I am using a 32-bit MinGW, the MinGW-w64 project offers both 32-bit and 64-bit compilers. Also the class containing Decompress() is unsafe, so that couldn't be the issue. I'd prefer not to use .NET arrays as it adds an unnecessary marshalling overhead. As I said, it works with a 64-bit build but not 32-bit, which is very strange considering I'm not using any platform-specific data types. – Caleb Joseph Jun 30 '12 at 08:47

3 Answers3

4
    fixed (byte* b = buffer) {
        result = ZLibDecompress(c, len, b, &dataLength);
    }

No, that can't work. The fixed keyword provides a highly optimized way to ensure that the garbage collector moving objects doesn't cause trouble. It doesn't do it by pinning the object (like the documentation says), it does it by exposing the b variable to the garbage collector. Which then sees it referencing the buffer and updates the value of b when it moves buffer.

That however can't work in this case, a copy of the b value was passed to ZlibDecompress(). The garbage collector cannot update that copy. The outcome will be poor when a GC occurs while ZLibDecompress() is running, the native code will destroy the integrity of the garbage collected heap and that will eventually cause an AV.

You cannot use fixed, you must use GCHandle.Alloc() to pin the buffer.

But don't do that either, you are helping too much. The pinvoke marshaller is already very good at pinning objects when necessary. Declare the instream and outstream arguments as byte[] instead of byte*. And pass the arrays directly without doing anything special. Also, the outlength argument should be declared ref int.

Hans Passant
  • 922,412
  • 146
  • 1,693
  • 2,536
  • Thanks for the insight, I'll remember that fixed doesn't pin the object. I've tried passing the arrays directly, and using GCHandle.Alloc(), but I'm still getting a 32-bit specific access violation. Again, the native function works fine when called from a C program, I am very confused as to why this isn't working now. – Caleb Joseph Jun 30 '12 at 10:20
  • Pay attention to the last paragraph in the post. – Hans Passant Jun 30 '12 at 10:24
  • You removed CallingConvention, don't. – Hans Passant Jun 30 '12 at 10:27
  • Added it back, same result. Shouldn't StdCall be assumed by default anyway? – Caleb Joseph Jun 30 '12 at 10:31
  • You added the wrong CallingConvention, it is Cdecl. Have you considered using somebody else's library to get this done? SharpZipLib and DotNetZip can do this too. – Hans Passant Jun 30 '12 at 10:43
  • Same story with Cdecl. I know those libraries exist, but I cannot comprehend why something as simple as a P/Invoke wrapper is only working with 64-bit targets. – Caleb Joseph Jun 30 '12 at 10:58
1

In 64-bit there's only one ABI for Windows (no cdecl/stdcall), so the problem for 32-bit seems to be with the calling conventions. Your parameter pointers are going into wrong registers and the native function accesses the wrong memory region.

To resolve the issue:

  1. Try commenting out the lines in the native function (see if it crashes - it yes, it's not the calling convention)

  2. Try playing with the calling conventions "cdecl/stdcall"

  3. To check everything, try dumping the pointer values and see if they coincide in native/managed functions.

EDIT:

Then it is a problem with pointers. You are allocating the arrays in C# (thus they reside in a managed heap). You have to marshal them using the "[MarshalAs(UnmanagedType.LPArray)]" attribute.

[DllImport(Program.UnmanagedDll, CallingConvention = CallingConvention.Cdecl)]
private static extern ZLibResult ZLibDecompress(
     [MarshalAs(UnmanagedType.LPArray)] byte[] inStream,
     uint inLength,
     [MarshalAs(UnmanagedType.LPArray)] byte[] outStream,
     ref UInt32 outLength);

The [In,Out] modifier might be of help also.

And yes, as Hans says, pin the pointers and do not allow them to be garbage-collected.

byte[] theStream = new byte[whateveyouneed];
// Pin down the byte array
GCHandle handle = GCHandle.Alloc(theStream, GCHandleType.Pinned); 
IntPtr address = handle.AddrOfPinnedObject();

and then pass it as IntPtr.

Viktor Latypov
  • 14,289
  • 3
  • 40
  • 55
  • Thanks, never thought of using stdcall. Is there anyway to disable name mangling of exported __stdcall functions? I know the Windows API uses stdcall and doesn't have mangled function names. – Caleb Joseph Jun 30 '12 at 09:07
  • Name mangling is only relevant for the C++ code. extern "C" before the function's name disables the mangling. But I think you know that. – Viktor Latypov Jun 30 '12 at 09:15
  • Nevermind about the name mangling, fixed it by passing --kill-at to the MinGW linker. I'm still getting an access violation with either stdcall or cdecl, the pointers for inStream and outStream are exactly the same in managed and native code, and commenting out the line that calls uncompress() in the C code stops the crash. Any further ideas? – Caleb Joseph Jun 30 '12 at 09:26
  • Then I suppose we should check the marshalling attributes. Looks like the byte* points to the managed heap. See my edit. – Viktor Latypov Jun 30 '12 at 09:46
  • Tried both of those examples, and they both still give an access violation on 32-bit. Should I just give up? – Caleb Joseph Jun 30 '12 at 10:36
0

The actual issue was caused by MinGW-w64 generating a buggy DLL. I had been passing -ftree-vectorize to gcc when building zlib, which was generating code that the 32-bit CLR didn't like. The code ran fine after using less aggressive optimization options.

Caleb Joseph
  • 418
  • 2
  • 9