
I am trying to marshal a hid_device_info struct in C#, but I can't figure out how to translate the wchar_t* strings to managed C# strings. I have tried every possible value of the MarshalAs attribute, but each one returned only the first character.

I have tried replacing all the wide strings with pointers so I can inspect them manually. This is the struct I have so far:

public struct HidDeviceInfo
{
    public IntPtr path; // This one marshals fine because it's just a regular char*
    public ushort vendor_id;
    public ushort product_id;
    public IntPtr serial_number; // wchar_t*
    public ushort release_number;
    public IntPtr manufacturer_string; // wchar_t*
    public IntPtr product_string; // wchar_t*
    public ushort usage_page;
    public ushort usage;
    public int interface_number;
    public IntPtr next;
}

When I manually iterate through one of the pointers (serial_number, for example), I can see that each character occupies 4 bytes (one ASCII byte followed by three zero bytes). I have tried all the Marshal.PtrToString... methods, but none of them retrieves the full string.

I suspect the strings are being decoded as 2-byte (UTF-16) characters, since I can't specify the character width anywhere in C#. The decoder would then read the zero bytes after the first character as the terminator, which is why it stops there. Knowing this, I could easily write my own string marshaler, but I feel like there must be an existing solution and I'm overlooking something obvious.
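To illustrate the suspicion above, here is a minimal sketch that reproduces the truncation without the native library. It builds a null-terminated buffer of 4-byte characters in unmanaged memory and reads it back with Marshal.PtrToStringUni. The names WideCharDemo and ReadAsUtf16 are made up for this example, and it assumes a little-endian machine.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class WideCharDemo
{
    // Hypothetical helper: lays out `text` the way a 4-byte wchar_t* would
    // look on a little-endian machine, then decodes it as UTF-16.
    public static string ReadAsUtf16(string text)
    {
        // Encoding.UTF32 is little-endian; GetBytes does not emit a BOM.
        byte[] bytes = Encoding.UTF32.GetBytes(text + "\0");
        IntPtr ptr = Marshal.AllocHGlobal(bytes.Length);
        try
        {
            Marshal.Copy(bytes, 0, ptr, bytes.Length);
            // PtrToStringUni reads 2-byte units until it finds 0x0000.
            // The upper half of the first 4-byte character is 0x0000,
            // so decoding stops after one character.
            return Marshal.PtrToStringUni(ptr);
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
    }
}
```

On a little-endian machine, ReadAsUtf16("AB") returns "A", which matches the behavior described above.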

The struct comes from a P/Invoked function whose result is passed to Marshal.PtrToStructure:

[DllImport(LibUsbName, CharSet = CharSet.Unicode)]
public static extern IntPtr hid_enumerate(ushort vendorId, ushort productId);

I've also tried all the possible CharSet values.

This can't be a character type mismatch, as it was in this question, because I've tried all possible combinations of different character types.

Lázár Zsolt
  • `CharSet.Unicode` is for UTF-16 (UCS-2), `CharSet.Ansi` is for those legacy 1-byte encodings (Windows-1252 etc.), and `CharSet.Auto` automatically chooses *between those two*. This is for interop with the Windows API `...A` and `...W` methods. As far as I know, there is nothing built-in for UCS-4 (or even UTF-8, for that matter). I'll be happy to be proven wrong, though. – Heinzi Oct 03 '20 at 17:40
  • The problem is that the actual size of wchar_t is platform-dependent. I don't know how to write something that is guaranteed to work on all platforms. – Lázár Zsolt Oct 03 '20 at 17:49
  • I guess a better question would be How do I find the size of wchar_t in C#? Is it somehow related to IntPtr.Size? – Lázár Zsolt Oct 03 '20 at 17:51
  • Which operating systems do you target? – Heinzi Oct 03 '20 at 18:30
  • I'm currently developing this on Linux, but it would be nice if it would also work on Windows. I wrote a method that works, but it assumes that the width is exactly 4 bytes and all characters are ASCII (it's fine for my use case). Gonna post it as answer in a second. – Lázár Zsolt Oct 03 '20 at 18:36

1 Answer


I ended up writing this method. It works fine for me, but only if all characters are ASCII and the character width is guaranteed to be 4 bytes.

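// Assumes every character is ASCII and wchar_t is exactly 4 bytes wide.
private static string ToUcs4String(this IntPtr ptr)
{
    var builder = new StringBuilder();
    var buffer = new byte[4];
    while (true)
    {
        // Read one 4-byte character.
        Marshal.Copy(ptr, buffer, 0, 4);
        // For ASCII input, a zero first byte can only be the terminator.
        if (buffer[0] == 0)
            break;
        builder.Append((char) buffer[0]);
        ptr += 4;
    }

    return builder.ToString();
}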
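If non-ASCII characters or Windows support matter, a more general variant might look like the sketch below. It is not a built-in .NET facility. It assumes that wchar_t is 2 bytes (UTF-16) on Windows and 4 bytes (UTF-32) everywhere else, which holds for the common toolchains but is not guaranteed. WideStringMarshal, WCharSize, and PtrToWideString are names invented for this example.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class WideStringMarshal
{
    // Assumption: wchar_t is 2 bytes on Windows and 4 bytes elsewhere.
    public static readonly int WCharSize =
        RuntimeInformation.IsOSPlatform(OSPlatform.Windows) ? 2 : 4;

    // UTF-32 decoder that matches the byte order of the current machine.
    private static readonly Encoding NativeUtf32 =
        new UTF32Encoding(bigEndian: !BitConverter.IsLittleEndian, byteOrderMark: false);

    public static string PtrToWideString(IntPtr ptr) => PtrToWideString(ptr, WCharSize);

    public static string PtrToWideString(IntPtr ptr, int wcharSize)
    {
        if (ptr == IntPtr.Zero)
            return null; // the native field was NULL

        if (wcharSize == 2)
            return Marshal.PtrToStringUni(ptr); // UTF-16 is handled natively

        if (wcharSize != 4)
            throw new ArgumentOutOfRangeException(nameof(wcharSize));

        // Count 4-byte units up to the all-zero terminator.
        int length = 0;
        while (Marshal.ReadInt32(ptr, length * 4) != 0)
            length++;

        // Copy the characters (without the terminator) and decode them.
        var bytes = new byte[length * 4];
        Marshal.Copy(ptr, bytes, 0, bytes.Length);
        return NativeUtf32.GetString(bytes);
    }
}
```

With this in place, a field such as serial_number could be read with WideStringMarshal.PtrToWideString(info.serial_number). Checking the whole 4-byte unit for the terminator, rather than only the first byte, avoids stopping early on code points whose low byte happens to be zero.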
Lázár Zsolt