I'm writing a program that passes a const char* from my C++ dll into my C# code as a string. Certain characters don't pass the way I intend, which interferes with processing the string afterward.

For example, "ß.\x3" in C++ becomes "ß®\x3" when it reaches my C# program. In another case, "(\x2\x2" becomes "Ȩ\x2". I believe this may be a marshaling issue, but am not entirely sure.

Below is some relevant code:

C++ code:

typedef void (__stdcall * OnPlayerTextMessageReceivedCallback)(const char* entityId, const char* textMessage);

void
ProcessTextMessage(
    const std::string& sender,
    const std::string& message
    )
{
    m_onPlayerTextMessageReceivedCallback(sender.c_str(), message.c_str());
}

C# code:

private delegate void OnPlayerTextMessageReceivedCallback(
            [MarshalAs(UnmanagedType.LPStr)] string senderEntityId,
            [MarshalAs(UnmanagedType.LPStr)] string message
            );

I tried marshaling the values with both LPStr and LPWStr, but I still run into the same issue.

I appreciate any insight on what's happening here.

Abbas Aryanpour
JonRicardo

1 Answer

The c_str() function just returns a plain pointer to the char data, so that is not the problem. I assume the two sides use different encodings. I would recommend using UTF-8 on both. For LPStr, the .NET marshaller on Windows converts the string from/to the default system code page (e.g. cp1252), not UTF-8. The most predictable approach is to skip the automatic .NET string marshalling: pass raw pointers and decode the bytes yourself.
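To check which encoding actually leaves the C++ side, it can help to look at the raw bytes. The following is a minimal C++ sketch, not part of the original code; `to_hex` is a hypothetical helper name. In UTF-8, "ß" (U+00DF) is the two bytes C3 9F. A reader that decodes those bytes as cp1252 sees the two characters "Ã" and "Ÿ" instead of one "ß".

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: render every byte of s as two uppercase hex digits,
// separated by spaces, so the encoding that is really in memory is visible.
std::string to_hex(const std::string& s)
{
    std::string out;
    char buf[4];
    for (unsigned char c : s)
    {
        if (!out.empty())
        {
            out += ' ';
        }
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(c));
        out += buf;
    }
    return out;
}
```

Logging `to_hex(message)` right before the callback is invoked shows what the C# side will receive: "C3 9F" for "ß" points to UTF-8, while a single "DF" points to Latin-1 or cp1252.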

Sample C# code (requires .NET Core 3.0 or later for the UTF-8 Marshal helpers):

using System;
using System.Runtime.InteropServices;

OnPlayerTextMessageReceivedCallback del = new Receiver().Receive;

// Emulate the C++ caller: allocate null-terminated UTF-8 buffers in unmanaged memory.
IntPtr senderPtr = "Hello".ToUtf8();
IntPtr messagePtr = "World".ToUtf8();
try
{
    del(senderPtr, messagePtr);
}
finally
{
    // The buffers are not garbage collected; free them explicitly.
    Marshal.FreeCoTaskMem(senderPtr);
    Marshal.FreeCoTaskMem(messagePtr);
}

// StdCall matches the __stdcall in the C++ typedef.
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate void OnPlayerTextMessageReceivedCallback(
    IntPtr senderEntityId,
    IntPtr message
);

class Receiver
{
    public void Receive(IntPtr senderEntityId, IntPtr message)
    {
        Console.WriteLine(senderEntityId.FromUtf8());
        Console.WriteLine(message.FromUtf8());
    }
}

public static class Utf8Util
{
    // Decodes a null-terminated UTF-8 buffer; returns null for IntPtr.Zero.
    public static string FromUtf8(this IntPtr p) => Marshal.PtrToStringUTF8(p);

    // Copies the string into a null-terminated UTF-8 buffer in unmanaged memory.
    // The caller must release it with Marshal.FreeCoTaskMem.
    public static IntPtr ToUtf8(this string s) => Marshal.StringToCoTaskMemUTF8(s);
}

In C++ you should use a single encoding for all strings, e.g. UTF-8, and not rely on a "default code page".

To do so, you can write in C++:

std::string myText = u8"This is a string in Utf8 encoding!";
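One caveat, as far as I know: since C++20 a u8 literal has type `const char8_t[]`, so the line above only compiles up to C++17. A minimal sketch that should build under both standards follows; the helper name `from_u8` is made up for illustration and is not a standard function.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copy a u8 literal into a std::string.
// The template accepts const char[N] (up to C++17) and const char8_t[N] (C++20).
template <typename CharT, std::size_t N>
std::string from_u8(const CharT (&literal)[N])
{
    static_assert(sizeof(CharT) == 1, "expects a u8 string literal");
    // N includes the terminating null, which std::string stores on its own.
    return std::string(reinterpret_cast<const char*>(literal), N - 1);
}
```

With this, `std::string myText = from_u8(u8"Stra\u00DFe");` holds the UTF-8 bytes of "Straße" regardless of the language standard or the source file encoding, because the character is written as a universal character name.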

External string data (files, user input, other APIs) should be converted to your internal encoding as soon as it enters the program.
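As an illustration of such a conversion, here is a minimal sketch that assumes the external data is ISO-8859-1 (Latin-1); that assumption and the function name `latin1_to_utf8` are mine, not from the question. Latin-1 is the simple case because every byte value equals its Unicode code point.

```cpp
#include <string>

// Assumption: the input is ISO-8859-1, where each byte is the code point
// U+0000..U+00FF. Other code pages need a real conversion table.
std::string latin1_to_utf8(const std::string& latin1)
{
    std::string utf8;
    utf8.reserve(latin1.size());
    for (unsigned char c : latin1)
    {
        if (c < 0x80)
        {
            // ASCII range: one byte, unchanged.
            utf8 += static_cast<char>(c);
        }
        else
        {
            // U+0080..U+00FF: two bytes, 110xxxxx 10xxxxxx.
            utf8 += static_cast<char>(0xC0 | (c >> 6));
            utf8 += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return utf8;
}
```

Note that cp1252 differs from Latin-1 in the range 0x80 to 0x9F, so this sketch is not a drop-in for Windows code pages. On Windows, a round trip through MultiByteToWideChar and WideCharToMultiByte with CP_UTF8 is the usual way to handle arbitrary code pages.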

Bernd