I want to use a library function that accepts a UTF-16LE string as a (const char16_t* str, size_t length) parameter pair. The length only needs to be provided for strings that are not null-terminated. The function copies the string and works on the copy.

.NET strings are held in memory as UTF-16. Assuming the code units are stored in little-endian order, it should be possible to pin the string, get the internal pointer, and pass that to the library function without an extra copy, since the library copies the data anyway.
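The little-endian assumption can be checked on the host rather than taken on faith. A minimal sketch in standard C++, assuming that the in-memory code units follow the host's native byte order; the helper name `hostIsLittleEndian` is hypothetical and not part of any library:

```cpp
#include <cstring>

// Hypothetical helper: reports whether a 16-bit code unit is laid out
// low byte first on this host.
inline bool hostIsLittleEndian()
{
  const char16_t probe = 0x0041;             // U+0041 'A'
  unsigned char bytes[sizeof probe];
  std::memcpy(bytes, &probe, sizeof probe);  // inspect the raw bytes
  return bytes[0] == 0x41 && bytes[1] == 0x00;
}
```

If this returns false, the buffer would not be UTF-16LE and could not be handed over without conversion.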

Is this possible, and under what circumstances would it be worth it performance-wise (pinning an object isn't free, I believe)?

(This is in the innermost loop of an app that basically copies tons of strings from A to B, and that copying part is the current bottleneck, so it would make sense to speed this up).

Edit: This question is almost a duplicate of another one, which is why I temporarily deleted it. It is not quite a duplicate, though: that other question asks about accessing the data as const wchar_t*, whereas I need const char16_t*.

Evgeniy Berezovsky
  • Do you want to call a function from c++ .dll from .Net code by passing a string to native dll? – Hasan Emrah Süngü Jan 30 '19 at 06:13
  • Possible duplicate of [Is it possible to get a pointer to String^'s internal array in C++/CLI?](https://stackoverflow.com/questions/3046137/is-it-possible-to-get-a-pointer-to-strings-internal-array-in-c-cli) – xMRi Jan 30 '19 at 10:08
  • @xMRi Almost. I had this question deleted, but then decided to undelete it, because your link converts to `const wchar_t*`, which means, we are not there yet, without a `reinterpret_cast`. – Evgeniy Berezovsky Jan 31 '19 at 02:21
  • As the data is stored internally as wchar_t, there is no way to use it as a normal char* array without copying it! You can use a char*, but then you always have 2 bytes for each char... – xMRi Jan 31 '19 at 06:59
  • @xMRi Have you seen my answer? I'm actually using it that way. – Evgeniy Berezovsky Feb 01 '19 at 01:22
  • @EugeneBeresovsky This is no char* string. It is a unicode string interpreted as a char*. You can't pass this pointer to a function that uses "normal" char*! – xMRi Feb 01 '19 at 09:28
  • @xMRi I'm not sure where you get the idea I want to use a UTF16-LE string as a "normal" char* - I consistently refer to `char16_t*`. – Evgeniy Berezovsky Feb 03 '19 at 21:53

1 Answer

Yes, it can be done.

PtrToStringChars allows us to access the internal const wchar_t* without copying. All that's necessary now is a reinterpret_cast to const char16_t*:

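#include <vcclr.h>  // declares PtrToStringChars

void useStringInCLib(System::String^ s)
{
  // Pin the string so the GC cannot move it while native code reads it.
  // The pin lasts until wch goes out of scope.
  pin_ptr<const wchar_t> wch = PtrToStringChars(s);
  // wchar_t is 16 bits wide under MSVC, so the buffer can be viewed as char16_t.
  const char16_t *chars = reinterpret_cast<const char16_t*>(wch);
  // Length counts UTF-16 code units, not bytes (bytes would be 2 * charLen).
  size_t charLen = static_cast<size_t>(s->Length);
  use_it(chars, charLen);  // use_it stands for the library function
}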
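The cast relies on wchar_t and char16_t having the same width. That holds with MSVC, where wchar_t is 16 bits, but not on every platform. A minimal sketch in standard C++ that makes the assumption explicit; the names `kWcharMatchesChar16` and `viewAsChar16` are hypothetical helpers, not part of any library:

```cpp
#include <cstddef>

// True when wchar_t and char16_t have the same size (the case with MSVC).
constexpr bool kWcharMatchesChar16 = sizeof(wchar_t) == sizeof(char16_t);

// Hypothetical helper: returns a char16_t view of the buffer when the
// widths match, and nullptr when a real conversion would be needed.
inline const char16_t* viewAsChar16(const wchar_t* p)
{
  return kWcharMatchesChar16 ? reinterpret_cast<const char16_t*>(p)
                             : nullptr;
}
```

In code that only ever targets MSVC, a `static_assert(sizeof(wchar_t) == sizeof(char16_t), "...")` next to the cast would serve the same purpose with no runtime branch.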
Evgeniy Berezovsky