
I have a Unicode string containing four Japanese characters, and I'm using WideCharToMultiByte to convert it to a multi-byte string, specifying the Shift-JIS code page (932). To get the required buffer size, I first call WideCharToMultiByte with the cbMultiByte parameter set to 0. This returns 9, as expected, but when I call WideCharToMultiByte again to do the actual conversion, it reports 13 bytes written. An example is below; for now I'm hard-coding the buffer size to 100:

BSTR value = SysAllocString(L"日経先物");
char *buffer = new char[100];

int sizeRequired = WideCharToMultiByte(932, 0, value, -1, NULL, 0, NULL, NULL);

// sizeRequired is 9 as expected

int bytesWritten = WideCharToMultiByte(932, 0, value, sizeRequired, buffer, 100, NULL, NULL);

// bytesWritten is 13

buffer[8] contains the string terminator \0, as expected. buffer[9] through buffer[12] each contain the byte 63.

So if I make the buffer exactly sizeRequired bytes long, it's too small and the second call to WideCharToMultiByte fails. Does anyone know why four extra bytes are written, each with a value of 63?

nblackburn

1 Answer


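You are passing the wrong arguments to WideCharToMultiByte in your second call: the required size of the destination (sizeRequired) is being passed as the length of the source (the cchWideChar parameter). The source holds only 5 wide characters including the terminator, so with a length of 9 the function reads 4 characters past the end of the string. Those presumably cannot be represented in code page 932 and are each written as the default character '?' (byte 63), which would account for 8 + 1 + 4 = 13 bytes. You need to change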

int bytesWritten = WideCharToMultiByte(932, 0, value, sizeRequired, buffer, 100,
                                       NULL, NULL);

to

int bytesWritten = WideCharToMultiByte(932, 0, value, -1, buffer, sizeRequired,
                                       NULL, NULL);
IInspectable
  • It would also make sense not to allocate the `char[]` buffer until after the first call has calculated the required `char` count, so you don't allocate more memory than you need: `char *buffer; int sizeRequired = ...; buffer = new char[sizeRequired]; int bytesWritten = ...; ... delete[] buffer;` Also, a `BSTR` needs to be freed with `SysFreeString()`, but since the `BSTR` is being allocated from a string literal, it would make more sense to have `value` just point at the literal directly instead: `LPCWSTR value = L"日経先物";` – Remy Lebeau May 20 '16 at 01:26