I'm reading a file of thousands of non-English strings, many of them East Asian, using fgets, and subsequently calling MultiByteToWideChar to convert each one to UTF-16:
WCHAR wstr[BUFSIZ] = { '\0' };
int result = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, src, -1, wstr, BUFSIZ);
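For context, the surrounding code is essentially the following (simplified: the file name is a placeholder and I've omitted most error handling):

#include <stdio.h>
#include <windows.h>

int main(void)
{
    char src[BUFSIZ];
    FILE *fp = fopen("strings.txt", "r");  /* placeholder name */
    if (fp == NULL)
        return 1;

    while (fgets(src, BUFSIZ, fp) != NULL)
    {
        WCHAR wstr[BUFSIZ] = { '\0' };
        int result = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                         src, -1, wstr, BUFSIZ);
        if (result == 0)
        {
            /* The two strings below land in this branch. */
            fprintf(stderr, "conversion failed, GetLastError() = %lu\n",
                    (unsigned long)GetLastError());
        }
    }

    fclose(fp);
    return 0;
}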
This approach is working fine in nearly every case. The two strings for which it isn't working are:
我爱你 (read in by fgets as "我爱ä½")
コム (read in by fgets as "コãƒ")
In both cases, the call to MultiByteToWideChar returns zero, and the final character of wstr is garbage:
我爱� (final character 0xE4 0xBD)
コ� (final character 0xE3 0x83)
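(For reference, those hex values come from dumping the last few bytes of the buffer; the helper below is purely illustrative.)

#include <stdio.h>
#include <string.h>

/* Illustrative only: print the trailing bytes of a buffer so I can see
   the raw values stored there. */
static void dump_tail(const char *buf)
{
    size_t len = strlen(buf);
    size_t start = (len > 4) ? len - 4 : 0;
    for (size_t i = start; i < len; i++)
        printf("0x%02X ", (unsigned)(unsigned char)buf[i]);
    printf("\n");
}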
Is there some environment setup, or an alternative way of reading my text file, that would eliminate this problem?