cstring m_pszdata doesn't match converted char* in UNICODE

Question

I tested the Unicode conversion with a UNICODE MFC dialog app, where I can input some Chinese in the edit box. After reading in the characters using

    DDX_Text(pDX, IDC_EDIT1, m_strUnicode);
    UpdateData(TRUE);

the m_pszdata of m_strUnicode shows "e0 65 2d 4e 1f 75 09 67". Then I used the following code to convert it to char*:

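    // A first call with no output buffer returns the required size in bytes (terminator included).
    int dwMinSize = WideCharToMultiByte(CP_OEMCP, 0, m_strUnicode, -1, NULL, 0, NULL, NULL);
    char *psText = new char[dwMinSize];
    WideCharToMultiByte(CP_OEMCP, 0, m_strUnicode, -1, psText, dwMinSize, NULL, NULL);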

psText then contains "ce de d6 d0 c9 fa d3 d0", which looks nothing like the m_pszdata of m_strUnicode. Could anyone please explain why?

dda · Accepted Answer · 2012-06-05T14:14:22.413

ce de d6 d0 c9 fa d3 d0 is 无中生有 in GBK. You sure you're manipulating Unicode?

CP_OEMCP instructs the API to use the currently set default OEM codepage.

So my guess here is that you're on a Chinese PC with GBK as default codepage.

无中生有 in UTF-16LE is e0 65 2d 4e 1f 75 09 67, so basically you are converting a UTF-16LE string to GBK. Both byte sequences encode the same four characters, just in different encodings.
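If you want to check the UTF-16LE half of this without a debugger, here is a small portable C++ sketch. It is not Windows-specific, and `utf16le_hex` is a helper name made up for illustration. It writes each 16-bit code unit low byte first and formats the bytes as hex. The GBK half is left out, since that needs a code page table such as the one behind WideCharToMultiByte.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: lay out UTF-16 code units as little-endian bytes and
// return them as space-separated lowercase hex, like a memory view would.
std::string utf16le_hex(const std::u16string& text) {
    std::string out;
    char buf[4];
    for (char16_t unit : text) {
        const unsigned lo = unit & 0xFF;         // low byte is stored first
        const unsigned hi = (unit >> 8) & 0xFF;  // high byte follows
        if (!out.empty()) out += ' ';
        std::snprintf(buf, sizeof buf, "%02x", lo);
        out += buf;
        out += ' ';
        std::snprintf(buf, sizeof buf, "%02x", hi);
        out += buf;
    }
    return out;
}
```

Calling `utf16le_hex(u"\u65E0\u4E2D\u751F\u6709")` (the code points of 无中生有) gives `e0 65 2d 4e 1f 75 09 67`, the same bytes seen in m_pszdata.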

edited Jun 05 '12 at 14:14

answered Jun 05 '12 at 11:37


I selected Unicode in MFC project setting when starting the project and I did input 无中生有 to test, assuming Chinese characters are in Unicode format. Was I wrong about this? – LSSG Jun 05 '12 at 13:38
Got it. Thanks. I tested again using English letters abcdef and it worked as expected. Am I right to say that whatever string I input will be in UTF-16LE if UNICODE is used, and that after the conversion to multibyte it will be in the default codepage? – LSSG Jun 06 '12 at 01:18
