
In Visual Studio 2022, using C++17, I am trying to use a std::map to store Chinese strings:

std::map<std::string, std::string> translation;
translation["Type"] = "类型";

After inserting the values, the string inside translation shows up as question marks (??).

I tried using:

#pragma execution_character_set("utf-8")

Then the string inside translation becomes some garbage value.

How can I store the Chinese strings correctly?

Kesto2
keen
  • How did you check what value is inside string after inserting? – Kesto2 Jul 09 '23 at 12:39
  • I put a breakpoint and looked at the data inside 'translation'. – keen Jul 09 '23 at 14:53
  • Note: consoles and terminals may have different opinions on encodings. If your sources are UTF-8, then the code should work well. If you want to debug, first write to a file (to avoid trickier encoding problems). Once the output in the file is correct, you can start to check how to set up your console to understand UTF-8 (if that is the encoding you use for cout). – Giacomo Catenazzi Jul 10 '23 at 11:54

1 Answer


The pragma is not what you want. It just asks the compiler to encode the string as UTF-8 in the executable. But before the string can be encoded in the executable, it has to exist in the source file. Displaying it on the screen is not a problem, because Windows natively uses 16-bit characters and can easily display Chinese characters.

But the source file encoding is normally a simple 8-bit character encoding. On my French system, Visual Studio uses code page 1252 by default, which is a slight variation of the ISO-8859-1 character set. It is not possible to encode Chinese characters in this encoding, so Visual Studio replaces the offending characters with question marks at the C++ source file level.

So the only reliable way is to ask Visual Studio to use a different encoding when saving the file. Normally, it should have asked you the first time you saved the file, but you can force it to rewrite the file with a different encoding with File / Save As. The Save button can then be toggled to Save with Encoding.... You can now use the native UTF-16 little-endian (1200) or UTF-8 (65001) to successfully write the Chinese characters into your source file. And you will then get the correct UTF-8 encoding in your executable.
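To check which bytes actually ended up in the string, a hex dump is more trustworthy than a debugger or console view. The following is only a sketch: `hex_bytes` and `make_translation` are hypothetical helper names, not part of the question's code, and the value is spelled with escaped UTF-8 bytes so that the example does not depend on how this particular file is saved.

```cpp
#include <cstdio>
#include <map>
#include <string>

// Hypothetical helper: render each byte of s as two lowercase hex digits,
// separated by single spaces.
std::string hex_bytes(const std::string& s) {
    std::string out;
    char buf[4];
    for (unsigned char c : s) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(c));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}

// Hypothetical helper: builds the map from the question. The value is written
// as escaped UTF-8 bytes (U+7C7B U+578B), so it is independent of the source
// file encoding.
std::map<std::string, std::string> make_translation() {
    std::map<std::string, std::string> translation;
    translation["Type"] = "\xE7\xB1\xBB\xE5\x9E\x8B";
    return translation;
}
```

With a source file saved as UTF-8 and compiled so that narrow literals stay UTF-8, `hex_bytes` on the literal from the question should give the same six bytes (`e7 b1 bb e5 9e 8b`), whereas `3f 3f` means the characters were already replaced by question marks when the file was saved. If the file is saved as UTF-8 without a signature (BOM), the MSVC `/utf-8` compiler option may also be needed so that the compiler reads it as UTF-8.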

But in order to display them at run time, you will of course have to use a GUI application (natively Unicode-enabled) or, for a console application, set an acceptable code page in your Windows console.
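For the console case, one possible approach (an assumption on my part, not something the question's code does) is to switch the console output code page to UTF-8 from within the program using the Win32 call `SetConsoleOutputCP`. The helper names `use_utf8_console` and `format_entry` below are hypothetical, and the console font must still contain Chinese glyphs for the text to render.

```cpp
#include <map>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: ask the Windows console to interpret program output
// as UTF-8 (code page 65001). On other platforms this is a no-op.
bool use_utf8_console() {
#ifdef _WIN32
    return SetConsoleOutputCP(CP_UTF8) != 0;
#else
    return true;
#endif
}

// Hypothetical helper: build the line to print for one key of the map.
std::string format_entry(const std::map<std::string, std::string>& m,
                         const std::string& key) {
    auto it = m.find(key);
    if (it == m.end()) return key + " = <missing>";
    return key + " = " + it->second;
}
```

A console program would call `use_utf8_console()` once at startup and then write `format_entry(translation, "Type")` to `std::cout`. This only helps if the strings in the map are already valid UTF-8.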

Serge Ballesta