
I want to support UTF-8 strings in my program, but the default active code page is 936.

Is there any way to support UTF-8 without using the chcp 65001 command?

std::locale doesn't seem to work either. Calling std::locale::global(std::locale("zh_CN.UTF-8")); always throws an error:

terminate called after throwing an instance of 'std::runtime_error'
  what():  locale::facet::_S_create_c_locale name not valid

Example:

#include <iostream>
#include <codecvt>
#include <locale>

int main() {
  std::wstring a = L"中文字符串"; // A Chinese string.

  using cvt = std::codecvt_utf8<wchar_t>;
  std::wstring_convert<cvt, wchar_t> converter;

  std::string x = converter.to_bytes(a);

  std::cout << x << std::endl;

  return 0;
}

Output without "chcp 65001":

涓枃瀛楃涓

Output with "chcp 65001":

Active code page: 65001
中文字符串

I don't want to use the "chcp 65001" command, and I don't want to see the "Active code page" message it prints.

How can I solve the problem?

  • What do you mean by 'support'? – Paul Sanders Aug 05 '22 at 09:15
  • to use the UTF-8 string – Nicholas Yang Aug 05 '22 at 09:18
  • Use it how? Please post some representative code, see [mre]. – Paul Sanders Aug 05 '22 at 10:27
  • Sorry, I'm new. I'll edit it. – Nicholas Yang Aug 05 '22 at 10:42
  • No worries, take your time – Paul Sanders Aug 05 '22 at 10:42
  • I have finished editing the question – Nicholas Yang Aug 05 '22 at 10:47
  • Thanks. chcp is required, AFAIK, but I believe you can run it once, for any console session, and it will 'stick'. Also, you should be able to redirect its output (to NUL:) to get rid of the prompt. – Paul Sanders Aug 05 '22 at 10:51
  • Well, OK, there might be a way, but probably not an easy way. Check out the WIN32 console APIs. – Paul Sanders Aug 05 '22 at 10:53
  • Might be able to use `setlocale(LC_ALL, ".utf8");` (Windows 10 build 17134 (April 2018 Update) or later). [Universal C Runtime](https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page) support for UTF-8. – Eljay Aug 05 '22 at 12:08
  • @Eljay I _think_ the issue is the code page being used by the console, not the runtime. But don't quote me. – Paul Sanders Aug 05 '22 at 13:33
  • @PaulSanders • could very well be. There are a lot of things that need to be set "just so" for UTF-8 to work end-to-end on Windows. To Microsoft's credit, they have been working to make UTF-8 a "first class citizen" option, rather than a barrier to entry. (And Microsoft did jump on the Unicode bandwagon as an early adopter, initially supporting UCS-2.) – Eljay Aug 05 '22 at 13:59
  • "*I don't want to use the "chcp 65001" command*" - use [`SetConsoleOutputCP()`](https://learn.microsoft.com/en-us/windows/console/setconsoleoutputcp) instead. – Remy Lebeau Aug 09 '22 at 00:37

0 Answers