4

I'm trying to draw text using a DLL whose interface is ANSI-only: it wraps the ANSI versions of the Windows APIs. However, I need to store my string data as UTF-8. I don't want to convert strings with the MultiByte/WideChar functions, so I'm looking for a way to change CP_ACP for my application so that I can pass UTF-8 strings straight into the ANSI APIs. Thanks.

PS: I don't want to change the system default code page.

Daniel Kamil Kozar
legendlee

4 Answers

7

Starting with Windows 10 version 1903, you can use the application manifest to set the active code page for a given process, which may differ from the system-wide code page:

<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
  <assemblyIdentity type="win32" name="..." version="6.0.0.0"/>
  <application>
    <windowsSettings>
      <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
    </windowsSettings>
  </application>
</assembly>

Obviously, if you need to support older versions of Windows, the application must still be aware that CP_ACP might not be CP_UTF8 and perform any necessary conversions itself.

More details can be found in Microsoft Docs.
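To support older Windows versions alongside the manifest, the application can check at startup whether the active code page really is UTF-8. A minimal sketch follows; `needs_conversion`, `active_code_page` and `kUtf8CodePage` are hypothetical names introduced for illustration, and the non-Windows branch is only an assumption so that the snippet compiles anywhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Numeric value of CP_UTF8 in the Windows headers.
constexpr unsigned kUtf8CodePage = 65001;

// True when the given ANSI code page is not UTF-8, i.e. UTF-8 strings
// must be converted before they are handed to an ANSI (-A) API.
bool needs_conversion(unsigned ansi_code_page) {
    return ansi_code_page != kUtf8CodePage;
}

// Returns the process's active ANSI code page via GetACP().
// Non-Windows builds report UTF-8 (assumption, for portability only).
unsigned active_code_page() {
#ifdef _WIN32
    return GetACP();
#else
    return kUtf8CodePage;
#endif
}
```

With the manifest applied on Windows 10 1903 or later, `needs_conversion(active_code_page())` should be false; on older systems it should be true, and the application has to convert explicitly.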

Daniel Kamil Kozar
6

CP_ACP represents the system ANSI code page. You cannot change that on a per-process or per-thread basis. It is a system-wide setting. If the DLL really is dependent on CP_ACP internally, then you have no choice but to convert your strings from/to UTF-8 whenever you interact with the DLL.
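A conversion at the DLL boundary could look roughly like the sketch below, which goes through UTF-16 using `MultiByteToWideChar()` and `WideCharToMultiByte()`. The helper names `transcode`, `utf8_to_ansi` and `ansi_to_utf8` are hypothetical, and the non-Windows branch is only a pass-through so that the snippet compiles elsewhere.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins for the Windows macros so the sketch compiles elsewhere.
constexpr unsigned CP_ACP = 0;
constexpr unsigned CP_UTF8 = 65001;
#endif

// Converts `input` from code page `from_cp` to code page `to_cp` via UTF-16.
// Returns an empty string if either conversion step fails.
std::string transcode(const std::string& input, unsigned from_cp, unsigned to_cp) {
    if (input.empty() || from_cp == to_cp) return input;
#ifdef _WIN32
    const int in_len = static_cast<int>(input.size());
    // First call sizes the UTF-16 buffer, second call fills it.
    const int wide_len = MultiByteToWideChar(from_cp, 0, input.data(), in_len, nullptr, 0);
    if (wide_len <= 0) return std::string();
    std::wstring wide(wide_len, L'\0');
    MultiByteToWideChar(from_cp, 0, input.data(), in_len, &wide[0], wide_len);

    // Same two-call pattern for the target code page.
    const int out_len = WideCharToMultiByte(to_cp, 0, wide.data(), wide_len,
                                            nullptr, 0, nullptr, nullptr);
    if (out_len <= 0) return std::string();
    std::string out(out_len, '\0');
    WideCharToMultiByte(to_cp, 0, wide.data(), wide_len,
                        &out[0], out_len, nullptr, nullptr);
    return out;
#else
    return input;  // no ANSI code page outside Windows: pass through
#endif
}

// Call before passing a UTF-8 string to an ANSI-only DLL function.
std::string utf8_to_ansi(const std::string& s) { return transcode(s, CP_UTF8, CP_ACP); }

// Call on strings coming back from the DLL before storing them as UTF-8.
std::string ansi_to_utf8(const std::string& s) { return transcode(s, CP_ACP, CP_UTF8); }
```

Note that the UTF-8 to ANSI direction is lossy: characters that the system ANSI code page cannot represent are replaced with a default character.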

Remy Lebeau
1

UTF8 is not a codepage, and as codepages only make sense to ANSI functions, you can't do what you're asking.

If you want to store strings as UTF-8, you WILL need to convert from the ANSI encoding of your app to Unicode (wide char) using MultiByteToWideChar(), then use WideCharToMultiByte() to convert to UTF-8.

Alternatively, update your app to use Unicode/wide strings internally, and convert as needed.

Deanna
  • 6
    UTF-8 itself is not a codepage, but Microsoft does have a codepage for UTF-8 - `65001` - for use with the `MultiByteToWideChar()` and `WideCharToMultiByte()` functions (there are also codepages for UTF-7 and UTF-16). – Remy Lebeau Jan 27 '12 at 19:51
  • 7
    And it works on SetConsoleCP() as well. Quacks like a code page, it is a code page. – Hans Passant Jan 28 '12 at 15:27
  • @HansPassant, it's misleading to tell people that `SetConsoleCP(65001)` works to set the console input codepage to UTF-8. We need to stress that this is deeply broken in the console host, which assumes single-byte codepages (or double-byte for East-Asian locales). If we try to read non-ASCII characters (i.e. 2-4 UTF-8 bytes) via `ReadFile` or `ReadConsoleA` prior to Windows 10, the read succeeds with 0 bytes read (looks like EOF). Starting in Windows 10, it succeeds with the non-ASCII characters replaced by NUL characters (`"\x00"`), which is borderline useless. – Eryk Sun Dec 17 '18 at 22:32
  • @HansPassant, `SetConsoleCP` is unrelated to whether `CP_ACP` can (or should) be changed for a process, as opposed to changing it for the system and requiring a reboot. Unlike the Windows API in general, the console API doesn't depend on the fast ANSI/OEM tables that are mapped into every process and referenced in the PEB as `AnsiCodePageData` and `OemCodePageData`. – Eryk Sun Dec 17 '18 at 23:05
0

"How to change the CP_ACP?" - "I don't (want) to change the system default codepage."

Well, you have to choose. CP_ACP is the system default codepage.

MSalters
  • Thanks. Is there any way to change the code page for a single process? – legendlee Jan 27 '12 at 15:49
  • 1
    @legendlee: No, there is not. There is `SetThreadLocale()` and `SetThreadUILanguage()`, but those operate on language IDs, not code pages. UTF-8 is not a language, it is a Unicode encoding. – Remy Lebeau Jan 27 '12 at 19:59