We are using a tool that internally calls the Win32 API function WritePrivateProfileString.

I'm aware of the UNICODE define, which maps the call to either WritePrivateProfileString**A** or WritePrivateProfileString**W**.

I am writing to an INI file that does not exist beforehand.

The call behaves differently on some systems. Why?

For example, the character "§", which is A7 (hex) in ASCII, is sometimes written as the UTF-8 sequence C2 A7 (hex), but only on some systems, and I don't know why. What system condition decides whether ANSI or Unicode is written?
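To make the byte difference concrete, here is a minimal, portable C++ sketch of UTF-8 encoding for a single code point. The function name `encode_utf8` is made up for this illustration and is not part of any Windows API; it also does not reject surrogate code points.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Encode one Unicode code point as UTF-8 bytes.
// Hypothetical helper for illustration; surrogates are not rejected.
std::string encode_utf8(std::uint32_t cp) {
    assert(cp <= 0x10FFFF);  // outside the Unicode range is a caller error
    std::string out;
    if (cp < 0x80) {
        // 1 byte: plain ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Calling `encode_utf8(0xA7)` yields the two bytes C2 A7, whereas a single-byte code page such as Windows-1252 stores "§" as the single byte A7.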

I tried creating the file first, before writing to it, and even tried to fix the format by adding some characters up front, because I thought WritePrivateProfileString uses IsTextUnicode internally, but that did not help.

Does anybody know how to read this API documentation correctly?
https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-writeprivateprofilestringa

If the file was created using Unicode characters, the function writes Unicode characters to the file. Otherwise, the function writes ANSI characters.

I cannot reconcile this documentation with what I observe. I know I must be wrong somewhere ;-) So how do I do this right?

All we want is to write plain ANSI to the INI file, in every case and on every system. (I can't switch to WritePrivateProfileString**A** because the tool we use calls WritePrivateProfileString internally.) It works correctly on about 90% of the PCs out there, but on some we still get Unicode characters in the INI file.

I also know ASCII is not state of the art, but we compute a CRC over the INI values, and "A7" is not "C2 A7", which leads to a CRC mismatch. That is why we need the plain ASCII format.
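A small sketch of why the checksum breaks. The question does not say which CRC variant is used, so this assumes the common reflected CRC-32 (polynomial 0xEDB88320, as used by zip and PNG); `crc32_bytes` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32 over a byte buffer (assumed variant: reflected polynomial
// 0xEDB88320, initial value and final XOR 0xFFFFFFFF).
std::uint32_t crc32_bytes(const unsigned char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; XOR in the polynomial when the dropped bit was 1.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

Feeding the one-byte buffer `{0xA7}` and the two-byte buffer `{0xC2, 0xA7}` into `crc32_bytes` gives two different checksums, so any CRC stored for the ANSI form no longer matches once the value is written as UTF-8.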

Thank you for any help.

Martin Prikryl
Tony
  • What makes you sure that WritePrivateProfileString is at fault here? Maybe the tool is doing the ASCII to UTF-8 conversion. – john Sep 18 '20 at 16:29
  • The code unit `0xA7` isn't part of ASCII encoding. Whatever your encoding is, it's not ASCII. That said, the `*PrivateProfile*` APIs are indeed funky when used with anything that isn't ASCII, or whatever encoding an existing INI file is. Remnants of 16-bit Windows. Time to move on. TOML is an INI-like file format with a robust specification. – IInspectable Sep 18 '20 at 16:41
  • Sounds like you need to better control the encoding of files you work with. – Asteroids With Wings Sep 18 '20 at 17:01
  • Just make sure the target file you're writing to doesn't have any BOM (I guess that's what they mean by "was created using Unicode characters") https://en.wikipedia.org/wiki/Byte_order_mark and that should work fine. http://archives.miloush.net/michkap/archive/2006/09/15/754992.html – Simon Mourier Sep 18 '20 at 17:40
  • In any case you should be using an ini file library. These APIs aren't up to the job these days. – David Heffernan Sep 18 '20 at 18:08

2 Answers

I found the root cause:
https://en.wikipedia.org/wiki/Unicode_in_Microsoft_Windows

In April 2018 with insider build 17035 (nominal build 17134) for Windows 10, a "Beta: Use Unicode UTF-8 for worldwide language support" checkbox appeared for setting the locale code page to UTF-8. This allows for calling "narrow" functions, including fopen and SetWindowTextA, with UTF-8 strings. In May 2019 Microsoft added the ability for a program to set the code page to UTF-8 itself, and started recommending that all software do this and use UTF-8 exclusively.

Settings for Unicode Beta function

This setting was enabled on some systems, and it was what changed my expected ANSI output to Unicode (UTF-8) in the INI file.
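One way to detect such systems is to check whether the active ANSI code page is 65001 (UTF-8), which is what `GetACP` reports when the setting is enabled. The following is only a sketch: the helper names are made up for illustration, and the Windows call is guarded by `_WIN32` so the snippet also compiles elsewhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// 65001 is the Windows code page identifier for UTF-8 (CP_UTF8).
constexpr unsigned kUtf8CodePage = 65001;

// True if the given code page identifier is the UTF-8 code page.
constexpr bool is_utf8_code_page(unsigned code_page) {
    return code_page == kUtf8CodePage;
}

// Returns the active ANSI code page on Windows.
// On other platforms this sketch returns 0 (no ANSI code page concept).
unsigned active_ansi_code_page() {
#ifdef _WIN32
    return GetACP();
#else
    return 0;
#endif
}
```

If `is_utf8_code_page(active_ansi_code_page())` is true, the "A" functions interpret and write narrow strings as UTF-8, so "§" ends up as C2 A7 instead of A7.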

Yes, I understand that we have to move towards Unicode in the future. Microsoft is now pushing us in that direction too, which is not bad, but I was not aware of this setting or strategy before.

@IInspectable: yes, you are right, 0xA7 is not ASCII but ANSI (or ASCII-8, Extended ASCII). I had mixed them up, thank you.

Tony
  • "I understand that we have to go in the UNICODE direction in the future." - Win32 has Unicode support since 1993. Real native Unicode (16-bit chars) not patches like UTF-8. The above option is like step back to Unicode-over-ASCII. – i486 Dec 21 '21 at 16:30
1
  1. Always use WritePrivateProfileStringW version.

     [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true,
     EntryPoint = "WritePrivateProfileStringW")]
     [return: MarshalAs(UnmanagedType.Bool)]
     public static extern bool WritePrivateProfileString(string lpAppName, string lpKeyName, string lpString, string lpFileName);
    
  2. When the INI file does not exist, create it with the Unicode byte order mark (first two bytes FF FE, i.e. UTF-16 little-endian). This way all strings are stored as Unicode. To create a blank INI file you can use this:

     File.WriteAllText(ini_file_path, "; Your comment\r\n", Encoding.Unicode);
    

A blank/comment line at the beginning is recommended. Write at least "\r\n".
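For native callers, the same idea can be sketched in portable C++: write the FF FE marker followed by an ASCII-only comment line encoded as UTF-16 LE. The function names below are hypothetical, and the sketch assumes the comment contains only 7-bit ASCII characters.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Build the initial file contents: UTF-16 LE byte order mark (FF FE)
// followed by the comment and "\r\n", each character as two bytes.
// Assumption: ascii_comment holds only 7-bit ASCII characters.
std::string utf16le_ini_header(const std::string& ascii_comment) {
    std::string bytes = "\xFF\xFE";
    for (unsigned char ch : ascii_comment + "\r\n") {
        assert(ch < 0x80);               // non-ASCII needs a real converter
        bytes += static_cast<char>(ch);  // low byte
        bytes += '\0';                   // high byte is zero for ASCII
    }
    return bytes;
}

// Create (or overwrite) the INI file with that header in binary mode,
// so no newline translation changes the bytes.
bool create_unicode_ini(const std::string& path, const std::string& ascii_comment) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;
    }
    const std::string bytes = utf16le_ini_header(ascii_comment);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}
```

A call such as `create_unicode_ini("settings.ini", "; Your comment")` (the file name is just an example) would be made once, before the first WritePrivateProfileStringW call.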

i486
  • You are not wrong, but the question does not mention anywhere that they are working with C#. And you'll need the blank comment if there is a discrepancy with the BOM bytes at the beginning. I think it should not contain any for the INI to be parsed correctly? For this reason, it is better if you create your INI file with `WritePrivateProfileSectionW` instead, for each and every section you need in your INI file. – Adam L. S. Oct 21 '22 at 14:35
  • @AdamL.S. You are right for C# - probably I didn't pay attention. For the leading comment line - it is protection in case INI file has BOM (if edited with Notepad or other editor which will add BOM). Comment is ignored and does not hurt to have it. The problem is that `GetPrivateProfileString` cannot find first section if BOM exists and section starts at first line. – i486 Oct 21 '22 at 14:59