
I am fairly new to Unicode. Many other languages handle it nicely internally, but in C we have several character types: `char` is 1 byte, `wchar_t` is 2 bytes on Windows (an unsigned short; it is 4 bytes on most Unix-like systems), and `char32_t` is 4 bytes. How can I read and write these on the console and in a file? My OS is Windows. I found `setlocale()` and a number of other functions, but none of them helped. I would also like some ideas about converting between different encodings, and some links on the subject.

Nawal Kishore
  • I'm fed up with compilers and platforms messing me about, so I _always_ represent Unicode strings as an array of `uint32_t` values. That's like UTF32, in a way, but without a byte order mark. There are some nice libraries for converting other encodings to and from UTF32 but, in fact, it's not hard to write the code for encodings I use all the time, like UTF8. Representing a string in an encoding like UTF8 (as GTK expects) is more compact, but manipulating the text is more difficult. I have heaps and heaps of code for manipulating strings as `uint32_t` arrays, which I use all the time. – Kevin Boone Sep 25 '20 at 17:42
  • See https://stackoverflow.com/a/64000948/235698 for an example of reading and writing Unicode from a Windows console, and writing to a file. – Mark Tolonen Sep 25 '20 at 21:10
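
The approach described in the first comment (keep text as an array of `uint32_t` code points and convert at the edges) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the commenter's actual code: the function names `utf8_decode_one`, `utf8_encode_one` and `utf8_to_utf32` are hypothetical, and substituting U+FFFD for malformed bytes is one possible policy among several.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence from s (n bytes available).
   Stores the code point in *cp and returns the number of bytes consumed,
   or 0 if the input is empty, truncated, overlong, or otherwise malformed. */
size_t utf8_decode_one(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len, i;
    uint32_t c, min;

    assert(s != NULL && cp != NULL);
    if (n == 0) return 0;

    if (s[0] < 0x80) { *cp = s[0]; return 1; }                 /* ASCII */
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; min = 0x10000; }
    else return 0;                       /* stray continuation or invalid lead */

    if (n < len) return 0;               /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }
    /* Reject overlong forms, values past U+10FFFF, and UTF-16 surrogates. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return 0;
    *cp = c;
    return len;
}

/* Hypothetical helper: encode one code point as UTF-8 into out[0..3].
   Returns the number of bytes written, or 0 if cp is not a valid scalar value. */
size_t utf8_encode_one(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;   /* surrogate */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical helper: decode a whole UTF-8 buffer into code points.
   Writes at most cap values to dst and returns the number of code points
   the input contains. Malformed bytes become U+FFFD (an assumed policy). */
size_t utf8_to_utf32(const unsigned char *src, size_t n, uint32_t *dst, size_t cap)
{
    size_t i = 0, count = 0;
    while (i < n) {
        uint32_t cp;
        size_t used = utf8_decode_one(src + i, n - i, &cp);
        if (used == 0) { cp = 0xFFFD; used = 1; }   /* skip one bad byte */
        if (count < cap) dst[count] = cp;
        count++;
        i += used;
    }
    return count;
}
```

Calling `utf8_to_utf32` with `cap` set to 0 gives the required array length without writing anything, so the caller can size the buffer first and then convert. Once text is in this form, every element is one code point, which is what makes manipulation simpler than working on UTF-8 bytes directly.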

0 Answers