I am trying to print Unicode characters in C++. The characters are Old Turkic, and I have a font that supports them. When I use a letter's code point, I get different characters instead. For example:

#include <iostream>
#include <string>
using namespace std;

int main()
{
    string str = "\u10C00"; // My character's Unicode code point.
    cout << str << endl;
    return 0;
}

This snippet prints a different letter followed by a 0. For example, it gives me something like this (let's assume that I want to print the letter 'Ö'): A0

But when I copy my actual letter from the character-map application in Ubuntu and paste it into my source, I get what I want. What is the problem here? I want to use the character-code form "\u10C00", but it doesn't work properly. I think the escape is too long, so the compiler uses the first 6 characters and leaves the 0 at the end. How can I fix this?

Bora Semiz

2 Answers

std::string does not really support Unicode; use std::wstring instead. But even std::wstring can have problems, since wchar_t is not the same size on every platform and so cannot always hold every code point in a single element.

An alternative would be to use an external string class, such as Glib::ustring if you use gtkmm, or QString in the case of Qt.

Almost every GUI toolkit, and many other libraries, provide their own string class to handle Unicode.

codekiddy
  • Does `std::wstring` support the Private Use Area? If my experiments succeed, I will use that area, since I have a project that involves multiple ancient languages. – Bora Semiz Mar 05 '15 at 00:19
  • I'm not sure about that, see this for example: http://stackoverflow.com/questions/19193429/why-are-certain-unicode-characters-causing-stdwcout-to-fail-in-a-console-app – codekiddy Mar 05 '15 at 00:22

The \u escape must be followed by exactly 4 hexadecimal digits, so "\u10C00" is read as \u10C0 followed by a literal 0. If you need more digits, use \U, which takes exactly 8 hexadecimal digits.

Example:

"\u00D6"      // 'Ö' letter
"\u10C00"     // incorrect escape code!
"\U00010C00"  // your character
Piotr Siupa