For example, I need the code point of the 5th character here, which is ð:
const WCHAR* mystring = L"Þátíð";
I know that its code point is U+00F0, but how do I get this integer using C++?
WCHAR in Windows 2000 and later is UTF-16LE, so it is not necessarily safe to access a specific character in a string by index: a single character can span more than one WCHAR. You should use something like CharNext to walk the string so that surrogate pairs and combining characters/diacritics are handled correctly.
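To illustrate the surrogate-pair part, here is a minimal portable sketch that turns UTF-16 code units into code points. The function name decodeUtf16 is hypothetical, and it uses char16_t instead of WCHAR so that it does not depend on Windows headers; on Windows the same logic should apply to WCHAR, since both are 16-bit UTF-16 code units. It does not deal with combining characters, only with surrogate pairs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decode UTF-16 code units into Unicode code points.
// Assumption of this sketch: unpaired surrogates are passed through unchanged.
std::vector<char32_t> decodeUtf16(const std::u16string& text)
{
    std::vector<char32_t> codePoints;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        char32_t unit = text[i];
        const bool isHighSurrogate = unit >= 0xD800 && unit <= 0xDBFF;
        if (isHighSurrogate && i + 1 < text.size())
        {
            const char32_t low = text[i + 1];
            if (low >= 0xDC00 && low <= 0xDFFF)
            {
                // Combine the high and low surrogate into one code point.
                unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
                ++i; // the low surrogate has been consumed
            }
        }
        codePoints.push_back(unit);
    }
    return codePoints;
}
```

With the precomposed spelling of the word from the question, decodeUtf16(u"\u00DE\u00E1t\u00ED\u00F0")[4] is 0x00F0, and a character outside the Basic Multilingual Plane such as U+1F600 comes back as a single element rather than two.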
In this specific example, Forgottn's answer depends on the compiler emitting precomposed versions of the á and í characters. (This is probably true for most Windows compilers; porting to Mac OS is probably problematic.)
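To see why the precomposed/decomposed distinction matters for plain indexing, here is a small sketch that spells the same word both ways using explicit escapes, so the result does not depend on how the compiler reads the source file. The function names are hypothetical, and char16_t stands in for WCHAR to keep the example portable.

```cpp
#include <string>

// Precomposed spelling: every accented letter is a single code unit.
inline std::u16string precomposedWord()
{
    return u"\u00DE" u"\u00E1" u"t" u"\u00ED" u"\u00F0";
}

// Decomposed spelling: á and í are written as a base letter followed by
// COMBINING ACUTE ACCENT (U+0301), so the string is two code units longer.
inline std::u16string decomposedWord()
{
    return u"\u00DE" u"a\u0301" u"t" u"i\u0301" u"\u00F0";
}
```

In the precomposed string, index 4 holds U+00F0 (ð) and the length is 5. In the decomposed string the length is 7, index 4 holds the plain letter i, and ð has moved to index 6, which is why indexing alone is not reliable.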
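#include <windows.h>
#include <strsafe.h>
#include <vector>

const WCHAR myString[] = L"Þátíð";
size_t myStringLength = 0;
if (SUCCEEDED(StringCchLengthW(myString, STRSAFE_MAX_CCH, &myStringLength)))
{
    LPCWSTR myStringIterator = myString;
    LPCWSTR myStringEnd = myString + myStringLength;
    // CharNext may advance by more than one WCHAR, so compare pointers
    // instead of counting indices.
    while (myStringIterator < myStringEnd)
    {
        // First UTF-16 code unit of the current character (the base character).
        unsigned int mySuperSecretUnicodeCharacter = *myStringIterator;
        LPCWSTR myNextIterator = CharNextW(myStringIterator);
        // Any further code units that CharNext skipped over (e.g. combining marks).
        std::vector<unsigned int> diacriticsOfMySuperSecretUnicodeCharacter(myStringIterator + 1, myNextIterator);
        myStringIterator = myNextIterator;
    }
}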
Edit 1: made it actually work
Edit 2: made it actually look for all codepoints