I have a function that receives a wchar_t* as input. In this function I have to process code points.
Since the program should run on multiple platforms, I know very little about the encoding used in the wchar_t* data.
I tried to get a conversion from wchar_t* to char32_t* via std::codecvt<char32_t, wchar_t, std::mbstate_t>. Sadly, this specialization does not seem to exist.
Then I thought that I might be able to simply use the wchar_t* as a read-only input buffer for icu::UnicodeString, but it seems I first have to convert to UChar* via u_strFromWCS. For that, in turn, I first need to allocate a UChar buffer with the correct number of UChar code units.
Can someone tell me what the most efficient way of accessing code points in a wchar_t* is?
Example:
If I am not mistaken, the following example should make use of two code units per code point.
const wchar_t *test = L"A 剝Ц B";
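
To make clearer what I mean by "accessing code points", here is a minimal sketch of the kind of decoding I have in mind. It rests on an assumption that I cannot guarantee on every platform: a 16-bit wchar_t is taken to hold UTF-16, and a wider wchar_t is taken to hold UTF-32. The helper name to_code_points is hypothetical and not part of any library, and replacing unpaired surrogates with U+FFFD is just one possible policy.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: decodes a null-terminated wchar_t string into code points.
// ASSUMPTION: a 16-bit wchar_t holds UTF-16, a wider wchar_t holds UTF-32.
// The standard does not guarantee either encoding.
std::vector<char32_t> to_code_points(const wchar_t* s) {
    std::vector<char32_t> out;
    if (s == nullptr) {
        return out;
    }
    for (std::size_t i = 0; s[i] != L'\0'; ++i) {
        std::uint32_t u = static_cast<std::uint32_t>(s[i]);
        if (sizeof(wchar_t) == 2) {
            u &= 0xFFFFu;  // guard against sign extension of a 16-bit unit
            if (u >= 0xD800u && u <= 0xDBFFu) {
                // High surrogate: s[i + 1] is readable, at worst it is the terminator.
                std::uint32_t lo = static_cast<std::uint32_t>(s[i + 1]) & 0xFFFFu;
                if (lo >= 0xDC00u && lo <= 0xDFFFu) {
                    u = 0x10000u + ((u - 0xD800u) << 10) + (lo - 0xDC00u);
                    ++i;  // the low surrogate has been consumed
                } else {
                    u = 0xFFFDu;  // unpaired high surrogate
                }
            } else if (u >= 0xDC00u && u <= 0xDFFFu) {
                u = 0xFFFDu;  // stray low surrogate
            }
        }
        out.push_back(static_cast<char32_t>(u));
    }
    return out;
}
```

Calling to_code_points(test) would give one char32_t per code point, whatever the width of wchar_t on the platform, as long as the encoding assumption above holds. This copies the whole string, so I am still interested in a way that avoids the extra buffer.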