Starting from C++11, one can convert UTF-8 to UTF-16 wchar_t (at least on Windows, where wchar_t is 16 bits wide) using std::codecvt_utf8_utf16:
    #include <codecvt>
    #include <locale>
    #include <string>

    std::wstring utf8ToWide( const char* utf8 )
    {
        std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
        return converter.from_bytes( utf8 );
    }
Unfortunately, std::codecvt_utf8_utf16 is deprecated in C++17 (as is std::wstring_convert). But there is std::filesystem::path, which has all the needed conversions built in; e.g. it has these members:
    std::string string() const;
    std::wstring wstring() const;
    std::u8string u8string() const;
    std::u16string u16string() const;
    std::u32string u32string() const;
So the above function can be rewritten as follows:
    #include <filesystem>
    #include <string>

    // Requires C++20 (char8_t).
    std::wstring utf8ToWide( const char* utf8 )
    {
        return std::filesystem::path( reinterpret_cast<const char8_t*>( utf8 ) ).wstring();
    }
And unlike std::codecvt_utf8_utf16, this does not use any deprecated part of C++.
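One detail of the path-based version that may matter is the pointer cast: as far as I understand, char8_t is not among the types that are allowed to alias other objects, so reading char data through a const char8_t* is formally questionable. Below is a minimal sketch of a variant that copies the bytes into a std::u8string instead. The name utf8ToWideViaPath is made up for illustration, and the use of the __cpp_char8_t feature-test macro to fall back to std::filesystem::u8path on a C++17 compiler is my own assumption about how one might keep the code buildable there.

```cpp
#include <cstring>
#include <filesystem>
#include <string>

// Hypothetical variant of utf8ToWide (the name is illustrative only).
// It avoids casting const char* to const char8_t*.
std::wstring utf8ToWideViaPath( const char* utf8 )
{
#if defined(__cpp_char8_t)
    // C++20: copy the bytes into a std::u8string, converting each char
    // to char8_t by value, then let path do the conversion.
    const std::u8string bytes( utf8, utf8 + std::strlen( utf8 ) );
    return std::filesystem::path( bytes ).wstring();
#else
    // C++17: u8path is the UTF-8 entry point (it is deprecated from C++20 on).
    return std::filesystem::u8path( utf8 ).wstring();
#endif
}
```

The extra copy costs one allocation per call. Note also that only on Windows is the result guaranteed to be UTF-16; elsewhere the content of wstring() depends on how the implementation converts its narrow native encoding to wchar_t.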
What drawbacks can be expected from such a converter? For example, is there a maximum length that a path cannot exceed, or are certain Unicode characters prohibited in it?