
For UTF-16 input, we can read the file and convert it to wchar_t at the same time. For example:

std::wifstream* file = new std::wifstream(name, std::ifstream::binary);
std::locale lo(file->getloc(), new std::codecvt_utf16<wchar_t, 0x10ffff, std::little_endian>);
file->imbue(lo);

How could I do the same for UTF-32 input?

  • from [std::codecvt_utf16](http://en.cppreference.com/w/cpp/locale/codecvt_utf16), you can use `std::codecvt_utf16` – Danh Oct 25 '16 at 15:34
  • Possible duplicate of [std::u32string conversion to/from std::string and std::u16string](http://stackoverflow.com/questions/31302506/stdu32string-conversion-to-from-stdstring-and-stdu16string) – Danh Oct 25 '16 at 15:34
  • @danh, perhaps not quite a duplicate - though I'm guessing you can just substitute `std::codecvt_utf16` into the `imbue()`? (I don't really know; I only ever need UTF-8 myself) – Toby Speight Oct 25 '16 at 16:06
  • Is this in Windows? What compiler/version do you use? – Barmak Shemirani Nov 01 '16 at 10:18

1 Answer


You may want to use the classic C++ pattern of allocating the wifstream on the stack instead of on the heap (with new). Instead of:

std::wifstream* file = new std::wifstream(name, std::ifstream::binary);

prefer:

std::wifstream file(name, std::ifstream::binary);

For the codecvt part, I'd try with std::codecvt_utf16<char32_t>.
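A minimal, untested sketch of how that suggestion could be plugged into the original imbue() call, assuming a char32_t stream (std::basic_ifstream<char32_t>) in place of the original wifstream. Note that std::codecvt_utf16 still converts a UTF-16 byte stream, so whether this actually decodes UTF-32 files is doubtful (see the comment below):

#include <fstream>
#include <locale>
#include <codecvt>  // deprecated since C++17, but still available

// Hypothetical sketch: swap wchar_t for char32_t throughout the
// original snippet and install the suggested facet.
std::basic_ifstream<char32_t> file(name, std::ifstream::binary);
std::locale lo(file.getloc(),
               new std::codecvt_utf16<char32_t, 0x10ffff, std::little_endian>);
file.imbue(lo);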

P.S. Note that wchar_t can have different sizes (16 bits, 32 bits) on different platforms. So it may be better for you to use std::u16string for UTF-16 and std::u32string for UTF-32.
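To see the point about wchar_t concretely, a tiny check (an illustration, not part of the answer) prints the width of each character type on the current platform:

#include <iostream>

int main() {
    // wchar_t is 2 bytes on Windows (MSVC) and typically 4 bytes on Linux/macOS,
    // while char16_t and char32_t are always 2 and 4 bytes respectively.
    std::cout << "wchar_t:  " << sizeof(wchar_t)  << " bytes\n"
              << "char16_t: " << sizeof(char16_t) << " bytes\n"
              << "char32_t: " << sizeof(char32_t) << " bytes\n";
}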

  • The codecvt std::codecvt_utf16 doesn't work. I think I need something like std::codecvt_utf32, which doesn't exist. Can I create my own? Thanks. – nabla Oct 27 '16 at 15:07