0

I'm reading a wstring from a .txt file using a while-!eof loop:

std::wifstream fileStream(path);
std::wstring input;
while (fileStream.eof() == false) {
    getline(fileStream, input);
    text += input + L'\n';
}

But when I print it with wcout, some characters are turned into other ones. So far č has turned into e (with a backwards comma on top), ě into i (with a backwards comma on top), and š into an error character. At first I suspected some format issue, but when I write the string to a new .txt file it comes out completely fine.

Also, I'm using _setmode(_fileno(stdout), _O_U8TEXT); to get wcout to work at all.

G_glop
  • Is your platform Windows? If so, did you set the correct code page? If not, does your terminal properly support Unicode? Did you look at the actual contents of the string in a debugger to see if they were correct? – Captain Obvlious Jun 01 '16 at 17:21
  • Yes. I don't know how to detect the code page from the .txt files I'm reading. The standard Windows/VS terminal doesn't support Unicode. And no, the contents weren't correct in the debugger. – G_glop Jun 01 '16 at 17:32

2 Answers

0

Solved by reading the file as binary and then converting it to a wstring using the MultiByteToWideChar function from the Win32 API:

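std::ifstream fileStream(path, std::ios::binary | std::ios::ate);
const auto size = static_cast<std::size_t>(fileStream.tellg());
fileStream.seekg(0, std::ios::beg);

// Read the raw bytes into an owned buffer (no manual new/delete).
std::string memory(size, '\0');
fileStream.read(&memory[0], static_cast<std::streamsize>(size));

// With MB_PRECOMPOSED the output needs no more wide characters than input bytes.
text.resize(size);
const int written = MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, memory.data(),
                                        static_cast<int>(size), &text[0],
                                        static_cast<int>(text.size()));
text.resize(written > 0 ? written : 0); // drop unused trailing characters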
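
As a possible explanation of why this helps: the symptoms in the question are consistent with a file saved in Windows-1250 whose bytes were widened one-to-one instead of being decoded through the code page. That is an assumption, since the question does not state the file's encoding. The portable sketch below only illustrates the mapping for the three characters mentioned; `widenRaw` and `decodeCp1250Subset` are hypothetical helpers, not part of any library.

```cpp
#include <string>

// Widen each byte unchanged, the way an undecoded read would.
std::wstring widenRaw(const std::string& bytes) {
    std::wstring out;
    for (unsigned char b : bytes) {
        out += static_cast<wchar_t>(b);
    }
    return out;
}

// Decode a tiny subset of Windows-1250 (assumed encoding of the file).
// Only the three characters from the question are mapped; every other
// byte is passed through unchanged, which is correct for ASCII only.
std::wstring decodeCp1250Subset(const std::string& bytes) {
    std::wstring out;
    for (unsigned char b : bytes) {
        switch (b) {
            case 0xE8: out += L'\u010D'; break; // č
            case 0xEC: out += L'\u011B'; break; // ě
            case 0x9A: out += L'\u0161'; break; // š
            default:   out += static_cast<wchar_t>(b); break;
        }
    }
    return out;
}
```

With the input bytes 0xE8, 0xEC, 0x9A, `widenRaw` yields U+00E8 (è), U+00EC (ì) and the control character U+009A, which matches the garbled output described in the question, while `decodeCp1250Subset` yields č, ě and š.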
G_glop
-1

I don't know if this is the cause of your problem but...

If you write

while (fileStream.eof() == false) {
    getline(fileStream, input);
    text += input + L'\n';
}

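the loop runs one time too many, because fileStream.eof() stays false until a read actually tries to go past the end of the file. If the file ends with a newline, the final getline fails, leaves input empty, and an extra L'\n' is appended to text. And if the stream fails for any other reason, eof() never becomes true and the loop does not terminate.

I suggest something like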

while (getline(fileStream, input))
    text += input + L'\n';
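
To check what the extra iteration does in practice, here is a small self-contained sketch that runs both loop styles over an in-memory stream. The function names `readWithEofLoop` and `readWithGetlineLoop` are made up for this illustration, and a `std::wistringstream` stands in for the file.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Loop style from the question: tests eof() before each read.
std::wstring readWithEofLoop(std::wistream& in) {
    std::wstring text, input;
    while (in.eof() == false) {
        std::getline(in, input);   // result of the read is not checked
        text += input + L'\n';
    }
    return text;
}

// Suggested style: the read itself is the loop condition.
std::wstring readWithGetlineLoop(std::wistream& in) {
    std::wstring text, input;
    while (std::getline(in, input)) {
        text += input + L'\n';
    }
    return text;
}
```

For the input L"one\ntwo\n", the first function returns the text with one additional trailing newline, while the second returns it unchanged.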

p.s.: sorry for my bad English

max66