In a C program I'm using wprintf to print Unicode (UTF-16) text in a Windows console. This works fine, but when the output of the program is redirected to a log file, the log file has a corrupted UTF-16 encoding. When redirection is done in a Windows Command Prompt, all line breaks are encoded as a narrow ASCII line break (0d0a). When redirection is done in PowerShell, null characters are inserted.
Is it possible to redirect the output to a proper UTF-16 log file?
Example program:
#include <stdio.h>
#include <windows.h>
#include <fcntl.h>
#include <io.h>
int main () {
int prevmode;
prevmode = _setmode(_fileno(stdout), _O_U16TEXT);
fwprintf(stdout,L"one\n");
fwprintf(stdout,L"two\n");
fwprintf(stdout,L"three\n");
_setmode(_fileno(stdout), prevmode);
return 0;
}
Redirecting the output in Command Prompt. See the 0d0a which should be 0d00 0a00:
c:\test>.\testu16.exe > o.txt
c:\test>xxd o.txt
0000000: 6f00 6e00 6500 0d0a 0074 0077 006f 000d o.n.e....t.w.o..
0000010: 0a00 7400 6800 7200 6500 6500 0d0a 00 ..t.h.r.e.e....
Redirecting the output in PowerShell. See all the 0000 inserted.
PS C:\test> .\testu16.exe > p.txt
PS C:\test> xxd p.txt
0000000: fffe 6f00 0000 6e00 0000 6500 0000 0d00 ..o...n...e.....
0000010: 0a00 0000 7400 0000 7700 0000 6f00 0000 ....t...w...o...
0000020: 0d00 0a00 0000 7400 0000 6800 0000 7200 ......t...h...r.
0000030: 0000 6500 0000 6500 0000 0d00 0a00 0000 ..e...e.........
0000040: 0d00 0a00 ....