I am working on a Windows 10 system and using VS Code.

VS Code version: 1.48.2
gcc version: gcc (MinGW.org GCC-8.2.0-5) 8.2.0

I compile a C program with "gcc program.c -o a" (program.c is the name of the C file). I then run the executable in the terminal and redirect its output to a file:

./a > outputFile
1
2
4
6
7

The numbers on the second through sixth lines are the inputs; outputFile is the new file in which I intend to store the output that would otherwise be displayed in the VS Code terminal.

My problem is that the output is stored in UTF-16 LE format, even though my default encoding in VS Code is set to UTF-8.

This causes the git diff command to identify the two files as different:

git diff outputFile expectedOutput

gives the message:

binary files a/outputFile b/expectedOutput differ

whereas:

fc.exe outputFile expectedOutput

gives the message:

Resync Failed.Files are too different.

I have to manually change the encoding of outputFile to UTF-8, after which both commands report the files as identical. Is there a way to automatically store the output in UTF-8 format?

Reproducible example (major edit):

/* test.c: minimal test program */
#include <stdio.h>

int main(void)
{
    printf("1 2 3 4\n");
    return 0;
}

If the VS Code terminal is PowerShell (the default, as Bernard pointed out) and the following sequence of commands is run:

gcc test.c
./a > out1.txt

The output file contains output in UTF-16 LE format, which can be seen by opening the text file in Notepad and choosing Save As.
If one changes the terminal to cmd and runs the following commands, the output file is stored in UTF-8:

gcc test.c
a.exe > out2.txt

The same happens if the Git Bash terminal is chosen and the following commands are executed; the output is stored in UTF-8:

gcc test.c
./a > out3.txt

out1.txt contains output in UTF-16 LE format, while out2.txt and out3.txt contain output in UTF-8 format. I can get rid of this issue by changing the default terminal, as Bernard suggested. Methods to fix this in PowerShell itself would still be useful.
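
To check an output file's encoding without opening it in Notepad, one option is to look at its first two bytes. The following is a minimal sketch, assuming the file was written with a byte-order mark (Windows PowerShell's redirection normally writes one for UTF-16 LE). The function name starts_with_utf16le_bom is made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if the file starts with the bytes FF FE
   (the UTF-16 LE byte-order mark), 0 if it does not, and -1 if the file
   cannot be opened. A UTF-32 LE mark also begins with FF FE; this sketch
   does not try to tell the two apart. */
int starts_with_utf16le_bom(const char *path)
{
    FILE *fp = fopen(path, "rb");   /* binary mode: read the raw bytes */
    if (fp == NULL)
        return -1;

    unsigned char head[2];
    size_t n = fread(head, 1, sizeof head, fp);
    fclose(fp);

    return n == 2 && head[0] == 0xFF && head[1] == 0xFE;
}
```

Under that assumption, calling it on out1.txt should return 1, and calling it on out2.txt or out3.txt should return 0.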

  • It will be necessary to post the relevant code before any useful conclusions can be drawn. Here are some guidelines: [mcve] – ryyker Sep 03 '20 at 12:28
  • VSC settings have no influence on the data written by programs running in a terminal embedded in VSC – rioV8 Sep 03 '20 at 14:10
  • So @rioV8, does it depend on the compiler, or what does the output encoding depend on? I have to submit my C code, which my instructor will check using the 'diff' command on Linux. Can the encoding create any problems? How do I make my output UTF-8? – this_is_ash_shar Sep 03 '20 at 14:20
  • You talk about C code and you talk about the output file; what is the problem? The encoding of the C output is determined by how you `fprintf` your strings. – rioV8 Sep 03 '20 at 14:32
  • @rioV8 The problem is that the output file of the C code is generated in UTF-16 LE format, which causes the git diff command to report the two files (the expected output for the test case and my output) as different, whereas in Notepad or any other text editor they look exactly the same. If I just change the encoding of the output file from UTF-16 LE to UTF-8, git diff reports the two files as the same. I am using the printf function to print output to the terminal, which I am storing in the file named outputFile – this_is_ash_shar Sep 03 '20 at 14:35
  • Then convert your wide-char string to a UTF-8 byte string before writing – rioV8 Sep 03 '20 at 14:59
  • It is impossible to produce UTF-16 output without a very deliberate intent. If your program output is UTF-16, then you wrote your program this way, using `wchar_t` and `w` functions throughout. If you don't want that, don't use these things. If you didn't use `wchar_t` and `w` functions, then your program output is not in fact UTF-16, you are drawing this conclusion on a faulty premise. VSC is an editor. It has about as much influence on the output of your program as the make of your keyboard. – n. m. could be an AI Sep 03 '20 at 22:49
  • @n.'pronouns'm. Sorry, but I am using neither wchar_t nor the w functions. Moreover, when the new file outputFile is opened in my VS Code project folder, VS Code displays its encoding in the status bar as UTF-16 LE. If I open the file with Notepad, the encoding is shown as UTF-16 LE there too. I resolve this by clicking the UTF-16 LE entry in the status bar and saving the file as UTF-8, after which there is no problem. I just need an automatic way to do this. P.S. My default encoding is already set to UTF-8 in user settings – this_is_ash_shar Sep 04 '20 at 13:10
  • Your VS settings are irrelevant. Only your program source code is relevant. Post a [mcve]. – n. m. could be an AI Sep 04 '20 at 13:31
  • Despite the flak that this question is getting, I am having a similar issue with `g++ (MinGW-W64 x86_64-posix-seh, built by Brecht Sanders) 11.2.0`. I ran `g++ --version` and `ld --version` both in the VS Code terminal (PowerShell) and in the standard Windows console and got exactly the same output, but the Windows console produced UTF-8 output while the VS Code terminal produced UTF-16 output. It does seem to be an issue with PowerShell, because doing the compilation in a standalone PowerShell terminal leads to UTF-16 output too. – Bernard Sep 28 '21 at 14:01
  • I had exactly the same issue. If PowerShell Core is installed, change the default profile (by clicking on the "Select Default Profile" option) to use it instead. This changed the default output file encoding to UTF-8 for me. – SBI Feb 06 '22 at 14:30

1 Answer

It appears to have something to do with Windows PowerShell, and VSCode on Windows uses a PowerShell terminal by default.

It isn't clear what the underlying reason for this issue is, but there is a simple workaround – just change the VSCode terminal to the standard Windows Command Prompt (cmd) instead of PowerShell:

Change terminal
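
If switching terminals is not an option, another approach is to have the program write the file itself, so that the shell's `>` redirection is never involved. As far as I know, Windows PowerShell 5.1 treats `>` like Out-File, whose default encoding is UTF-16 LE, which would explain the behaviour described in the question; treat that as an assumption rather than a confirmed diagnosis. The sketch below is not from the original question: the function name write_output is hypothetical, and the file name is whatever the caller passes in.

```c
#include <stdio.h>

/* Hypothetical helper: writes the test program's output straight to `path`,
   so the bytes produced by fprintf reach the file without passing through
   the shell. Returns 0 on success and -1 on failure. */
int write_output(const char *path)
{
    FILE *out = fopen(path, "w");
    if (out == NULL)
        return -1;

    fprintf(out, "1 2 3 4\n");   /* same text that test.c prints */

    return fclose(out) == 0 ? 0 : -1;
}
```

A program would call write_output("outputFile") from main and then be run as plain ./a, with no redirection. For an existing program with many printf calls, a similar effect can be had by calling the standard freopen("outputFile", "w", stdout) once at the start of main.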

Bernard