
In Windows NT the Windows console can handle Unicode, but, by default, when output is redirected from a command to a file it is converted to the current ANSI codepage, meaning that any characters outside the codepage will be converted to "?" characters.
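To make the symptom concrete, here is a minimal sketch that imitates the lossy conversion using Python's codecs. The sample string and the choice of `cp1252` as a stand-in for "the current ANSI codepage" are assumptions for illustration only; the real codepage depends on the system.

```python
# Characters that are missing from the target code page are replaced
# with "?" when the text is encoded with errors="replace".
text = "dir: Ελληνικά.txt"  # hypothetical sample containing Greek letters

# cp1252 is only a stand-in for whatever ANSI code page is active.
ansi_bytes = text.encode("cp1252", errors="replace")
print(ansi_bytes)  # b'dir: ????????.txt'

# UTF-16LE can represent every character, so the round trip is lossless.
utf16_bytes = text.encode("utf-16-le")
assert utf16_bytes.decode("utf-16-le") == text
```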

This can be corrected by starting the console with the /u switch.

With my program running inside of a console, how can I detect whether the /u switch was used when the console was started up?

blueberryfields
hippietrail
  • I am not sure that you really need to depend on the `/u` switch. If you write a program yourself that produces Unicode output, you may have problems displaying the data, but you can write the program so that it always writes correct output to the file when output is redirected. If you need, I could post a small code example of how I do that. The general idea is to test `(GetFileType (g_hStdOutput) & ~FILE_TYPE_REMOTE) != FILE_TYPE_DISK` to find out whether the output is redirected to a file. Then you just write Unicode characters to the file with the handle `GetStdHandle(STD_OUTPUT_HANDLE)`. – Oleg Feb 18 '11 at 13:38
  • I was hoping to mimic the behaviour of the standard Windows commandline tools such as "dir". They always output Unicode to the console but whether they output ANSI or Unicode when redirected to a file depends on the switch. I would like to know at least in theory what those tools do to respect the user's use of the switch. – hippietrail Feb 18 '11 at 14:39
  • @hippietrail: It depends on the implementation of the corresponding tools, but if you don't use an existing utility and write the program yourself, you can **always** write Unicode text to the file (when output is redirected). Then you will have no problem. By the way, if you comment on your own question, please use @Oleg so that I get notified that there is information for me. See http://meta.stackoverflow.com/questions/43019/how-do-comment-replies-work. Currently I only come across your comments by chance. – Oleg Feb 18 '11 at 14:49
  • @Oleg: Assume that I want to make tools for other people to use. They would be surprised to find my tools outputting different formats to the standard tools. I may amend the question. – hippietrail Feb 18 '11 at 15:40
  • @hippietrail: I don't think other people will see any difference at all. Let the tool write redirected output in Unicode format (UTF-16 starting with the byte order mark, bytes 0xFF 0xFE). Most text editors, like Notepad.exe, open such a file automatically and the user just sees the text. So most users would not even know whether the file was in ANSI, UTF-16 or UTF-8 format. I have used this technique of Unicode redirection for many years in most of the console utilities I have written, and so far users have never had any problems understanding the output. – Oleg Feb 18 '11 at 18:02
  • @hippietrail: "Tools" like the dir command are interpreted by cmd.exe itself. They aren't external programs like your application. – Peter Ruderman Feb 18 '11 at 20:39
  • @Oleg: Thanks, it does seem that there is no way to produce output identical to that of the built-in tools. I am in fact making cross-platform command-line tools and was looking for the cleanest possible way to write to `stdout` that would "just work" the way users expect on all platforms. But perhaps Windows users' expectations are not as high as on other systems (-: – hippietrail Feb 19 '11 at 04:41
  • If you want to provide the ability to output non-ASCII to a non-ASCII file, then you can provide a specific file-saving function, but the onus of redirection is on the user. Your best bet is to simply output the correct text regardless of the code-page and leave it to the user to make sure that they have the correct code-page set. To capture the correct redirected text, they can [set the correct code-page](http://stackoverflow.com/questions/2706097/). – Synetech Aug 26 '13 at 04:15
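The redirection check discussed in the comments above can be approximated portably. The following is a rough sketch, not Oleg's code: it swaps the `GetFileType` call for `os.fstat` plus the `stat` module, on the assumption that the runtime reports disk files, pipes and character devices distinctly. The function name `output_kind` is hypothetical.

```python
import os
import stat
import sys


def output_kind(stream=None) -> str:
    """Classify a stream as 'file', 'pipe', 'console' or 'other'."""
    stream = sys.stdout if stream is None else stream
    try:
        mode = os.fstat(stream.fileno()).st_mode
    except (AttributeError, OSError, ValueError):
        # In-memory streams and closed streams have no usable descriptor.
        return "other"
    if stat.S_ISREG(mode):
        return "file"     # redirected to a file on disk
    if stat.S_ISFIFO(mode):
        return "pipe"     # redirected into a pipe
    if stat.S_ISCHR(mode):
        return "console"  # character device (the null device also lands here)
    return "other"
```

A tool could call `output_kind()` once at startup and, when the result is `"file"`, write encoded bytes through `sys.stdout.buffer` instead of writing text.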

3 Answers


As far as I know, there's no API you can use to query this.

To detect it, I'd suggest calling CreateFile on CONOUT$, writing a few Unicode characters that do not all exist in any single codepage (e.g. Japanese, Chinese, Greek), then reading the console back with ReadConsoleOutputCharacter and checking whether you get valid Unicode or garbage. Then clean up the output (perhaps copy the original contents first and restore them afterwards).

This is kind of a kludge, and I haven't implemented it to test, but in theory it should work.

Erik

Here's the documentation for the /U switch:

/U Causes the output of internal commands to a pipe or file to be Unicode

It seems pretty clear that this switch only affects the behaviour of commands interpreted by cmd.exe itself. Internally, it probably just sets some flag. There's no way to read that value from an external process.

I think your best bet would be to try to retrieve the command line of the launching cmd.exe instance. This is not a recommended practice, but it should be safe in your case. Here's an article that explains how, but please make sure you understand the caveats:

How do I get the command line of another process?

You can use GetConsoleProcessList() to find the hosting cmd.exe process.
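Once the command line of the hosting cmd.exe has been obtained, inspecting it is the easy part. The sketch below covers only that step, under stated assumptions: the helper name `cmd_unicode_switch` is hypothetical, switches are assumed to be separated by whitespace, anything after `/C` or `/K` is treated as the command to run, and a later `/A` or `/U` is assumed to override an earlier one. Retrieving the command line itself is not shown.

```python
def cmd_unicode_switch(command_line: str):
    """Return True if /U was given, False if /A was given, else None."""
    result = None
    tokens = command_line.split()
    for token in tokens[1:]:        # skip the executable name
        switch = token.lower()
        if switch in ("/c", "/k"):  # the rest is the command, not switches
            break
        if switch == "/u":
            result = True
        elif switch == "/a":
            result = False
    return result


print(cmd_unicode_switch("cmd.exe /u /k"))       # True
print(cmd_unicode_switch("cmd.exe /c echo /u"))  # None
```

An executable path containing spaces would need proper quote-aware splitting, which this sketch leaves out.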

Peter Ruderman
  • Well it looks like this is the right answer. I shall have to take it as a bit more evidence that the console is a second-class citizen even on modern Windows )-: – hippietrail Feb 19 '11 at 04:36

The /u option controls the behaviour of the command interpreter, and has nothing to do with the console. This question, and indeed several of the comments to it, are making the classic mistake, made so many times before, of conflating a console with a command interpreter that just happens to be one of the processes using that console. A command interpreter is not a console. Options to make a command interpreter (or indeed any other program) write its output in certain ways have nothing whatsoever to do with the console to which that output may, or may not, be directed.

Indeed, these options aren't even console-related. The /U and /A options to CMD flip an internal toggle within the command interpreter itself, which is checked by built-in commands such as DIR before they write to pipes and to files. The state of the toggle determines how they choose to write their output. One can even see this option in the source code for the ReactOS CMD. The toggle is bUnicodeOutput and it's checked by the ConWrite() function.

And this of course indicates how one goes about doing this in one's own program to achieve the same effect. One gives one's program /U and /A (or otherwise named) options, flips a toggle accordingly, and modifies the writing behaviour of one's program when it discovers that it is writing to pipe or file. In other words: One does exactly what CMD does. And the user runs one's program passing it the /U and /A options, just like they do when they use CMD.

In yet other words: An example of this mechanism already exists. Just copy it.
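A minimal sketch of that approach follows. It is not taken from CMD or ReactOS: the function names are hypothetical, `cp1252` is only a stand-in for the active ANSI codepage, and whether to prepend a byte order mark is left as a choice (this sketch writes none).

```python
import sys


def parse_switches(argv):
    """Split argv into (unicode_output, remaining_args)."""
    unicode_output = False  # default to ANSI output
    rest = []
    for arg in argv:
        if arg.upper() == "/U":
            unicode_output = True
        elif arg.upper() == "/A":
            unicode_output = False
        else:
            rest.append(arg)
    return unicode_output, rest


def encode_output(text, unicode_output, ansi_codepage="cp1252"):
    """Encode text the way the toggle asks: UTF-16LE or lossy ANSI."""
    if unicode_output:
        return text.encode("utf-16-le")
    return text.encode(ansi_codepage, errors="replace")


def main(argv):
    unicode_output, rest = parse_switches(argv)
    text = " ".join(rest) + "\n"
    if sys.stdout.isatty():
        # Console: let the runtime handle Unicode display.
        sys.stdout.write(text)
    else:
        # Pipe or file: honour the toggle.
        sys.stdout.buffer.write(encode_output(text, unicode_output))
```

Calling `main(sys.argv[1:])` from a script and running it as `tool /U some text > out.txt` would then produce UTF-16LE in the file, while the same command without `/U` would fall back to the lossy ANSI encoding.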

JdeBP
  • Hmm, this makes sense for other OSes such as Unix, but if there is confusion in the case of Windows it seems to be Windows's fault rather than the fault of the users. Is `CMD` a console or a command interpreter? Are there independently replaceable alternatives to each? – hippietrail Mar 01 '11 at 16:31
  • _If there is confusion in the case of Windows it seems to be Windows's fault rather than the fault of the users._ In this case it's the users' fault -- the users who are **still** asking whether CMD is a console or a command interpreter despite it having been stated three times over that it's a command interpreter. – JdeBP Mar 01 '11 at 17:31