I would like to have this code:

System.Console.Out.WriteLine ("œil");

display œil instead of oil as it does in my test program.

On my system, Console.OutputEncoding is set by default to Western European (DOS) (CodePage set to 850 and WindowsCodePage set to 1252). The Windows-1252 character set contains the Œ and œ ligatures (as can be seen in the Wikipedia article on Windows-1252), but I suspect that characters not found in the ISO-8859-1 set get discarded or replaced.

Characters such as â, ç, etc. are displayed properly on the console, but characters in the extended 0x80 ... 0x9F range are not.
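
For reference, here is a minimal sketch of how the values quoted above can be read from the console. The class name `ConsoleEncodingInfo` and the helper `Describe` are only illustrative; `EncodingName`, `CodePage` and `WindowsCodePage` are standard properties of `System.Text.Encoding`.

```csharp
using System;
using System.Text;

static class ConsoleEncodingInfo
{
    // Formats the identifying properties of an encoding on a single line.
    // (Hypothetical helper, not part of the framework.)
    public static string Describe(Encoding encoding)
    {
        return string.Format(
            "{0}: CodePage={1}, WindowsCodePage={2}",
            encoding.EncodingName, encoding.CodePage, encoding.WindowsCodePage);
    }

    static void Main()
    {
        // Prints whatever encoding the console is currently using for output.
        Console.WriteLine(Describe(Console.OutputEncoding));
    }
}
```

The output depends on the machine and the console settings, so it may well differ from the values I get.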

How can I properly display them on the console?

Pierre Arnaud
  • Very unusual to get code page 1252 on the console; it is almost always 437, the IBM PC OEM code page, which doesn't have a glyph for \u0153. You can change the OutputEncoding but then you'll also have to change the console font. – Hans Passant Oct 28 '11 at 13:35
  • Incidentally, "Western European (DOS)" is code page 850, not 1252. It contains all of the characters from ISO 8859-1 (i.e. Windows-1252's 0xA0-0xFF), which may have caused some confusion. The .NET functions themselves use Unicode, but the text gets translated (sometimes poorly) when written to the console with a non-TrueType font selected. – Random832 Oct 28 '11 at 13:40
  • I've clarified the code page issues; the `System.Text.SBCSCodePageEncoding` returns both a code page and a Windows code page, and I confused the two. Sorry. – Pierre Arnaud Oct 29 '11 at 14:41

2 Answers

Like this:

Console.OutputEncoding = System.Text.Encoding.UTF8;
System.Console.Out.WriteLine("œil");

Don't forget to select a font for your console window that supports the characters you need. This is a screenshot of my console window using the Consolas font.

[Screenshot: console window using the Consolas font]
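
As a self-contained sketch of the same idea (the class name `Utf8ConsoleDemo` and the helper `Utf8Hex` are hypothetical, and restoring the previous encoding at the end is an extra precaution, not something the approach requires), the program below switches the console to UTF-8, writes the word, and prints the bytes that `œ` becomes in UTF-8. The string uses the `\u0153` escape so the result does not depend on how the source file itself is saved.

```csharp
using System;
using System.Text;

static class Utf8ConsoleDemo
{
    // Returns the UTF-8 bytes of a string as dash-separated hex.
    // (Hypothetical helper, useful for checking what is sent to the console.)
    public static string Utf8Hex(string text)
    {
        return BitConverter.ToString(Encoding.UTF8.GetBytes(text));
    }

    static void Main()
    {
        Encoding previous = Console.OutputEncoding;
        try
        {
            // Switch console output to UTF-8 so that U+0153 can be encoded at all.
            Console.OutputEncoding = Encoding.UTF8;
            Console.WriteLine("\u0153il");
            Console.WriteLine(Utf8Hex("\u0153")); // C5-93
        }
        finally
        {
            // Put the console back the way it was found.
            Console.OutputEncoding = previous;
        }
    }
}
```

Whether the first line actually shows up as œil still depends on the console font, as noted above.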

David Heffernan
  • Indeed, this does not work. On my Windows 7 x64, executing your code produces a cross followed by characters `ôil`. – Pierre Arnaud Oct 28 '11 at 13:24
  • Works fine for me. What font do you have in your console window? – David Heffernan Oct 28 '11 at 13:26
  • Be warned that Lucida Console has a lot of characters wrong. Use something else, for example Consolas. – ggPeti Oct 28 '11 at 13:28
  • See http://stackoverflow.com/questions/1802600/changing-font-in-a-console-window-in-net on how to change the font of a console in C# using interop. – Pierre-Alain Vigeant Oct 28 '11 at 13:30
  • The secret is indeed in the font... Using Consolas as the font makes the UTF8 output display properly, just like what David posted in his screenshot. Thanks a lot. – Pierre Arnaud Oct 29 '11 at 14:30
  • And by the way, here is a related question: http://stackoverflow.com/questions/7939643/how-do-i-read-special-characters-0x80-0x9f-from-the-console-in-c (how do I read special characters back) – Pierre Arnaud Oct 29 '11 at 14:55
You can set the output encoding of the console like this:

// Requires "using System;" and "using System.Text;"
Console.OutputEncoding = Encoding.UTF8;
Prashanth