1

I have a local-language font installed on my system (Windows 8). Using the Windows Character Map tool, I found the Unicode code points for the characters in that font. I want to print those characters on the command line from a C program.

For example, assume the Greek letter alpha is represented by the Unicode code point u+0074.

Taking "u+0074" as input, I would like my C program to output the alpha character.

Can anyone help me?

Alexey Frunze
Learner

4 Answers

1

Use the Unicode version of the WriteConsole function (WriteConsoleW).

Also, be sure to store the source code as UTF-8 with BOM, which is supported by both g++ and Visual C++.


Example, assuming that you want to present a Greek alpha given its Unicode code in the form "u+03B1" (the code you listed stands for a lowercase "t"):

#include <stdlib.h>         // exit, EXIT_FAILURE
#include <wchar.h>          // wcstol
#include <string>           // std::wstring
using namespace std;

#undef UNICODE
#define UNICODE
#include <windows.h>

bool error( char const s[] )
{
    ::FatalAppExitA( 0, s );
    exit( EXIT_FAILURE );
}

namespace stream_handle {
    HANDLE const output     = ::GetStdHandle( STD_OUTPUT_HANDLE );
}  // namespace stream_handle

void write( wchar_t const* const s, int const n )
{
    DWORD n_chars_written;
    ::WriteConsole(
        stream_handle::output,
        s,
        n,
        &n_chars_written,
        nullptr         // overlapped i/o structure
        )
        || error( "WriteConsole failed" );
}

int main()
{
    wchar_t const input[]    = L"u+03B1";
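    // wchar_t is 16 bits on Windows, so this handles code points up to U+FFFF only.
    wchar_t const ch        = static_cast<wchar_t>( wcstol( input + 2, nullptr, 16 ) );
    wstring const s         = wstring() + ch + L"\r\n";

    write( s.c_str(), static_cast<int>( s.length() ) );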
}
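
A supplementary sketch, not part of the original answer: since wchar_t is 16 bits wide on Windows, a code point above U+FFFF does not fit in a single wchar_t and has to be written as a UTF-16 surrogate pair. The helper name `to_utf16` is hypothetical and only illustrates the arithmetic.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-16 code units.
// Code points above U+FFFF need two units (a surrogate pair).
std::u16string to_utf16( char32_t cp )
{
    // Precondition: a valid scalar value (not a surrogate, not above U+10FFFF).
    assert( cp <= 0x10FFFF && !( cp >= 0xD800 && cp <= 0xDFFF ) );

    std::u16string out;
    if( cp < 0x10000 )
    {
        out += static_cast<char16_t>( cp );                       // single unit
    }
    else
    {
        cp -= 0x10000;                                            // 20 bits remain
        out += static_cast<char16_t>( 0xD800 + ( cp >> 10 ) );    // high surrogate
        out += static_cast<char16_t>( 0xDC00 + ( cp & 0x3FF ) );  // low surrogate
    }
    return out;
}
```

On Windows the resulting units could be copied into a wstring and passed to the write function above; whether the console font can display the character is a separate matter.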
Cheers and hth. - Alf
  • And if he's using `WriteConsole`, g++ compatibility is rather irrelevant. As far as I know, no Unix platform supports `WriteConsole`. But of course, he can output directly to `std::cout` or `std::wcout`, as long as he sets the appropriate code page in the console. (I tend to use code page 65001, mostly.) – James Kanze Feb 19 '13 at 10:23
  • @James: re "ASCII for the source code", that's needlessly restricted to the English alphabet. some years ago you discussed how you and your co-workers wrote code in a non-English national language. the current advice is inconsistent with that. – Cheers and hth. - Alf Feb 19 '13 at 10:26
  • @James: g++ is available on the Windows platform, and is the main "other" compiler. i suggest using [STL's distro](http://nuwen.net/mingw.html). there are also many others. – Cheers and hth. - Alf Feb 19 '13 at 10:26
  • My co-workers used adaptations so that they only used pure ASCII: dropping the accents in French (which is a pain), or using e.g. ae instead of ä in German. Using full Unicode doesn't work for program text, because there are so many characters which aren't visually distinguishable. (Is that a capital Latin A, or a capital Greek alpha?) – James Kanze Feb 19 '13 at 10:35
  • And while g++ is available for Windows, there's really no reason to use it there; Visual Studio is the standard compiler for that platform. – James Kanze Feb 19 '13 at 10:36
  • @james: it's a good idea to use at least two compilers to ensure portability of code. but of course we're then into judgment and authority arguments. on my side i have me, herb sutter and scott meyers. – Cheers and hth. - Alf Feb 19 '13 at 10:37
  • I always use two compilers too. One under Windows, and another under Linux or Solaris. If you're not concerned about portability, then just Visual C++ is fine. And if you are, you have at least two OS's available. – James Kanze Feb 19 '13 at 10:40
1

There are several issues. If you're running in a console window, I'd convert the code to UTF-8, and set the code page for the window to 65001. Alternatively, you can use wchar_t (which is UTF-16 on Windows), output via std::wostream and set the code page to 1200. (According to the documentation I've found, at least. I've no experience with this, because my code has had to be portable, and on the other platforms I've worked on, wchar_t has been either some private 32-bit encoding, or UTF-32.)
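
A minimal sketch of the UTF-8 route, added for illustration and not taken from the answer: parse the "u+XXXX" text, encode the code point as UTF-8 bytes, and write them after switching the console to code page 65001. The helper names `parse_code_point`, `encode_utf8` and `print_utf8_line` are hypothetical, and whether the glyph actually shows up depends on the console font and the Windows version.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>
#ifdef _WIN32
#include <windows.h>   // SetConsoleOutputCP, CP_UTF8
#endif

// Hypothetical helper: parse text such as "u+03B1" into a code point.
// Returns false for malformed text, surrogates, or values above U+10FFFF.
bool parse_code_point(const std::string& text, unsigned long& cp)
{
    if (text.size() < 3 || text.size() > 8) return false;   // "u+" plus 1 to 6 hex digits
    if ((text[0] != 'u' && text[0] != 'U') || text[1] != '+') return false;
    for (std::size_t i = 2; i < text.size(); ++i)
        if (!std::isxdigit(static_cast<unsigned char>(text[i]))) return false;

    unsigned long const value = std::strtoul(text.c_str() + 2, nullptr, 16);
    if (value > 0x10FFFF || (value >= 0xD800 && value <= 0xDFFF)) return false;
    cp = value;
    return true;
}

// Hypothetical helper: encode one code point as 1 to 4 UTF-8 bytes.
std::string encode_utf8(unsigned long cp)
{
    assert(cp <= 0x10FFFF);   // precondition: within the Unicode range
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: write the character and a newline to stdout as UTF-8.
void print_utf8_line(unsigned long cp)
{
#ifdef _WIN32
    SetConsoleOutputCP(CP_UTF8);   // CP_UTF8 is code page 65001
#endif
    std::string const bytes = encode_utf8(cp);
    std::fwrite(bytes.data(), 1, bytes.size(), stdout);
    std::fputc('\n', stdout);
}
```

A caller would pass the input text to parse_code_point and, if it returns true, hand the resulting code point to print_utf8_line. Because the input is plain ASCII, the source file itself needs no non-ASCII characters.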

James Kanze
  • -1 the suggestions don't work. mostly the sketched utf-8 solution is impractical. that's both because visual c++ stores narrow string constants as Windows ANSI, and because tools such as `more` fail with codepage 65001. it can apparently work if visual c++ is tricked into treating utf-8 source code as ansi, but that has other problems (of course). codepage 1200 is documented as only being available to managed application, and does not work in Windows 7 for output via the standard output stream at the API level. – Cheers and hth. - Alf Feb 19 '13 at 10:24
  • @Alf It works for me (at least the UTF-8 suggestion). I've used it a lot. – James Kanze Feb 19 '13 at 10:31
  • @Cheersandhth.-Alf +1. 65001 + UTF-8 is a practical solution, who cares about `more` if the text is just plain wrong or non-existent? The source code will probably have to encode non-ASCII chars using `\xNN` character codes to construct proper UTF-8 code points (or UTF-16, subject to further conversion). Also, they can be generated or loaded from somewhere. – Alexey Frunze Feb 19 '13 at 10:33
  • @James: I suspect that if it appears to work with visual c++, then you're incorrectly feeding visual c++ with utf-8 source code without BOM. – Cheers and hth. - Alf Feb 19 '13 at 10:33
  • @Alf I'm feeding Visual C++ straight ASCII (at least most of the time---in a distant past, I experimented with using accented characters in comments, went nuts with the code page issues, and dropped it). – James Kanze Feb 19 '13 at 10:38
  • @Alf And you seem to have misunderstood the question. He's writing a program which reads ASCII text in the form of `"u+0074"`, and wants to output the corresponding character. The only place Unicode appears is in his output. – James Kanze Feb 19 '13 at 10:41
  • @James: it's unclear what you think i have misunderstood. instead of me doing the work of *guessing* what this answer's vague description means, which could lead to an infinite regress, how about you providing a **concrete example**. – Cheers and hth. - Alf Feb 19 '13 at 11:15
1

First you should set a TrueType font (e.g. Consolas) in the console's Properties. Then this code should suffice in your case:

#include <stdio.h>
#include <tchar.h>

#include <iostream>
#include <string>
#include <Windows.h>
#include <fstream>

//for _setmode()
#include <io.h>
#include <fcntl.h>
using namespace std;

int _tmain(int argc, _TCHAR* argv[])
{
    TCHAR tch[1];
    tch[0] = 0x03B1; 

    // Test1 - WriteConsole
    HANDLE hStdOut = GetStdHandle(STD_OUTPUT_HANDLE);
    if (hStdOut == INVALID_HANDLE_VALUE) return 1;
    DWORD dwBytesWritten;
    WriteConsole(hStdOut, tch, (DWORD)_tcslen(tch), &dwBytesWritten, NULL);
    WriteConsole(hStdOut, L"\n", 1, &dwBytesWritten, NULL);

    _setmode(_fileno(stdout), _O_U16TEXT);

    // Test2 - wprintf
    _tprintf_s(_T("%s\n"),tch);
    // Test3 - wcout
    wcout << tch << endl;

    wprintf(L"\x03B1\n");

    if (wcout.bad())
    {
        _tprintf_s(_T("\nError in wcout\n"));
        return 1;
    }
    return 0;
}

MSDN -

_setmode is typically used to modify the default translation mode of stdin and stdout, but you can use it on any file. If you apply _setmode to the file descriptor for a stream, call _setmode before performing any input or output operations on the stream.

SChepurin
  • as i recall not supported by g++ – Cheers and hth. - Alf Feb 19 '13 at 11:17
  • @ Cheers and hth. - Alf - Yes. But he mentioned Windows 8. I added some alternatives. – SChepurin Feb 19 '13 at 11:24
  • @SChepurin: you have made it semi-portable to **Windows 9x**, but with gibberish result on that platform. it does not compile with g++ 4.7.2, which reports "foo.cpp:26:31: error: '_O_U16TEXT' was not declared in this scope". i.e., alas, the porting effort was in the wrong direction. – Cheers and hth. - Alf Feb 19 '13 at 11:27
  • @ Cheers and hth. - Alf - Most of console related Unicode output is not portable on Windows. – SChepurin Feb 19 '13 at 11:29
  • re "Most of console related Unicode output is not portable on Windows", happily that's wrong. for example, the code in my answer works fine with g++. – Cheers and hth. - Alf Feb 19 '13 at 11:29
  • @ Cheers and hth. - Alf - I added WriteConsole sample also. Didn't check it using g++. This compiler i use on Linux. – SChepurin Feb 19 '13 at 11:33
  • oh well. some comments so that you can correct the source. first, `tch[0] = 0x03B1` wraps the value when you build as ANSI, rendering all the T stuff irrelevant. next, `_tcslen(tch)` has Undefined Behavior here because the second character of `tch` is indeterminate; you forgot both to allocate and to initialize it. The `wcout<<` call is UB for the same reason. Third, instead of `wcout.bad()` you should check for general failure, `wcout.fail()`. with these corrections i think the code would be technically okay as Visual C++ code. but there's really no need to get compiler-specific here... – Cheers and hth. - Alf Feb 19 '13 at 11:54
  • @ Cheers and hth. - Alf - Unbelievable! Probably, you could leave something for OP to work with. I adapted this sample from the other ones and didn't tune it. But, sure, it does the work. – SChepurin Feb 19 '13 at 14:37
0

C has the wchar_t type (declared in <wchar.h>), which represents a wide character. There are also wide-character counterparts of the string functions, e.g. strcat -> wcscat. Of course it depends on the environment you are using. If you use Visual Studio, have a look here.
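
As a small illustration of the wide-character functions (a sketch only; the helper name `append_alpha` and the buffer sizes are made up for this example), wcslen and wcscat work on wchar_t buffers the same way strlen and strcat work on char buffers:

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical helper: append a Greek alpha (U+03B1) to the wide string
// already stored in `buffer`, which can hold `capacity` wide characters.
// Returns false if the result, including the terminator, would not fit.
bool append_alpha(wchar_t* buffer, std::size_t capacity)
{
    assert(buffer != nullptr);
    const wchar_t alpha[] = L"\x03B1";   // one character plus the terminator

    if (std::wcslen(buffer) + std::wcslen(alpha) + 1 > capacity)
        return false;

    std::wcscat(buffer, alpha);          // wide counterpart of strcat
    return true;
}
```

Printing the resulting buffer still requires one of the console techniques from the other answers.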

bash.d