What are the char16_t versions of the string manipulation functions, such as those listed here:

http://www.tutorialspoint.com/ansi_c/c_function_references.htm

I found many reference sites, but none of them mention this.

The printing functions are the most important to me, because they help me verify whether the manipulation functions work.

#include <stdio.h>
#include <uchar.h>



char16_t *u=u"α";
int main(int argc, char *argv[])
{
    printf("%x\n",u[0]); // output 3b1, it is UTF16



    wprintf("%s\n",u); //no output
    _cwprintf("%s\n",u); //incorrect output

    return 0;
}
CL So
  • The incorrect output case may be because the terminal does not use the right encoding. Try forwarding output to file and look at its hex dump. – hyde Nov 29 '13 at 19:24
  • @CL So What is the "incorrect output"? – chux - Reinstate Monica Nov 29 '13 at 20:01
  • @hyde, @chux, I don't know how to print Unicode text to a file. I tried this code: `FILE *f=fopen(L"e:\\test.txt",L"w");//and _wfopen int i=fwprintf(f,L"abc\n"); printf("%d",i); fclose(f);`. The value of i is 3, and the hex dump of test.txt is "61 62 0D 0A"; correct UTF16-LE should be "61 00 62 00 0D 00 0A 00". As for the incorrect output, I tried many combinations; please see this [link](home.netvigator.com/~fhappy/unicode_printf_result.7z). In the first screenshot you can see that "α" is shown in my Windows 7 console, which proves that my console is able to display the Unicode character – CL So Nov 30 '13 at 19:10

2 Answers

2

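To print, read, or write a char16_t string, you first need to convert it. c16rtomb (from <uchar.h>) converts char16_t units to the locale's multibyte encoding, and mbsrtowcs can then convert that multibyte string to wide characters (wchar_t, 32 bits on most UNIX systems).

For all intents and purposes, char16_t is a variable-length representation (a character may take one or two 16-bit units), so you need the restartable conversion functions (mbrtoc16 and c16rtomb) to work with this integral type.

A few answers used the L"..." prefix, which is incorrect for char16_t because it produces wchar_t literals. 16-bit strings require the u"..." prefix.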
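Standard C has no char16_t counterparts of strlen, strcmp, and friends; <uchar.h> only declares the four conversion functions (mbrtoc16, c16rtomb, mbrtoc32, c32rtomb). A minimal sketch of what hand-written replacements could look like follows. The names c16len and c16cmp are hypothetical, not library functions.

```c
#include <stddef.h>
#include <uchar.h>

/* Hypothetical helper: counts 16-bit code units before the terminating 0,
   the way strlen counts bytes. A surrogate pair counts as two units. */
size_t c16len(const char16_t *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Hypothetical helper: compares two strings code unit by code unit,
   the way strcmp compares bytes. Returns <0, 0 or >0. */
int c16cmp(const char16_t *a, const char16_t *b)
{
    while (*a != 0 && *a == *b) {
        a++;
        b++;
    }
    return (*a > *b) - (*a < *b);
}
```

Note that c16cmp orders strings by code unit value, which is not the same as code point order once surrogate pairs are involved.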

The following headers give you everything you need to work with 8-, 16-, and 32-bit string representations.

#include <string.h>
#include <wchar.h>
#include <uchar.h>

You can Google the procedures found in <wchar.h> if you don't have manual pages (UNIX).

Gnome.org's GLib has some great code for you to drop-in if overhead isn't an issue.

char16_t and char32_t were added in ISO C11 (ISO/IEC 9899:2011) and are declared in <uchar.h>.
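As a rough sketch of the conversion step, assuming a C11 library that provides c16rtomb in <uchar.h>, the helper below turns a char16_t string into a multibyte string that printf can handle. The name c16s_to_mbs is made up for this illustration.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/* Hypothetical helper: converts a 0-terminated char16_t string to the
   multibyte encoding of the current locale. Returns the number of bytes
   written (excluding the terminator), or (size_t)-1 if a character cannot
   be converted or the destination buffer is too small. */
size_t c16s_to_mbs(char *dst, size_t dstsize, const char16_t *src)
{
    mbstate_t state = {0};   /* initial conversion state */
    char buf[MB_LEN_MAX];    /* large enough for one converted character */
    size_t used = 0;

    if (dstsize == 0)
        return (size_t)-1;

    for (; *src != 0; src++) {
        /* Returns 0 for the first half of a surrogate pair, which is
           remembered in state until the second half arrives. */
        size_t n = c16rtomb(buf, *src, &state);
        if (n == (size_t)-1)
            return (size_t)-1;
        if (n >= dstsize - used)   /* keep room for the terminator */
            return (size_t)-1;
        memcpy(dst + used, buf, n);
        used += n;
    }
    dst[used] = '\0';
    return used;
}
```

The result depends on the locale: calling setlocale(LC_ALL, "") (from <locale.h>) first is usually needed, because conversion of non-ASCII characters such as "α" is likely to fail in the default "C" locale. After a successful call, printf("%s\n", dst) prints the converted text.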

1

wprintf and its wide-character colleagues need the format string to be wide too: wprintf( L"%s\n", u);

For wchar_t, L is used as the prefix for string literals.

Edit:

Here's a code snippet (tested on Windows):

#include <stdio.h>
#include <io.h>
#include <fcntl.h>
#include <wchar.h>

int main(void)
{
    const wchar_t *a = L"α";
    fflush(stdout); // must be done before _setmode
    _setmode(_fileno(stdout), _O_U16TEXT); // switch stdout to UTF-16 text mode
    wprintf(L"alpha is:\n\t%ls\n", a);     // %ls is the standard specifier for a wchar_t string
    return 0;
}

Without this, the console does not work in Unicode and prints a "?" for non-ASCII characters. Note that _setmode and _O_U16TEXT are Microsoft-specific; on Linux the usual approach is to call setlocale(LC_ALL, "") (from <locale.h>) before wprintf.

Note: for Windows GUI output there is already proper support, so you can use wsprintf to format Unicode strings.
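The snippet above works on wchar_t, not char16_t. One possible bridge, sketched under the assumption that wchar_t holds Unicode values (UTF-16 units where it is 16 bits wide, whole code points where it is wider), is to copy the char16_t string into a wchar_t buffer first. The name c16s_to_wcs is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical helper: copies a 0-terminated UTF-16 string into a wchar_t
   buffer of dstlen elements. Returns the number of wchar_t written
   (excluding the terminator), or (size_t)-1 if the buffer is too small. */
size_t c16s_to_wcs(wchar_t *dst, size_t dstlen, const char16_t *src)
{
    size_t used = 0;

    if (dstlen == 0)
        return (size_t)-1;

    while (*src != 0) {
        uint_least32_t cp = *src++;
#if WCHAR_MAX > 0xFFFF
        /* wchar_t is wider than 16 bits: merge a surrogate pair
           into a single code point. */
        if (cp >= 0xD800 && cp <= 0xDBFF && *src >= 0xDC00 && *src <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint_least32_t)(*src - 0xDC00);
            src++;
        }
#endif
        if (used + 1 >= dstlen)   /* keep room for the terminator */
            return (size_t)-1;
        dst[used++] = (wchar_t)cp;
    }
    dst[used] = L'\0';
    return used;
}
```

The filled buffer can then be passed to wprintf with the %ls specifier, after the console or locale setup described above.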

egur
  • [link](http://home.netvigator.com/~fhappy/unicode_printf_result.7z). Please see my test results; I tried the u prefix and the L prefix with different functions. – CL So Nov 30 '13 at 19:15
  • See updated answer. It now works for me. Alpha was created via the character map application. – egur Nov 30 '13 at 20:43
  • My compiler (Pelles C) gave me "error #2048: Undeclared identifier '_O_U16TEXT'.", so could you tell me what its value is in your fcntl.h? I cannot find it in mine. – CL So Dec 01 '13 at 19:15
  • @CLSo `#define _O_U16TEXT 0x20000` – egur Dec 01 '13 at 21:26
  • This answer is for `wchar_t` but not `char16_t` as in the title. – chux - Reinstate Monica Feb 25 '16 at 21:57