Input ->

Output-> 😂😂

I simply want to maintain the original state of the emoji.

All I am doing is this:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    char ch;
    FILE *fp;

    fp = fopen("test.txt","r");
    while( ( ch = fgetc(fp) ) != EOF )
        printf("%c",ch);

    fclose(fp);
    return 0;
}
karx
  • An answer depends on how that file is being written - the characters you used for example input are showing up as `😂` if I view the page source for your question. What kind of encoding do you expect to handle - you tagged with "unicode," but which? UTF-8? -16? -32? UCS-2? (I hope not). – frasnian Dec 23 '14 at 22:45
  • Yes, the fact that the "Input" in your post doesn't show up correctly should be considered your introduction to the world of Unicode, where you can no longer assume that a byte is a character, and, depending on the platform, you probably need to either handle UTF-8, or use something other than old-style 8-bit POSIX functions to handle UTF-16. And you need to learn about byte order and byte order marks in files. – Dan Korn Dec 23 '14 at 23:04
  • The file was encoded in utf-8, not code-page 437, you'll have to tackle that first. With the non-standard "rt,ccs=UTF-8" as fopen's 2nd argument for example. Then you'll have to write a GUI app to display it correctly, the olden C runtime teletype user interface can only display smiles side-ways :-p To get colors google "emoji directwrite 8.1" – Hans Passant Dec 23 '14 at 23:19
  • The code sample strongly indicates that this is C code, not C++. Which language do you expect an answer in? –  Dec 24 '14 at 05:48
  • How do you view input and output? – n. m. could be an AI Dec 24 '14 at 05:52
  • Also note that fgetc returns an int, EOF is an int, and ch needs to be an int too. – n. m. could be an AI Dec 24 '14 at 05:54
  • [See how a program similar to yours runs as intended](http://ideone.com/qQn1i7). – n. m. could be an AI Dec 24 '14 at 06:03
  • The language is not at all an issue, I used C because I was more comfortable in this way of handling files. The output currently is just in the standard output which I normally pipe to a file. – karx Dec 24 '14 at 09:54
  • @n.m. Funny thing. My code works perfectly well, as yours did on ideone. But when I run the same thing on my desktop (I use Code::Blocks) I am getting ???? instead of emoji. Thanks though, got my work done for now. But I still don't understand why. – karx Dec 24 '14 at 10:06
  • That's why I'm asking how you view input and output. If you view the input in a GUI editor and the output in the console, you are comparing apples and oranges. – n. m. could be an AI Dec 24 '14 at 10:18

1 Answer

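In Unicode encodings such as UTF-8, an emoji takes more than one byte (U+1F602, for example, takes four bytes in UTF-8). Printing byte by byte therefore does not help when the console treats each byte as a separate character. If you redirect the output to a file, you should get almost the same content as the input file.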
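
To illustrate the multi-byte point, here is a minimal sketch that assumes the file is UTF-8 encoded (as the comments suggest). The helper names `utf8_seq_len` and `utf8_count` are hypothetical and introduced only for this example; the sketch does not validate the input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes in the UTF-8 sequence that starts
 * with `lead`. Returns 0 for a continuation byte or an unrecognised lead.
 * This is a sketch: it does not reject overlong or out-of-range forms. */
size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)           return 1; /* 0xxxxxxx: ASCII            */
    if ((lead & 0xE0) == 0xC0) return 2; /* 110xxxxx                   */
    if ((lead & 0xF0) == 0xE0) return 3; /* 1110xxxx                   */
    if ((lead & 0xF8) == 0xF0) return 4; /* 11110xxx: e.g. U+1F602     */
    return 0;                            /* 10xxxxxx or invalid        */
}

/* Hypothetical helper: count code points in a NUL-terminated UTF-8 string
 * by counting every byte that is not a continuation byte (10xxxxxx). */
size_t utf8_count(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

With these helpers, the four bytes `F0 9F 98 82` (U+1F602 in UTF-8) count as one code point, which is why a console that shows one glyph per byte prints four wrong characters for each emoji.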

You can try printing the text after setting the locale (on Linux), or try wprintf on Windows (remember to convert to a wide string first).
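
A minimal sketch of the locale-based route, assuming a Linux system whose environment selects a UTF-8 locale and a terminal that can render emoji. The helper names `use_environment_locale` and `copy_wide` are hypothetical, and this is one possible approach, not the only one.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: adopt the locale from the environment (LANG, LC_*).
 * Returns 1 on success, 0 if the locale could not be set. */
int use_environment_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}

/* Hypothetical helper: copy the text file at `path` to `out` one wide
 * character at a time. Returns the number of wide characters copied, or
 * -1 if the file cannot be opened. Copying stops early if fgetwc hits a
 * byte sequence that is invalid in the current locale. */
long copy_wide(const char *path, FILE *out)
{
    assert(path != NULL && out != NULL);

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    long count = 0;
    wint_t wc; /* wint_t, not wchar_t, so WEOF stays distinguishable */
    while ((wc = fgetwc(fp)) != WEOF) {
        fputwc((wchar_t)wc, out);
        ++count;
    }

    fclose(fp);
    return count;
}
```

A program would call `use_environment_locale()` once at the start of `main` and then `copy_wide("test.txt", stdout)`. Once a stream has been used for wide output, avoid mixing in `printf` on the same stream, since a stream's orientation is fixed by its first use.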

doptimusprime