Discrepancy with fgetc while reading a text file

Question

I'm just starting with C and I'm trying to understand some of its behavior.

I have a text file, created in Notepad or directly from the shell with echo, on Windows.

When I run this, the output shows extra characters. What am I doing wrong? How can I safely read a text file character by character?

I'm using Code::Blocks with MinGW.

file.txt:

TEST

C program

void main()
{
   int i;
   FILE *fp;

   fp = fopen("file.txt","r");

   while ((i = fgetc(fp)) != EOF)
   {
      printf("%c",i);
   }
}

Output

■T E S T

The while should be `while ((c = fgetc(fp)) != EOF)` – Grijesh Chauhan Oct 23 '13 at 12:10
It's fp; I forgot to translate it. Now it's fine, thanks! – Guilherme Viebig Oct 23 '13 at 12:40

unwind · Accepted Answer · 2013-10-23T12:58:42.507

3

Your code has issues, but the result is fine.

Your file is likely UTF-8 with a (confusingly enough) byte order mark at the beginning. Your program is (correctly) reading and printing the bytes of the BOM, which then appear in the output as strange characters before the proper text.

Of course, UTF-8 should never need a byte order mark (it's 8-bit bytes!), but that doesn't prevent some less clued-in programs from including one. Windows Notepad is the first program on the list of such programs.
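One way to check what a file really starts with is to look at its first bytes. The sketch below is not from the original answer: `detect_bom`, `detect_file_bom` and the `enum bom` names are made up for illustration, and only the UTF-8 and UTF-16 marks are covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical names, introduced only for this sketch. */
enum bom { BOM_NONE, BOM_UTF8, BOM_UTF16LE, BOM_UTF16BE };

/* Classify the byte order mark, if any, at the start of buf.
   Note: a UTF-32LE mark (FF FE 00 00) would be reported as UTF-16LE here. */
enum bom detect_bom(const unsigned char *buf, size_t n)
{
    assert(buf != NULL);
    if (n >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;
    if (n >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16LE;
    if (n >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16BE;
    return BOM_NONE;
}

/* Read up to three bytes from the start of a file and classify them.
   The file is opened in binary mode so the bytes arrive unmodified. */
enum bom detect_file_bom(const char *path)
{
    unsigned char buf[3];
    size_t n;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return BOM_NONE;   /* unreadable file: treated as "no BOM" in this sketch */
    n = fread(buf, 1, sizeof buf, fp);
    fclose(fp);
    return detect_bom(buf, n);
}
```

Calling `detect_file_bom("file.txt")` on the file from the question would show which of these cases applies.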

UPDATE: I didn't consider the spacing between your letters, which of course indicates 16-bit input (UTF-16 stores each of these letters in two bytes, one of which is zero). That's your problem right there, then: your C code is not reading wide characters.
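A minimal sketch of reading such a file with fgetc, assuming it is UTF-16 little-endian and holds only ASCII text such as TEST. The function names are hypothetical; code units outside ASCII are written as '?', and nothing here handles surrogate pairs or big-endian input.

```c
#include <assert.h>
#include <stdio.h>

/* Read one UTF-16LE code unit: two bytes, low byte first.
   Returns -1 at end of file or if the last unit is truncated. */
long read_utf16le_unit(FILE *fp)
{
    int lo, hi;

    assert(fp != NULL);
    lo = fgetc(fp);
    if (lo == EOF)
        return -1;
    hi = fgetc(fp);
    if (hi == EOF)
        return -1;
    return (long)lo | ((long)hi << 8);
}

/* Copy the ASCII characters of a UTF-16LE stream to out.
   A leading byte order mark (U+FEFF) is skipped; any other
   non-ASCII code unit is written as '?'. Returns the number
   of characters written. */
size_t copy_utf16le_ascii(FILE *in, FILE *out)
{
    long unit;
    size_t written = 0;
    int at_start = 1;

    while ((unit = read_utf16le_unit(in)) != -1) {
        if (at_start && unit == 0xFEFF) {
            at_start = 0;
            continue;
        }
        at_start = 0;
        fputc(unit < 0x80 ? (int)unit : '?', out);
        written++;
    }
    return written;
}
```

The input should be opened in binary mode, for example `fopen("file.txt", "rb")`, and passed together with `stdout` as `out`; in text mode the Windows C runtime may alter or cut short the raw bytes.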

edited Oct 23 '13 at 12:58

answered Oct 23 '13 at 12:15

unwind


Well, echo also produces files that look like that. Notepad++ wrote a file that was OK, but when I open the "bad" one, it saves it in the same bad way. That's because the encoding is UCS-2 Little Endian. When I switch to ANSI it runs OK; with UTF-8, other characters appear at the beginning. – Guilherme Viebig Oct 23 '13 at 12:57
Any thoughts on using fgetc to read files in this encoding with the least possible effort? Is there an implementation or library that takes care of that? Thank you! – Guilherme Viebig Oct 23 '13 at 13:01

score 0 · Answer 2 · answered Oct 23 '13 at 12:23

0

Try this code

void main()
{
   int c,i;
   FILE *fp;

   fp = fopen("file.txt","r");

   while ((i = fgetc(fp)) != EOF)
   {
     printf("%c",i);
   }
}


sandeep upadhyay


This is the same code I posted, corrected :> The output is the same; I think there is an issue with the encoding. – Guilherme Viebig Oct 23 '13 at 12:48
