0

This link tells about why feof() is a bad thing to use as an exit indicator for a loop.

Unsafe ==> having an feof() check in the while, and an fgets() inside the while.

Safe ==> having the fgets()!=NULL check in the while itself.

I'm supposed to see the unsafe code doing an extra while loop iteration, but both do the same(and correct) number of loops. Can someone help me understand what's happening here ?

EDIT : The link actually did say why this is happening, but it took for the correct answer below for me to understand exactly what i was reading. My file did not have a '\n' at the last line, so got same results.

This is the file contents :

abcd
efgh
ijkl

And Here's code :

void testUnsafe(void) {
    FILE *f;
    char buf[20];
    f = fopen("fil.txt", "r");
    while (!feof(f)) {
        fgets(buf, 20, f);
        if (buf[strlen(buf) - 1] == '\n') //cleaner
            buf[strlen(buf) - 1] = '\0';
        printf("%s , %d\n", buf, strlen(buf));
    }
    fclose(f);
}

void testSafe(void) {
    FILE *f;
    char buf[20];
    f = fopen("fil.txt", "r");
    while (fgets(buf, 20, f) != NULL) {
        if (buf[strlen(buf) - 1] == '\n') //cleaner
            buf[strlen(buf) - 1] = '\0';
        printf("%s , %d\n", buf, strlen(buf));
    }
    fclose(f);
}

Output is :

******unsafe test********
abcd , 4
efgh , 4
ijkl , 4
********safe test********
abcd , 4
efgh , 4
ijkl , 4
Somjit
  • 2,503
  • 5
  • 33
  • 60
  • Try putting your file on a netwrok drive, make some kind of pause in the loop, and disconnect the network cable while reading the file. – pmg Dec 13 '14 at 20:31
  • Sorry, as you can see , i'm not that much good at C to understand the problem here, if, that is, this wasn't sarcasm to begin with. – Somjit Dec 13 '14 at 20:48

3 Answers3

3

If your text file ends without a newline after the last line of text, the testUnsafe() function will reach end-of-file when it reads the last line, and produce the three lines of output you have shown.

If your text file does have a newline after the last line of text, the function will read the last line, including the newline, without reaching end-of-file. When it enters the while() loop again, it reads zero characters, sets the end-of-file flag, and outputs the last line which is still in the buffer from the last round.

The while (!feof(f)) construction is not unsafe in itself. It's neglecting to check the return value of fgets() that is unsafe.

Nisse Engström
  • 4,738
  • 23
  • 27
  • 42
2

I tried your two examples and got different results to yours. Function testUnsafe() printed the last line of my file twice. There are two reasons for this.

  1. The feof() function returns a nonzero value if a read operation has attempted to read past the end of the file.

  2. Function testUnsafe() does not check the return value of fgets() and so repeats the previously read string before hitting the feof() condition.

I copied your functions into my test program

#include <stdio.h>
#include <string.h>

void testUnsafe(void) {
    FILE *f;
    char buf[20];
    f = fopen("fil.txt", "r");
    while (!feof(f)) {
        fgets(buf, 20, f);
        if (buf[strlen(buf) - 1] == '\n') //cleaner
            buf[strlen(buf) - 1] = '\0';
        printf("%s , %d\n", buf, strlen(buf));
    }
    fclose(f);
}

void testSafe(void) {
    FILE *f;
    char buf[20];
    f = fopen("fil.txt", "r");
    while (fgets(buf, 20, f) != NULL) {
        if (buf[strlen(buf) - 1] == '\n') //cleaner
            buf[strlen(buf) - 1] = '\0';
        printf("%s , %d\n", buf, strlen(buf));
    }
    fclose(f);
}

int main()
{
    testUnsafe();
    printf ("\n\n");
    testSafe();
    return 0;
}

Test file:

Line 1
Line 2
Line 3

Output of testUnsafe():

Line 1 , 6
Line 2 , 6
Line 3 , 6
Line 3 , 6

Output of testSafe():

Line 1 , 6
Line 2 , 6
Line 3 , 6
Weather Vane
  • 33,872
  • 7
  • 36
  • 56
  • Following @NisseEngström comment, that's why it is unsafe. – Weather Vane Dec 13 '14 at 20:31
  • With no `newline` after the last line, the correct 3 lines were printed. – Weather Vane Dec 13 '14 at 20:43
  • 1
    In select situations (embedded null character), `strlen(buf)` is 0, thus `if (buf[strlen(buf) - 1]` is UB. – chux - Reinstate Monica Dec 14 '14 at 00:27
  • @chux : can you give a link where i can learn (easily) about embedded nulls ? my google results are turning out a bit too hard for me. – Somjit Dec 14 '14 at 15:23
  • @chux means a line such as `"\0some text\n"` where `fgets()` will read up to and including the `newline`, but after reading it the string functions will see an empty string. I noticed the potential flaw, but thought that in this example it can't happen, as even a blank line will have a `newline`, and if the final text line lacks a `newline` it must have at least one other character. But as @chux says, it can. – Weather Vane Dec 14 '14 at 17:36
  • @Somjit (sorry no Google link comes to mind.) Weather Vane `testSafe()` protects against another problem `testUnsafe()` does not. Since `testUnsafe()` does not test the result of `fgets()`, on a rare communication error, `fgets()` returns `NULL` and the contents of `buf` could be _anything_. Using `buf` without checking `fgets()` results is a gold mine for hackers. Best to practice safe user IO - use protection. – chux - Reinstate Monica Dec 14 '14 at 19:18
0

Basically, to read all your lines you must use algo like that. With ou without newline at end of file,you are sure to load all lines.

The exception here is the last line are not sure to have a LF at the end.

Except thing, like checking buffer overflow, to optimize the memory usage, you can also call realloc() to trim the buffer before adding it in a array.

buffer = (char*)malloc(bufferSize);
while(fgets(buffer, bufferSize, file) != NULL) {
    //here store your pointer in array...
    buffer = (char*)malloc(bufferSize);
};
free(buffer);