0

I have a program that reads a file into a buffer structure. The problem is that when I look at the output, there's an extra EOF character at the end. I'll post the related functions. (NOTE: I removed parameter checks and only posted the code in each function related to the issue.)

b_load

int b_load(FILE * const fi, Buffer * const pBD){
    unsigned char character; /*Variable to hold read character from file*/
    Buffer * tempBuffer; /*Temporary Buffer * to prevent destruction of main Buffer*/
    short num_chars = 0; /*Counter of the amount of characters read into the buffer*/

    /*Assigns main Buffer to tempBuffer*/
    tempBuffer = pBD;

    /*Infinite loop that breaks after EOF is read*/
    while(1){
        /*calls fgetc() and returns the char into the character variable*/
        character = (unsigned char)fgetc(fi);

        if(!feof(fi)){
            tempBuffer = b_addc(pBD,character);

            if(tempBuffer == NULL)
                return LOAD_FAIL;
            ++num_chars;
        }else{  
            break;
        }
    }
    return num_chars;
}    

b_print

int b_print(Buffer * const pBD){
    int num_chars = 0;

    if(pBD->addc_offset == 0)
        printf("The buffer is empty\n");
    /*Sets getc_offset to 0*/
    b_set_getc_offset(pBD, 0);

    pBD->eob=0;

    /*b_eob returns the structures eob field*/
    while (!b_eob(pBD)){
        printf("%c",b_getc(pBD));
        ++num_chars;
    }
    printf("\n");

    return num_chars;
}

b_getc

char b_getc(Buffer * const pBD){
    if(pBD->getc_offset  == pBD->addc_offset){
        pBD->eob = 1;
        return R_FAIL_1;
    }   
    pBD->eob = 0;
    return pBD->ca_head[(pBD->getc_offset)++];
}

at the end I end up with:

"a catÿ" (the ÿ is the EOF character)

It prints an EOF character, but one is never added to the buffer. When the driver code adds an EOF character to the end of the buffer, two appear. Any idea what is causing this? I might be using feof() wrong, so that may be it, but it is required in the code.

Josip Dorvak

2 Answers

6

There is no "EOF character". EOF is a value returned by getchar() and related functions to indicate that they have no more input to read. It's a macro that expands to a negative integer constant expression, typically (-1).

(For Windows text files, an end-of-file condition may be triggered by a Control-Z character in a file. If you read such a file in text mode, you won't see that character; it will just act like it reached the end of the file at that point.)

Don't use the feof() function to detect that there's no more input to read. Instead, look at the value returned by whatever input function you're using. Different input functions use different ways to indicate that they weren't able to read anything; read the documentation for whichever one you're using. For example, fgets() returns a null pointer, getchar() returns EOF, and scanf() returns the number of items it was able to read.

getchar(), for example, returns either the character it just read (treated as an unsigned char and converted to int) or the value EOF to indicate that it wasn't able to read anything. The negative value of EOF is chosen specifically to avoid colliding with any valid value of type unsigned char. Which means you need to store the value returned by getchar() in an int object; if you store it in a char or unsigned char instead, you can lose information, and an actual character with the value 0xff can be mistaken for EOF.
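
To make the point about `int` concrete, here is a minimal sketch. The helper names `count_bytes` and `narrowed_eof` are made up for illustration and are not from the question. The first keeps the result of fgetc() in an `int`, so a real 0xFF byte is counted like any other; the second shows what is left of EOF once it is forced into an `unsigned char`.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes in fp with the usual idiom.
 * The result of fgetc() stays in an int, so EOF remains distinct
 * from every byte value, including 0xFF. */
long count_bytes(FILE *fp)
{
    int c;                      /* int, NOT char or unsigned char */
    long n = 0;

    while ((c = fgetc(fp)) != EOF)
        n++;
    return n;
}

/* Hypothetical helper: the value EOF turns into when it is narrowed
 * to unsigned char. With the common EOF == -1 and 8-bit bytes this is
 * 0xFF, which is no longer distinguishable from a real 0xFF byte. */
unsigned char narrowed_eof(void)
{
    return (unsigned char)EOF;
}
```

On a typical system, a stream holding the three bytes 'a', 0xFF, 'b' gives `count_bytes` a result of 3, while `narrowed_eof()` yields 0xFF, which a Latin-1 terminal displays as ÿ.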

The feof() function returns the value of the end-of-file indicator for the file you're reading from. That indicator becomes true after you've tried and failed to read from the file. And if you ran out of input because of an error, rather than because of an end-of-file condition, feof() will never become true.

You can use feof() and/or ferror() to determine why there was no more input to be read, but only after you've detected it by other means.
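
A small sketch of both points, assuming nothing beyond standard stdio. The names `eof_flag_after`, `drain`, and `enum stop_reason` are invented here for illustration. The first function shows that the end-of-file indicator is still clear right after the last byte has been read; the second detects the end of input from the return value of fgetc() and only then asks feof() and ferror() for the reason.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to n bytes from fp, then reports the
 * end-of-file indicator. It is 0 until a read has actually failed. */
int eof_flag_after(FILE *fp, long n)
{
    while (n-- > 0 && fgetc(fp) != EOF)
        ;
    return feof(fp) != 0;
}

/* Why the read loop stopped. */
enum stop_reason { STOP_EOF, STOP_ERROR, STOP_UNKNOWN };

/* Hypothetical helper: drains fp, stores the byte count in *count,
 * and classifies the reason fgetc() returned EOF. */
enum stop_reason drain(FILE *fp, long *count)
{
    int c;

    *count = 0;
    while ((c = fgetc(fp)) != EOF)   /* detection happens here */
        (*count)++;

    if (ferror(fp))                  /* diagnosis happens afterwards */
        return STOP_ERROR;
    if (feof(fp))
        return STOP_EOF;
    return STOP_UNKNOWN;
}
```

For a two-byte file, `eof_flag_after(fp, 2)` returns 0 because both reads succeeded; one more call with n = 1 makes a read fail and the function returns 1.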

Recommended reading: Section 12 of the comp.lang.c FAQ, which covers stdio. (And the rest of it.)

UPDATE :

I haven't seen enough of your code to understand what you're doing with the Buffer objects. Your input loop actually looks (almost) correct, though it's written in a clumsy way.

The usual idiom for reading characters from a file is:

int c;   /* `int`, NOT `char` or `unsigned char` */
while ((c = fgetc(fi)) != EOF) {
    /* process character in `c` */
}

But your approach, which I might rearrange like this:

while (1) {
    c = fgetc(fi);
    if (feof(fi) || ferror(fi)) {
        /* no more input */
        break;
    }
    /* process character in c */
}

should actually work. Note that I've added a check for ferror(fi). Could it be that you have an error on input (which you're not detecting)? That would cause c to contain EOF, or the value of EOF converted to the type of c. That's doubtful, though, since it would probably give you an infinite loop.

Suggested approach: Using either an interactive debugger or added printf calls, show the value of character every time through the loop. If your input loop is working correctly, then build a stripped-down version of your program with a hard-wired sequence of calls to b_addc(), and see if you can reproduce the problem that way.

Keith Thompson
  • I'll look into that thanks. It's not my decision to use `feof()`. – Josip Dorvak Sep 25 '13 at 20:19
  • Actually, the stdio portion of his code will work fine, as he checks `feof` before using the character it read. The problem is that `b_getc` acts the same way but is *not* being checked properly. So the data read in is fine, but the output is not. – ughoavgfhw Sep 25 '13 at 20:21
  • for `b_getc` I'm not supposed to discard EOF characters. After the text is printed out a first time (with no EOF at the end), the main code appends an EOF at the end which is supposed to get displayed. What's happening though is an extra EOF appears at the end in both cases even though it never gets read into the buffer. – Josip Dorvak Sep 25 '13 at 20:27
  • output should look like this: Printing buffer contents: a cat Printing buffer contents: a catÿ. In my case another ÿ is added – Josip Dorvak Sep 25 '13 at 20:29
  • Also problem doesn't revolve around `b_getc`. Tested only returning the character 'a' and it still adds an EOF – Josip Dorvak Sep 25 '13 at 20:37
  • @JosipDorvak: change the `unsigned char character;` to `int character;` (and remove the cast and the silly feof()) – wildplasser Sep 25 '13 at 20:57
  • @wildplasser I wish I was "allowed" to remove the feof(). And the other stuff is there for a reason – Josip Dorvak Sep 25 '13 at 20:59
  • You are allowed to remove it. By me. And by @Keith Thompson. – wildplasser Sep 25 '13 at 21:00
  • @JosipDorvak: Remember this: `EOF` *is not a character*. It's a negative integer value, distinct from any character value, returned by `fgetc()` to indicate that no character was available. If you're required to store an `EOF` in a buffer, then there is something wrong with your requirements. Where exactly does this requirement to use `feof()` come from? – Keith Thompson Sep 25 '13 at 21:03
  • int b_load (FILE * const fi, Buffer * const pBD) The function loads (reads) an open input file specified by fi into a buffer specified by pB. The function must use the standard function fgetc(fi) to read one character at a time and the function ca_addc() to add the character to the buffer. If the current character cannot be put in the buffer, the function must return -2 (use the defined LOAD_FAIL constant). The operation is repeated until the standard macro feof(fi) detects end-of-file on the input file. The end-of-file character must not be added to the content of the buffer. – Josip Dorvak Sep 25 '13 at 21:04
  • @JosipDorvak: Ok, that's what the requirement says. *Why* is someone insisting using `feof(fi)` (which is a function call, not necessarily a macro) to detect end of input? It's *not* the best way to do this. – Keith Thompson Sep 25 '13 at 21:14
  • @KeithThompson beats me. I know it's a horrible way of checking. Side note, still adds it at the end. Thanks for all the help though I'll spend the next few hours with pen and paper and see if I can find anything I missed. Sucks it has to be a perfect byte by byte comparison. – Josip Dorvak Sep 25 '13 at 21:21
  • @JosipDorvak: Ok, where does this requirement come from? Is this a school assignment? Can you ask the person who told you to do this? – Keith Thompson Sep 25 '13 at 21:35
  • @KeithThompson it is a school assignment. It's part of a compiler. I did ask and he was very vague in his answer and just mentioned to look at b_load. – Josip Dorvak Sep 25 '13 at 21:38
  • I finished helping my friend with this same homework assignment. I will say that after I read @KeithThompson's answer I found what causes the extra EOF. The EOF VALUE is not caused by the prof's code; it comes from a situation that you have not taken into account and handled correctly in your code. – Jake88 Sep 22 '14 at 17:14
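
The situation ughoavgfhw's comment points at can be reproduced with a minimal sketch. Only `b_getc` is taken from the question. The fixed-size `Buffer` layout, the value `-1` for `R_FAIL_1`, and the helpers `print_unchecked` and `print_checked` are assumptions made up for illustration. The unchecked loop has the same shape as the posted `b_print`: it emits whatever `b_getc` returns and tests `eob` only afterwards, so the final `R_FAIL_1` is emitted as well.

```c
#include <stdio.h>

#define R_FAIL_1 (-1)       /* assumed value; not shown in the question */

/* Hypothetical fixed-size stand-in for the question's Buffer. */
typedef struct {
    char  ca_head[64];      /* character storage */
    short addc_offset;      /* number of characters stored */
    short getc_offset;      /* next position to read */
    int   eob;              /* set once a read runs past the end */
} Buffer;

/* b_getc as posted in the question. */
char b_getc(Buffer * const pBD)
{
    if (pBD->getc_offset == pBD->addc_offset) {
        pBD->eob = 1;
        return R_FAIL_1;
    }
    pBD->eob = 0;
    return pBD->ca_head[(pBD->getc_offset)++];
}

/* Same loop shape as the posted b_print: the value is written before
 * eob is tested, so the trailing R_FAIL_1 is written and counted too. */
int print_unchecked(Buffer * const pBD, FILE *out)
{
    int n = 0;

    pBD->getc_offset = 0;
    pBD->eob = 0;
    while (!pBD->eob) {
        fputc(b_getc(pBD), out);
        ++n;
    }
    return n;
}

/* Checked variant: eob is tested after the call, before the value
 * is used, so only stored characters are written. */
int print_checked(Buffer * const pBD, FILE *out)
{
    int n = 0;

    pBD->getc_offset = 0;
    pBD->eob = 0;
    for (;;) {
        char c = b_getc(pBD);
        if (pBD->eob)
            break;
        fputc(c, out);
        ++n;
    }
    return n;
}
```

With "a cat" stored (`addc_offset` of 5), `print_unchecked` writes 6 bytes, the last being `R_FAIL_1` converted to a byte, while `print_checked` writes 5.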
-1

There you go ...

int b_load(FILE * const fi, Buffer * const pBD){
    int character; /*Variable to hold read character from file*/
    Buffer * tempBuffer; /*Temporary Buffer * to prevent destruction of main Buffer*/
    short num_chars; /*Counter of the amount of characters read into the buffer*/

    /*Infinite loop that breaks WHEN EOF is read*/
    for(num_chars = 0; 1; num_chars++) {

        character = fgetc(fi);
        if (character == EOF || feof(fi)) break; // since you insist on the silly feof() ...

        tempBuffer = b_addc(pBD, (unsigned char) character);
        if(tempBuffer == NULL) return LOAD_FAIL;
    }
    return num_chars;
}
wildplasser
  • Doesn't matter. I know what he tried to do. Although it doesn't work, this problem's just getting really annoying now – Josip Dorvak Sep 25 '13 at 23:29
  • @EJP: it was explained enough. The OP is stubborn enough to insist on his `unsigned char character` and his `( ! feof()) {}` – wildplasser Sep 26 '13 at 07:46