
I've come across this example of getword. I understand all the checks, but I have a problem with ungetc.

When c satisfies if ((!isalpha(c)) || c == EOF) and does not satisfy while (isalnum(c)), that is, it is neither a letter nor a digit, ungetc rejects that char.

Let's suppose it is '\n'.

Then execution reaches return word, but that character cannot be returned since it was never saved in the array. What happens then?

    while (isalnum(c)) {
        if (cur >= size) {
            size += buf;
            word = realloc(word, sizeof(char) * size);

        }
        word[cur] = c;
        cur++;
        c = fgetc(fp);
    }
    if ((!isalpha(c)) || c == EOF) {
        ungetc(c, fp);          
    }
    return word;

EDIT @Mark Byers - thanks, but that c was rejected for a reason. Won't it fail the condition again and again, in an infinite loop?

Peter Cerba
  • This is a `getword` function, not a `getwordahdonemorecharacterwhateveritmaybe`. It reads until it encounters a character that isn't alphanumeric. It then puts that character back into the stream and returns. Presumably it returns a character pointer, but you omitted the function declaration so I'm not 100% sure. The `!isalpha(c)` in the if statement after the loop is equivalent to true, because the character will never be alphabetic (if it was, the loop wouldn't have broken). Unless the loop is capable of breaking during error handling. – Wug Aug 28 '12 at 20:13

3 Answers


ungetc pushes a character back onto the stream so that the next read will return that character again.

ungetc(c, fp);  /* Push the character c onto the stream. */
/* ...etc... */
c = fgetc(fp);  /* Reads the same value again. */

This can sometimes be convenient if you are reading characters to find out when the current token is complete, but aren't yet ready to read the next token.
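As a minimal sketch of that push-back behaviour, the helper below peeks at the next character without consuming it. The name `peek_char` is hypothetical and does not appear in the question's code; it only illustrates the pattern of `fgetc` followed by `ungetc`.

```c
#include <stdio.h>

/* Hypothetical helper: look at the next character without consuming it. */
int peek_char(FILE *fp)
{
    int c = fgetc(fp);      /* int, so that EOF stays distinguishable */

    if (c != EOF)
        ungetc(c, fp);      /* put it back; the next read returns it again */
    return c;
}
```

Calling `peek_char(fp)` twice in a row yields the same character, and a following `fgetc(fp)` returns it once more before the stream moves on.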

Mark Byers
  • @PeterKowalski: There is no infinite loop in the code you showed. If there is an infinite loop, then it is probably a bug in the code that is calling this function. – Mark Byers Aug 28 '12 at 20:24
  • The program is working correctly. I am just imagining arbitrary input and the procedure that runs when it is read. – Peter Cerba Aug 28 '12 at 20:27

The terminal condition, just before the line you don't understand, is not good. It should probably be:

int c;

...

if (!isalpha(c) && c != EOF)
    ungetc(c, fp);

This means that if the last character read was a real character (not EOF) and wasn't an alphabetic character, push it back for reprocessing by whatever next uses the input stream fp. That is, suppose you read a blank; the blank will terminate the loop and the blank will be pushed back so that the next getc(fp) will read the blank again (as would fscanf() or fread() or any other read operation on the file stream fp). If, instead of blank, you got EOF, then there is no attempt to push back the EOF in my revised code; in the original code, the EOF would be pushed back.

Note that c must be an int rather than a char.
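A small sketch of the revised condition in context, assuming a hypothetical helper `skip_alnum_run` that is not part of the original code: it consumes a run of alphanumeric characters, pushes back the character that ended the run unless that was EOF, and keeps `c` as an `int` so EOF cannot be confused with a real character.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: consume a run of alphanumerics and return its length. */
size_t skip_alnum_run(FILE *fp)
{
    size_t n = 0;
    int c;                       /* int, not char: it must be able to hold EOF */

    while ((c = fgetc(fp)) != EOF && isalnum(c))
        n++;
    if (c != EOF)
        ungetc(c, fp);           /* only a real character is pushed back */
    return n;
}
```

After the call, the terminating character (a blank, a newline, and so on) is still the next thing any read on `fp` will see. At end of input nothing is pushed back.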

Jonathan Leffler
  • I have it defined as `int`. Please see my edit, because in my opinion reading e.g. `'\n'` would end up in an infinite loop. – Peter Cerba Aug 28 '12 at 20:25
  • @PeterKowalski: Why? On the first iteration, if it reads newline as the first character, the loop terminates, the newline is pushed back, the word is empty, and the code (not shown) that deals with the non-word part of the input (perhaps the code that skips white space?) will deal with it. If it is the second or subsequent character, then the loop terminates and the word is not empty, and the newline is pushed back for the non-word part of the code to deal with. Note that I (had to) hypothesize code to read the non-word characters. And I see from your answer that you worked this out too. – Jonathan Leffler Aug 28 '12 at 21:05

OK. Now I understand why the case with e.g. '\n' was troubling me. I'm just dumb and forgot about the section in main() that calls getword. Of course, before calling getword there are a couple of tests (with another ungetc there), and main() fputs the characters that do not satisfy isalnum. It follows that the while loop in getword always starts with at least one character for which isalnum is true, and the check at the end only concerns the characters that follow.
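The division of labour described here might look roughly like the sketch below. It is a reconstruction under assumptions, not the original program: the full `getword` and `main()` were never posted, so the buffer growth policy, the error handling, and the caller-side helper `skip_separators` are all hypothetical.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical reconstruction: read one alphanumeric word from fp.
   Returns a malloc'd, NUL-terminated string (possibly empty), or NULL
   on allocation failure. The caller frees it. */
char *getword(FILE *fp)
{
    size_t size = 16, cur = 0;
    char *word = malloc(size);
    int c;

    if (word == NULL)
        return NULL;
    while ((c = fgetc(fp)) != EOF && isalnum(c)) {
        if (cur + 1 >= size) {           /* keep room for the '\0' */
            char *tmp = realloc(word, size * 2);
            if (tmp == NULL) {
                free(word);
                return NULL;
            }
            word = tmp;
            size *= 2;
        }
        word[cur++] = (char)c;
    }
    if (c != EOF)
        ungetc(c, fp);                   /* leave the terminator for the caller */
    word[cur] = '\0';
    return word;
}

/* Hypothetical caller-side step: consume everything that cannot start
   a word. Returns 1 if a word follows, 0 at end of input. */
int skip_separators(FILE *fp)
{
    int c;

    while ((c = fgetc(fp)) != EOF && !isalnum(c))
        ;                                /* a real caller might echo these */
    if (c == EOF)
        return 0;
    ungetc(c, fp);                       /* the first word character goes back */
    return 1;
}
```

Because the caller consumes the pushed-back '\n' before asking for the next word, alternating `skip_separators` and `getword` always makes progress, so the feared infinite loop cannot occur.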

Peter Cerba