I'm writing a function that counts the number of words in a file. Words may be separated by any number of whitespace characters. The file can contain integers, but the program should only count words that have at least one alphabetic character.

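#include <ctype.h>
#include <stdio.h>

/* Parser states: outside or inside a word (definitions assumed, not shown in the original post) */
enum { OUT, IN };

int word_count(const char *filename)
{
    int ch;
    int state;
    int count = 0;
    FILE *fileHandle;
    if ((fileHandle = fopen(filename, "r")) == NULL){
        return -1;   /* could not open the file */
    }

    state = OUT;
    while ((ch = fgetc(fileHandle)) != EOF){
        if (isspace(ch))
            state = OUT;
        else if (state == OUT){
            /* first non-whitespace character of a new word */
            state = IN;
            ++count;
        }
    }

    fclose(fileHandle);

    return count;
}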

I figured out how to deal with whitespace, but I don't know how to avoid counting tokens that don't have at least one alphabetic character (I know about isalpha and isdigit, but I have difficulty understanding how to use them in my case).

I would really appreciate your help.

Drew McGowen
caddy-caddy
    Separate out each word by testing characters using `isalnum()` which will keep alphabet and numerics. Then test the word to see if it has at least one alphabetical character by using `isalpha()`. – Weather Vane Apr 21 '15 at 12:59
    To avoid the two passes proposed by @WeatherVane, you can use a single main parse to separate words. When you start parsing a new word, set a flag to false. During parsing, do `flag |= isalpha(c)`. When you reach the end of a word, increment the count only if the flag is set. – Ôrel Apr 21 '15 at 13:03
  • @Ôrel I was just keeping it simple - separating the tasks. – Weather Vane Apr 21 '15 at 13:10

1 Answer

You can just replace:

else if (state == OUT){

with:

else if (state == OUT && isalpha(ch)){

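This way the state switches to IN at the first alphabetic character of a token, and the token is counted once. A token made only of digits never leaves OUT, so it is not counted. Be aware that something like last.First is still counted as a single word. If punctuation should separate words as well, consider using (!isalnum(ch)) instead of (isspace(ch)).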
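For reference, here is a minimal sketch of the whole function with that one-line change applied. The `OUT`/`IN` definitions and the helper name `count_alpha_words` are assumptions for illustration (the question does not show how the states are defined, and the helper only exists to make the loop easy to try on any open stream):

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Assumed state definitions; the original post does not show them. */
enum { OUT, IN };

/* Hypothetical helper: counts whitespace-separated tokens in an open
   stream that contain at least one alphabetic character. */
int count_alpha_words(FILE *fp)
{
    int ch;
    int state = OUT;
    int count = 0;

    assert(fp != NULL);

    while ((ch = fgetc(fp)) != EOF) {
        if (isspace(ch)) {
            state = OUT;            /* the current token has ended */
        } else if (state == OUT && isalpha(ch)) {
            state = IN;             /* first letter seen in this token */
            ++count;
        }
        /* Digits and punctuation do not change the state, so "123" is
           never counted, while "12ab" is counted once, at 'a'. */
    }
    return count;
}

/* Same interface as in the question: returns -1 if the file cannot be opened. */
int word_count(const char *filename)
{
    FILE *fp = fopen(filename, "r");
    int count;

    if (fp == NULL)
        return -1;
    count = count_alpha_words(fp);
    fclose(fp);
    return count;
}
```

With input such as `hello 123 a1 42x 7 world`, the tokens `hello`, `a1`, `42x` and `world` are counted and `123` and `7` are skipped.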

enrico.bacis