
I'm writing a program that deciphers sentences, syllables, and words given in a basic text file.

The program cycles through the file character by character. It first checks whether the character is an end-of-sentence marker, such as ! ? : ; or a period. Then, if the character is not a space or tab, it assumes it is part of a word. Finally, if the character is a space or tab and the character before it was a valid letter (i.e. not an end-of-sentence marker), it counts a word.

I was a bit light on the details, but here is the problem I have: my word count is equal to my sentence count. In other words, the program only recognizes that a word ends when it reaches an end-of-sentence marker. The real problem is that spaces are being treated as valid letters.

Here's my if statement that decides whether the character in question is a valid letter in a word:

else if(character != ' ' || character != '\t')

I've already ruled out end-of-sentence markers by that point in the program (in the original if, actually). According to an ASCII table, 32 should be the space character. However, when I output all of the characters that make it into that block of code, spaces are among them.

So what am I doing wrong? How can I stop spaces from getting through this if?

Thanks in advance, and I have a feeling the question may be a bit vague, or poorly worded. If you have any questions or need clarification, let me know.

dmckee --- ex-moderator kitten
Blackbinary

3 Answers


You should not rely on actual numbers for characters: that depends upon the encoding your platform uses, and may not be ASCII. You can check for any particular character by simply testing against it. For example, to test if c is a space character:

if (c == ' ')

will work, is easier to read, and is portable.

If you want to skip all white-space, you should use #include <ctype.h> and then use isspace():

if (isspace((unsigned char)c))

Edit: As others said, your condition to check for "not a space" is wrong, but the above point still applies. So, your condition can be replaced by:

if (!isspace((unsigned char)c))
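
As a rough illustration of how that check could fit into the word counter described in the question, here is a minimal sketch. The names count_words and is_sentence_end are hypothetical (they are not from the original program), and it assumes the whole text is already in a null-terminated string and that a word ends at whitespace or at one of the end-of-sentence markers listed in the question:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: the end-of-sentence markers named in the question. */
static int is_sentence_end(char c)
{
    return c == '.' || c == '!' || c == '?' || c == ':' || c == ';';
}

/* Hypothetical word counter: a word ends when whitespace or an
 * end-of-sentence marker follows at least one other character. */
size_t count_words(const char *text)
{
    size_t words = 0;
    int in_word = 0; /* nonzero while we are inside a word */

    for (const char *p = text; *p != '\0'; ++p) {
        char c = *p;
        if (is_sentence_end(c) || isspace((unsigned char)c)) {
            if (in_word) {
                ++words; /* the previous character closed a word */
            }
            in_word = 0;
        } else {
            in_word = 1; /* any other character is treated as a letter */
        }
    }
    if (in_word) {
        ++words; /* text ended in the middle of a word */
    }
    return words;
}
```

Under these assumptions, count_words("Hi there. How are you?") returns 5. The cast to unsigned char matters because passing a negative char value to isspace() is undefined behavior.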
Alok Singhal

I note that

(character != 32 || character != 9)

is always true, because if the character is 32 it is not 9 (and vice versa), and true OR false is true.

You probably mean

(character != ' ' && character != '\t')
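
To see the difference concretely, here is a small sketch that puts both conditions side by side. The function names are made up for illustration; only the two conditions come from this thread:

```c
/* Hypothetical wrapper around the original condition from the question.
 * No character can equal both ' ' and '\t', so at least one of the two
 * inequalities always holds and the result is always 1. */
int buggy_is_letter(char character)
{
    return character != ' ' || character != '\t';
}

/* Hypothetical wrapper around the corrected condition: the character must
 * differ from a space AND from a tab. */
int fixed_is_letter(char character)
{
    return character != ' ' && character != '\t';
}

/* The same test written via De Morgan's law: "not (space or tab)". */
int fixed_is_letter_demorgan(char character)
{
    return !(character == ' ' || character == '\t');
}
```

With these, buggy_is_letter(' ') evaluates to 1 (which is how spaces slip through), while fixed_is_letter(' ') and fixed_is_letter('\t') both evaluate to 0. Some compilers may warn that the first condition is always true.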
dmckee --- ex-moderator kitten
  • Using && instead of || fixed my problem; the program now proceeds how it should. I do have another problem though, and you can see the edit on the original post for more details – Blackbinary Feb 05 '10 at 15:40
  • @Thomas: Because it was in the original code and---having spotted the logic error and the string literal thing---I had gotten busy typing and stopped thinking. Basic cut-n-paste error. Thanks. – dmckee --- ex-moderator kitten Feb 05 '10 at 23:03

It would probably be better to just compare against the specific characters you consider whitespace, and to use && to combine the tests:

if ((character != ' ') &&
    (character != '\t'))
Mark Synowiec
  • Yes, I know that is a valid way. I tried this before the other way, actually. But regardless of how I tell it to avoid characters that are spaces or tabs, it does not. – Blackbinary Feb 05 '10 at 15:35
  • @Blackbinary: Because you're checking the wrong thing: you can do: `if (c != ' ' && c != '\t')` etc., and it would work. – Alok Singhal Feb 05 '10 at 15:38
  • I agree with Alok. I didn't think about the code, but every character is always going to be != ' ' OR != '\t'. I'll update my code; I didn't catch that issue – Mark Synowiec Feb 05 '10 at 15:52