-2

I am trying to test whether a character read from file.txt is a space (' ') using this code:

char *Appartient (FILE *f, char *S)
{
    int i = 0, nbdechar = 0, nbocc = 0, PosdePremierChar, space = 0;
    char c;
    while ((c = getc(f)) != EOF) {
        PosdePremierChar = ftell(f);
        if (c == S[0]) {
            nbdechar = 0;
            for (i = 1; i < strlen(S); i++) {
                c = getc(f);
                if (c == S[i]) {
                    nbdechar++;
                }
            }
            if (nbdechar == strlen(S) - 1) {
                nbocc++;
            } else {
                rewind(f);
                fseek(f, PosdePremierChar - 1, SEEK_CUR);
                while ((c = getc(f)) != ' ');
            }
        } else {
            while ((c = getc(f)) != ' ') {
                space++;
            }
        }
    }
    printf("\n Le nb d'occurence est %d", nbocc);
    if (nbocc == 0) {
        return "false";
    } else {
        return "true";
    }
}

but a weird symbol, 'ے', appears like garbage when I inspect the variable 'c' in my debugger:

[Screenshot of debugger showing that variable 'c' has the value -1, which is interpreted as the character 'ے']

What is wrong?

zwol
Programmer
  • Absolutely unclear what you are asking! What is debug mode? How does the character appear, by itself? I doubt it. You didn't post the code that prints the character, so the posted code is of no use at all; it's impossible to know whether `f` was opened or not. The only thing that is very evident in the posted code is that it has ugly formatting. – Iharob Al Asimi May 24 '15 at 14:30
  • Debug mode lets me see the value of c on every iteration of the while loop, step by step. The question is why a space character is not treated as a space character :) – Programmer May 24 '15 at 14:33
  • The "weird symbol" is [U+06D2 ARABIC LETTER YEH BARREE](http://www.fileformat.info/info/unicode/char/06d2/index.htm). It is possible for that character to be in your file, but normally it is *not* possible for `getc` to set a variable to 0x06D2. We need to see your *entire program* and we also need to know what OS, compiler, and IDE (if any) you are using, and *exactly* how this "debug mode" works - by changing the code? single-stepping in a debugger? (Which debugger?) And finally it would be helpful to see a file that provokes this problem. – zwol May 24 '15 at 14:39
  • @zwol that is not Arabic, it's Persian; it's not the same. I see that the link claims it's Arabic. – Iharob Al Asimi May 24 '15 at 14:40
  • We also need to see the declaration of c. Is it an int, a char, a wchar_t, something funny? – Jens May 24 '15 at 14:43
  • I posted the code in a comment, please check it. – Programmer May 24 '15 at 15:08
  • @iharob fileformat.info and I give the official Unicode name for the character; it is my understanding that all the letters in the family of related scripts used to write Arabic, Persian, Urdu, etc. are uniformly labeled "ARABIC LETTER xxx" by the Unicode standard. I will take your word for it that this particular one is only used to write Persian. – zwol May 24 '15 at 22:06
  • @Programmer Thank you for providing enough code to make the problem clear. I have reformatted it for you so that we can read it. Proper code formatting is *critical* to readability. – zwol May 24 '15 at 22:23
  • @zwol that makes sense, since I am almost sure that their origin is the origin of the Arabic language. – Iharob Al Asimi May 24 '15 at 22:23

3 Answers

3

This could be the result of converting EOF, the end-of-file value returned by getc() (which the standard requires to be a negative int, often -1), to a char.

Note that your loop never terminates if there's no space in the file, since EOF != ' ' and that condition keeps being true after you hit end-of-file for the first time.
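To illustrate the termination problem, here is a minimal sketch of a "skip to the next space" loop that also stops at end-of-file. The function name skip_past_space is hypothetical (it is not part of the question's code), and the sketch treats only ' ' as a word separator, as the question does.

```c
#include <stdio.h>

/* Hypothetical helper: consumes characters up to and including the
   next space. Returns ' ' if a space was found, or EOF if the input
   ended (or a read error occurred) first. */
int skip_past_space(FILE *f)
{
    int c;  /* int, so that EOF stays distinct from every byte value */

    /* Test for EOF first: once getc() returns EOF it keeps returning
       EOF, so a loop that checks only for ' ' would never end. */
    while ((c = getc(f)) != EOF && c != ' ')
        ;   /* discard the character */

    return c;
}
```

A caller can then stop its own outer loop as soon as skip_past_space() returns EOF instead of calling getc() again.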

Iharob Al Asimi
Jens
  • There are spaces in the file. I am not trying to test for EOF; I am trying to test for the space character that tells me a new word is about to appear. – Programmer May 24 '15 at 14:38
  • @Programmer but you must test for `EOF`, because if you don't then the loop will never end. – Iharob Al Asimi May 24 '15 at 14:39
  • @programmer Then you should show us your actual program, not something you made up. There's no statement printing `c='x'` anywhere. And you should test the return from `fopen()` as well because it fails easily. – Jens May 24 '15 at 14:41
  • @Jens What you said is true but I would expect the unexpected character to be 'ÿ' ([U+00FF LATIN SMALL LETTER Y WITH DIAERESIS](http://www.fileformat.info/info/unicode/char/00ff/index.htm)) in that case. – zwol May 24 '15 at 14:41
  • @zwol Why? Do you know what encoding the OP is using? – Iharob Al Asimi May 24 '15 at 14:41
  • @Jens I suspect that the `c='x'` thing is from the debugger window, which could have an encoding where `-1` is the Persian character shown. – Iharob Al Asimi May 24 '15 at 14:42
  • @iharob I was assuming 'ے' would only appear if the actual Unicode codepoint value (0x0006D2) had somehow gotten into the variable, so this *wasn't* a case of the usual EOF-truncation problem, but apparently [Windows codepage 1256](https://en.wikipedia.org/wiki/Windows-1256) really does assign byte 255 to that character, and the OP is using that codepage for his or her debugger's UI. Ya learn something new every day. – zwol May 24 '15 at 22:12
1

Modify your code like this, trace it and you might become enlightened regarding the relation between what getc() returns and how this correlates to chars:

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
  int result = EXIT_SUCCESS;

  FILE *f = fopen("test.txt", "r");
  if (NULL == f)
  {
    perror("fopen() failed");
    result = EXIT_FAILURE;
  }
  else
  {
    int ch = EOF;  /* int, not char: must hold every byte value and EOF */

    while (EOF != (ch = getc(f)))
    {
      char c = (char)ch;  /* narrow only after EOF has been ruled out */

      printf("\n%d is 0x%02x is '%c'", ch, (unsigned)ch, c);
      if (' ' == c)
      {
        printf(" is space ");
      }
    }

    /* Here ch holds EOF, a negative int (commonly -1). */
    printf("\nread EOF = %d = 0x%x\n", ch, (unsigned)ch);

    fclose(f);
  }

  return result;
}
alk
0

You didn't test whether f was opened. If it wasn't, using it is undefined behavior. Check that the file opened:

FILE *file;
int   chr;

if ((file = fopen("test.txt", "r")) == NULL)
 {
    fprintf(stderr, "Cannot open `test.txt'\n");
    return -1;
 }

while (((chr = fgetc(file)) != EOF) && (chr == ' '))
    printf("space\n");

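You should declare chr as an int, because fgetc() returns an int. EOF is a negative int chosen to be distinct from every valid character value (which fgetc() returns as an unsigned char converted to int), so a char cannot reliably hold it.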
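A small sketch of what goes wrong when the result is narrowed before the comparison. The two function names are hypothetical, and the early-stop behavior assumes the common case where EOF is -1 and the byte 0xFF becomes -1 when converted to signed char; both are typical but implementation-defined.

```c
#include <stdio.h>

/* Counts bytes until end-of-file, keeping getc()'s result in an int.
   Every byte value 0..255 stays distinct from EOF. */
long count_bytes_int(FILE *f)
{
    long n = 0;
    int c;

    while ((c = getc(f)) != EOF)
        n++;
    return n;
}

/* Same loop, but the result is narrowed to signed char first.
   On typical platforms the byte 0xFF then compares equal to EOF,
   so the loop stops early at that byte. */
long count_bytes_narrow(FILE *f)
{
    long n = 0;
    signed char c;

    while ((c = (signed char)getc(f)) != EOF)
        n++;
    return n;
}
```

If plain char is unsigned on a given platform, the narrowed variable can never equal a negative EOF and the loop never ends, which is the other face of the same bug.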

Also, debug mode is useful for tracking the values of variables. I bet it can give you the value in ASCII, decimal, or hex, as needed, if you know how to ask.

Iharob Al Asimi
  • What is the difference between fgetc and getc when I use a file? Also, I changed c to an int, but now it compares the ASCII code of the character with another character... and now it's an infinite loop. – Programmer May 24 '15 at 15:07
  • It's an [implementation detail](http://man7.org/linux/man-pages/man3/ungetc.3.html) that is pretty irrelevant to your problem. – Iharob Al Asimi May 24 '15 at 15:09
  • Thanks a lot, but how do I make the compiler test the character against a space? – Programmer May 24 '15 at 15:14
  • What do you mean? The compiler doesn't care about that; it's your program that should care. – Iharob Al Asimi May 24 '15 at 15:19
  • I just want to know why the space character is not tested like other characters. Please help me. – Programmer May 24 '15 at 15:21
  • You did not understand anything from the answer. It is tested; it's just that when `fgetc()` or `getc()`, whichever, returns `EOF`, the test `c != ' '` is true, and it keeps returning `EOF` forever. Therefore you **MUST** test for `EOF` too, not just for `' '`. Test for both, not just one of them. – Iharob Al Asimi May 24 '15 at 15:24
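Putting the advice from this thread together, here is one possible sketch of a word counter that tests for both EOF and the separator on every read. It is not the asker's function: the name count_word is hypothetical, matching whole words (rather than prefixes) is an assumption, and treating '\n' as a separator in addition to ' ' is also an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the words in f that are exactly equal
   to `word`. Words are separated by ' ' or '\n' (an assumption). */
int count_word(FILE *f, const char *word)
{
    size_t len = strlen(word);
    size_t matched = 0;  /* leading characters of the current word that match */
    int mismatch = 0;    /* set once the current word differs from `word` */
    int in_word = 0;     /* nonzero while inside a word */
    int count = 0;
    int c;               /* int, so EOF is distinct from every byte value */

    for (;;) {
        c = getc(f);
        if (c == EOF || c == ' ' || c == '\n') {
            /* End of a word: count it if every character matched. */
            if (in_word && !mismatch && matched == len)
                count++;
            if (c == EOF)
                break;   /* stop here; getc() would keep returning EOF */
            matched = 0;
            mismatch = 0;
            in_word = 0;
        } else {
            in_word = 1;
            if (!mismatch && matched < len &&
                c == (unsigned char)word[matched])
                matched++;
            else
                mismatch = 1;
        }
    }
    return count;
}
```

Because the file is read in a single pass, no ftell(), rewind() or fseek() calls are needed, and the only exit from the loop is the EOF test.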