Many posts cover the problem of using fgetc after fscanf in C, where a leftover \n gets in the way, but I am seeing a different issue when using them in the reverse order: fscanf after fgetc.

When I call fscanf after fgetc, I get a different fscanf result than when I omit fgetc entirely (in the example script, by hard-coding num=1000 and commenting out the block that uses fgetc).

I can reproduce the correct fscanf result while still using fgetc if I reopen the file and assign the result to the myFile variable again, as in the script below. Removing that line produces the different, incorrect fscanf result.

What is causing the difference in the fscanf-result when using fgetc first, and how can I address the issue?

/* First read tab-delimited 4-digit int data from a text file,
 * parsing into an array of the correct size num, then compute
 * the mean of the array while avoiding overflow. */

#include <stdio.h>
#include <stdlib.h>

int main(){
    FILE *myFile;
    myFile = fopen("int_values.txt", "r");

    int c=0, i=0;
    float mean = 0.0;

    // Identifying there are 1000 values in the tab-delimited file.
    // Using fgetc
    int num = 1;
    while ((c=fgetc(myFile)) != EOF ){
        if (c == '\t')
            ++num;
    }

    int arr[num]; // Array of correct size for all values from file.

    // Reopening the file (the first stream is never closed) since
    // fgetc seems to break fscanf; removing this line gives the wrong result.
    myFile = fopen("int_values.txt", "r");

    // Read and store each value from file to array.
    for (i=0; i<num; i++){
        fscanf(myFile, "%d ", &arr[i]);
    }
    // Compute average cumulatively to avoid overflow.
    mean = arr[0];
    for (i=1; i<num; i++){
        //printf("In the %dth place, arr has value %d.\n", i, arr[i]);
        mean *= (float)i / (float)(i+1);
        mean += arr[i] / (float)(i+1);
    }
    
    fclose(myFile);

    printf("The overall mean of the %d values in the file is %f.\n\n", num, mean);
    return 0;
}
foam78
  • Please show minimal versions of your files, this is part of a [mcve] – Jabberwocky Aug 17 '21 at 13:06
  • Also you use `fgetc` on `int_values.txt` and then you use `fscanf` on `seal_weights.txt`. And you don't fclose `myFile` after the while. Are you sure the code you posted is the code you're having issues with? – Jabberwocky Aug 17 '21 at 13:09
  • Hint: don't put all your code in main(){}. Create separate functions to read file1 and file2. Test them and call them and use their results. – wildplasser Aug 17 '21 at 13:26
  • `int arr[num]; // Array of correct size for all values from file.` is questionable as the number of tabs can differ quite a bit from the number of integers. Robust code does not assume input file is well formatted. – chux - Reinstate Monica Aug 17 '21 at 14:57
  • Please [edit] and explain why you're reading from two different files int_values.txt and seal_weights.txt. This doesn't make much sense. – Jabberwocky Aug 17 '21 at 15:30
  • @Jabberwocky: it's probably a 'conversion to SO question' oversight. There is a strong argument (IMO) that you should never pass a string literal as the file name to `fopen()` or its kin; you should always have a variable so you can pass the variable to the error reporting code too. Of course, this code doesn't check that `fopen()` succeeds — that's a separate bug. For an old-timer like me, the use of `fgetc()` grates, but it is actually reasonable these days. The multi-threaded locking on the file stream that POSIX demands means that the macro form `getc()` doesn't provide much benefit. – Jonathan Leffler Aug 17 '21 at 15:53
  • There *might* be something interesting to learn about `fgetc` and `fscanf` and why you can't call one after the other in this case. Or you could duck those issues (which are frustrating and obstinate and never end) and just follow this good, general rule: *don't*! When doing input in C, either use (f)scanf for everything, or use some combination of getchar/getc/fgets/getline for everything, but *don't try to mix them*. You *could* maybe get it to work, eventually, but life's too short, it's just not worth it. – Steve Summit Aug 17 '21 at 16:05
  • Thank you for all your very helpful comments. I am very sorry but @Jonathan Leffler is correct that both file names should have been ‘int_values.txt’, an oversight in converting for S/O — I have now edited this. Especially thank you @user3121023 as I believe `rewind` does indeed solve the issue I had in mind! – foam78 Aug 17 '21 at 18:55
  • You need to check return values of functions you call. Every. Single. Time. fopen? Check, report errors. fscanf? Check, report errors. fgetc? Check, report errors. Does it return a value? Check, report errors. Do that, and save yourself many many MANY hours of frustration. – n. m. could be an AI Aug 17 '21 at 18:57
  • And on a possibly related note, you do not need an array to read a file and average numbers in it. (If you go over your input sequentially once, looking at a single number at a time, you do not need an array). Consequently, you do not need to open the file twice, or count tabs, or use fgetc. Just read a number, add it to the running tally, and forget it. Do that while there are numbers left in the file. – n. m. could be an AI Aug 17 '21 at 19:03
  • Instead of summing all the numbers, prone to numeric instability, you might want to check out, _eg_, [Welford's on-line algorithm](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm). – Neil Aug 17 '21 at 19:57
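
Several of the comments above (check every return value, skip the array, update the mean incrementally) can be combined into one small routine. The following is only a sketch under assumptions: `running_mean` is a hypothetical helper name, not something from the question, and it implements just the mean part of the incremental update, not the full Welford variance computation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads whitespace-separated ints from an open
 * stream and returns their mean, updated one value at a time.
 * The number of values read is stored in *count.
 * Returns 0.0 with *count == 0 if no int could be read. */
double running_mean(FILE *fp, size_t *count)
{
    assert(fp != NULL && count != NULL);

    double mean = 0.0;
    size_t n = 0;
    int value;

    /* fscanf returns 1 only when one int was converted, so the loop
     * stops at end-of-file and also at the first non-numeric input. */
    while (fscanf(fp, "%d", &value) == 1) {
        ++n;
        mean += (value - mean) / (double)n;  /* incremental mean update */
    }

    *count = n;
    return mean;
}
```

A caller would open the file, check that `fopen` did not return NULL, pass the stream to `running_mean`, and then inspect the count before trusting the mean. No array and no second pass over the file are needed for the average alone.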

1 Answer

Identifying there are 1000 values in the tab-delimited file.

Do not count tabs. Instead, read the ints. It is far too easy for the number of tabs to not match the number of ints.

Sum the int into a long long to avoid overflow. Use double for generic floating-point math.

#include <stdio.h>
#include <stdlib.h>

int main(void) {
  FILE *myFile = fopen("seal_weights.txt", "r");
  if (myFile) {
    int num = 0;
    long long sum = 0;
    int value;
    // fscanf returns 1 on success, EOF on end-of-file, else 0 on non-numeric input
    while (fscanf(myFile, "%d", &value) == 1) {
      sum += value;
      num++;
    }
    double mean = num ? (double) sum / num : 0.0;
    printf("The overall mean of the %d values in the file is %f.\n\n", num,
        mean);

    // read in again and save values if truly desired.
    // This step not needed to compute average.
    rewind(myFile);
    int i;
    int arr[num];
    for (i = 0; i < num; i++) {
      if (fscanf(myFile, "%d", &arr[i]) != 1) {
        break;
      }
    }
    // Use arr[] in some way.

    fclose(myFile);
  }
  return 0;
}
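
As for why fscanf seemed to misbehave after the fgetc loop: the behaviour described in the question is consistent with the stream simply being left at end-of-file once fgetc has consumed every character. From that position fscanf has nothing to read, returns EOF, and leaves the array elements unset. If counting with fgetc first is still wanted, a call to `rewind` on the same stream should be enough, with no second `fopen`. Below is a minimal sketch of that idea; `count_then_read` is a hypothetical helper name, and it still assumes the tab count is a usable upper bound on the number of values, which the advice above warns against relying on.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts tab-separated fields with fgetc, then
 * rewinds the same stream and reads up to max ints into out[] with
 * fscanf. Returns the number of ints actually stored. */
size_t count_then_read(FILE *fp, int *out, size_t max)
{
    assert(fp != NULL && out != NULL);

    size_t fields = 1;
    int c;
    while ((c = fgetc(fp)) != EOF) {
        if (c == '\t')
            ++fields;
    }

    /* The stream is now at end-of-file. Without this rewind, the first
     * fscanf call below would return EOF and store nothing. */
    rewind(fp);

    size_t stored = 0;
    while (stored < fields && stored < max
           && fscanf(fp, "%d", &out[stored]) == 1) {
        ++stored;
    }
    return stored;
}
```

Comparing the returned count with the expected number of fields is a cheap way to notice a malformed file, since the loop stops at the first value that fscanf cannot convert.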
Jonathan Leffler
chux - Reinstate Monica