2

I'm running into a problem with reading a txt file within a Windows NT 4.0 dll file. Before you ask, I'm not currently interested in migrating this to a new OS. I just want to fix this one issue and let others after me worry about migrating this super-legacy software.

The problem occurs when I read a txt file using fscanf, as shown:

infile_ptr = fopen("c:\\LumaGem\\orbit.txt", "r");
byteoffset=0;
while(!feof(infile_ptr) )
    {   
        r=0.0; s1=0.0; s2=0.0; e1=0.0; e2=0.0; e3=0.0; d=0.0; f=0.0;
        fseek(infile_ptr, byteoffset, SEEK_SET);
        fscanf(infile_ptr,"%7lf %7lf %7lf %7lf %7lf %7lf %7lf %7lf", &r, &s1, &s2, &e1, &e2, &e3, &d, &f);

        byteoffset=0; byteoffset = ftell(infile_ptr);
     }
fclose(infile_ptr);  

The txt file, created with MATLAB, consists of 128 rows of 8 columns separated by 5 spaces and formatted like so within MATLAB:

fprintf(fid,'%7.3f     %7.3f     %7.3f     %7.3f     %7.3f     %7.3f     %7.3f     %7.3f \n', variables);

This code was not written by me and worked for several years. However, we recently had to rebuild/reinstall the Windows NT 4.0 OS and software, and now I get a strange error. The program reads the txt file fine using the code provided at the top until it gets to line 123, at which point it reads the 8th column twice, causing all of the subsequent variables to be shifted by one position and completely screwing up the last few lines of the program. Interestingly enough, this problem can be overcome by manually copying and pasting the first 123 lines in bulk to a new txt file, then the last several lines one-by-one into the same new txt file, and using that as the input (copying done on the NT machine within WordPad). Doing so eliminates this double-read issue. I have no idea what could cause this error and yet be fixed by such a weird/clunky method. The problem happens with new and old inputs, so I don't think the input files are the issue since they haven't changed.

Oh, and additionally, if I change the number of spaces between each column in the txt file, the location of the error shifts. Reducing it to 1 space causes the error to occur at line 120 or so, while increasing the number of spaces (tried 7 instead of 5) pushed the error down to line 124.

I'm no programming expert (always been a learn-as-I-need-it guy), so help figuring this out would be greatly appreciated. Thanks!

Daniel A. White
sdm142
  • Examine the original file with a hex viewer (e.g. open as binary in Visual Studio), look for unusual bytes near where the problem occurs. I would suspect an embedded NUL (zero) byte, or ^Z (aka 26 aka 1A). The latter is treated as an end-of-file marker when reading in text mode. Your strange `ftell`/`fseek` dance might allow the code to get past it and continue reading. – Igor Tandetnik Jul 19 '13 at 19:04
  • Showing your text file lines 122 - 124 would help. – chux - Reinstate Monica Jul 20 '13 at 02:55
  • @IgorTandetnik I will try out the hex/binary comment on Monday when I get back to my lab. It doesn't seem likely, though, since older input files that worked fine before show the same problem. Could the unusual byte be added during transfer from one machine to another? – sdm142 Jul 20 '13 at 14:34
  • @chux Here are those 3 lines of the txt file: 8.297 0.045 -5.000 340.313 22.624 0.000 5.000 2.000 8.363 0.045 -5.000 343.125 20.744 0.000 5.000 2.000 8.430 0.045 -5.000 345.938 19.073 0.000 5.000 2.000 – sdm142 Jul 20 '13 at 14:34
  • Numbers look OK. Could you provide a hexdump of the trouble line along with its 2 or 4 neighbors? BTW what type is byteoffset? I suspect your return value from fscanf near line 123 is not 8 and _that_ will demarcate your problem. – chux - Reinstate Monica Jul 20 '13 at 15:58
  • Hex dump of input file: Edited: Trying to figure out how to copy and paste the hex dump; wants to just paste the original input values... The `byteoffset` is a `long`. I'll look into the fscanf return value for each line, but won't the line always read 8 if it's returning 8 values? The problem isn't missing a value, it's double reading a value. – sdm142 Jul 22 '13 at 13:03
  • Line 122: '20 20 20 38 2e 34 33 30 20 20 20 20 20 20 20 30 2e 30 34 35 20 20 20 20 20 20 2d 35 2e 30 30 30 20 20 20 20 20 20 31 34 2e 30 36 33 20 20 20 20 20 20 31 39 2e 30 37 33 20 20 20 20 20 20 20 30 2e 30 30 30 20 20 20 20 20 20 20 35 2e 30 30 30 20 20 20 20 20 20 20 32 2e 30 30 30 20 0a ' Line 123: '20 20 20 38 2e 33 36 33 20 20 20 20 20 20 20 30 2e 30 34 35 20 20 20 20 20 20 2d 35 2e 30 30 30 20 20 20 20 20 20 31 36 2e 38 37 35 20 20 20 20 20 20 32 30 2e 37 34 34 20 20 20 20 20 20 20 30 2e 30 30 30 20 20 20 20 20 20 20 35 2e 30 30 30 20 20 20 20 20 20 20 32 2e 30 30 30 20 0a ' – sdm142 Jul 22 '13 at 13:11
  • Line 124: '20 20 20 38 2e 32 39 37 20 20 20 20 20 20 20 30 2e 30 34 35 20 20 20 20 20 20 2d 35 2e 30 30 30 20 20 20 20 20 20 31 39 2e 36 38 38 20 20 20 20 20 20 32 32 2e 36 32 34 20 20 20 20 20 20 20 30 2e 30 30 30 20 20 20 20 20 20 20 35 2e 30 30 30 20 20 20 20 20 20 20 32 2e 30 30 30 20 0a' – sdm142 Jul 22 '13 at 13:12
  • @sdm142 Thanks. Too many ideas follow: Notes: Hexdump 1st column (8.430:8.363:8.297) and text dump (2 days ago 8.297:8.363:8.430) change in the opposite order. You report only columns 1,5,6 change. I see 1,4,5 change. Column 4 values of 14 to 345 noted, Are you confident that column does not exceed 999.999? Have you tried "%lf%lf%lf%lf%lf%lf%lf%lf " rather than %7lf...? Is your last line read an incomplete read? – chux - Reinstate Monica Jul 22 '13 at 15:15
  • You may need to provide a link to the original text file. I'm running out of ideas other than the manipulation of the file pointer is mis-behaving. Of course the issue _could_ be in unposted code. See new posted answer below. Good luck! – chux - Reinstate Monica Jul 22 '13 at 15:47
  • Sorry, I meant that columns 1, 4, 5 change, as you point out. The largest value in any of these input files is 357.188; all values are less than 360 because this is a trajectory file for some motors. The -5.000 is the only negative value, and it's fixed. Here are the input files. The "Normal" one is the one that has an error at the end of line 123, the "Manual" one is the copy-pasted version I made in Wordpad on Windows NT 4.0 that works fine. http://people.duke.edu/~sdm36/PROJSINE_Manual.txt http://people.duke.edu/~sdm36/PROJSINE_Normal.txt – sdm142 Jul 22 '13 at 16:24

2 Answers

1

Trouble with your fscanf() directive.
Recommend %lf instead of %7lf.

Your fprintf() with "%7.3f" prints out floating point numbers using at least 7 characters to do so, padding with ' ' as needed.

Your subsequent use of "%7lf" in fscanf() says to scan at most 7 characters. So when you printf/scanf 999.999 all is OK, but with numbers greater, such as 1000.007, your scanning takes in "1000.00" and leaves the "7" for the next "%7lf".

#include <stdio.h>

int main(void) {
  char buf[1000];
  double f1, f2;
  int r;

  /* Both values fit in 7 characters: scanned correctly. */
  sprintf(buf, "%7.3f %7.3f", 1.23, 4.56);
  r = sscanf(buf, "%7lf %7lf", &f1, &f2);
  printf("'%s'\n%d %g %g\n", buf, r, f1, f2);

  /* 999.999 is exactly 7 characters: still scanned correctly. */
  sprintf(buf, "%7.3f %7.3f", 999.999, 4.56);
  r = sscanf(buf, "%7lf %7lf", &f1, &f2);
  printf("'%s'\n%d %.10g %.10g\n", buf, r, f1, f2);

  /* 1000.007 needs 8 characters: %7lf stops after "1000.00". */
  sprintf(buf, "%7.3f %7.3f", 1000.007, 4.56);
  r = sscanf(buf, "%7lf %7lf", &f1, &f2);
  printf("'%s'\n%d %.10g %.10g\n", buf, r, f1, f2);

  return 0;
}

Output:  
'  1.230   4.560'  
2 1.23 4.56  
'999.999   4.560'  
2 999.999 4.56  
'1000.007   4.560'  
2 1000 7  

BTW: For fscanf(), "%lf%lf%lf ..." is OK. Adding spaces in between the %lf does not change functionality.

chux - Reinstate Monica
  • I don't think the %7lf is the problem, since the column values are always between 0 and 999.999 in value. – sdm142 Jul 20 '13 at 14:29
  • I took a shot - BTW your example values include -5.000, which is outside the range 0 to 999.999. If you use negatives, your %7lf limiting range is -99.999 to 999.999. – chux - Reinstate Monica Jul 20 '13 at 14:43
  • Sure, I appreciate any ideas/help. And those negative values do not change. Only columns 1, 5, and 6 change; the rest are constant for all inputs. – sdm142 Jul 20 '13 at 16:51
0

Candidate simplification

Have not found the source of the issue, but I recommend this to simplify debugging. It takes care of PC line endings, removes the excessive manipulation of the file pointer, drops the unneeded limitation of %7lf, and provides better error checking.

FILE *infile_ptr = fopen("c:\\LumaGem\\orbit.txt", "rt");  // PC text file
char buf[1000];
while (fgets(buf, sizeof(buf), infile_ptr)) {  // separate I/O from scanning
  int count = sscanf(buf,"%lf%lf%lf%lf%lf%lf%lf%lf", &r, &s1, &s2, &e1, &e2, &e3, &d, &f);  
  if (count != 8) { // check for correct scan count
    ; //handle error;
  }
}
if (ferror(infile_ptr)) {
  ; //handle error;
}
fclose(infile_ptr);

[Edit] OP posted original text file.

Also add this line to the loop to take care of trailing lines consisting of only spaces.

  if (count <= 0) continue;
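
For reference, the fragments above can be put together as a self-contained sketch. The names parse_row, read_rows, and NCOLS are hypothetical and not part of the OP's DLL, the return codes are only one possible convention, and plain "r" is used so the sketch also builds outside Windows (on the NT machine "rt" makes text mode explicit).

```c
#include <stdio.h>

#define NCOLS 8   /* columns per row in the trajectory file */

/* Hypothetical helper: scan one text line into NCOLS doubles.
   Returns NCOLS on success, 0 if no field could be converted
   (for example a blank or spaces-only line), -1 if only some
   fields were converted. */
int parse_row(const char *line, double out[NCOLS])
{
    int count = sscanf(line, "%lf%lf%lf%lf%lf%lf%lf%lf",
                       &out[0], &out[1], &out[2], &out[3],
                       &out[4], &out[5], &out[6], &out[7]);
    if (count == NCOLS) return NCOLS;
    if (count <= 0) return 0;   /* sscanf returns EOF or 0 here */
    return -1;
}

/* Hypothetical helper: read up to max_rows rows from path.
   Returns the number of rows stored, or -1 on any error. */
int read_rows(const char *path, double rows[][NCOLS], int max_rows)
{
    char buf[1000];
    int n = 0, lineno = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL) return -1;              /* file could not be opened */

    while (n < max_rows && fgets(buf, sizeof buf, fp)) {
        int status;
        lineno++;
        status = parse_row(buf, rows[n]);
        if (status == 0) continue;          /* skip blank lines */
        if (status < 0) {                   /* report the offending line */
            fprintf(stderr, "line %d: expected %d fields\n", lineno, NCOLS);
            fclose(fp);
            return -1;
        }
        n++;
    }
    if (ferror(fp)) { fclose(fp); return -1; }
    fclose(fp);
    return n;
}
```

A caller would declare something like double rows[128][NCOLS] and check that read_rows("c:\\LumaGem\\orbit.txt", rows, 128) returns 128. Because each line is scanned on its own, a bad line is reported by number instead of silently shifting every later value.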

[Edit] Conclusion.

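The use of fseek() and ftell(), which are usually used with binary files and are not needed here IMHO, appears to expose a Windows bug when a UNIX-style text file (\n line endings) is opened with "r" and read with fscanf(): in text mode, the offset reported by ftell() can be wrong for such a file, so the following fseek() lands in the wrong place. fscanf() works best on text files opened with "rt" and read without repositioning. Nothing was found to point to line 123 or its neighbors as a problem area.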
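
To check which line endings an input file really has without a hex viewer, a small diagnostic along these lines may help. It is only a sketch: count_line_endings and struct eol_counts are hypothetical names, and the file is opened with "rb" so that the C runtime does no translation. It also counts the ^Z (0x1A) and NUL bytes mentioned in the comments on the question.

```c
#include <stdio.h>

/* Hypothetical result record for the diagnostic below. */
struct eol_counts {
    long crlf;     /* \r\n pairs (Windows style)       */
    long lf_only;  /* \n without a preceding \r (UNIX) */
    long cr_only;  /* \r not followed by \n            */
    long ctrl_z;   /* 0x1A bytes, EOF marker in text mode */
    long nul;      /* embedded zero bytes              */
};

/* Hypothetical helper: scan path byte by byte in binary mode.
   Returns 0 on success, -1 if the file cannot be opened. */
int count_line_endings(const char *path, struct eol_counts *c)
{
    FILE *fp = fopen(path, "rb");   /* binary: no \r\n translation */
    int ch, prev = 0;

    if (fp == NULL) return -1;
    c->crlf = c->lf_only = c->cr_only = c->ctrl_z = c->nul = 0;

    while ((ch = getc(fp)) != EOF) {
        if (ch == '\n') {
            if (prev == '\r') c->crlf++;
            else              c->lf_only++;
        } else if (prev == '\r') {
            c->cr_only++;           /* previous \r stood alone */
        }
        if (ch == 0x1A) c->ctrl_z++;
        if (ch == 0)    c->nul++;
        prev = ch;
    }
    if (prev == '\r') c->cr_only++; /* \r as the very last byte */

    fclose(fp);
    return 0;
}
```

Running this on both the original file and the WordPad copy would show whether they differ only in line endings: a file with nothing but lf_only hits is UNIX style, while one with only crlf hits is Windows style.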

chux - Reinstate Monica
  • The `fprintf()` uses a `\n` which `fscanf()` typically does not distinguish from a space. By using `fgets()`, which uses '\n', issues are localized to the offending line. – chux - Reinstate Monica Jul 22 '13 at 15:50
  • Implementing this works to read the file, but now I'm running into other issues; I'm starting to think this may be memory related. I've included a link to the source code for this DLL (no headers/etc. included, though). I'm sure it's sloppy (again, I didn't program most of it, but I did adapt some of it to work with our rebuild), but I'd appreciate it if you (or anyone) could look at it and spot any obvious errors, specifically with function `int ReadEuler(int, index)`. Happy to answer any questions about the purpose of it. http://people.duke.edu/~sdm36/DLL_NewXPS_v1_StackEdit.cc – sdm142 Jul 22 '13 at 19:54
  • To add, this computer has 1GB of RAM if that makes any difference regarding memory/etc. – sdm142 Jul 22 '13 at 19:57
  • The `fclose( CommandList );` in your complete code after detecting an error should be replaced by a `break;` in 2 places. BTW: why `rdeg = 9.7773f*pow(r,6)` vs. `rdeg = 9.7773 *pow(r,6)`? The second must be faster (or as fast) and more accurate. BTW: Your 2nd text file uses \r\n whereas the first file uses \r. – chux - Reinstate Monica Jul 22 '13 at 20:54
  • The `CommandList` outputs are not part of the original code; I just put those there to debug and find what the computer was reading/storing/sending vs what it was supposed to be doing. The rdeg format was created by the initial programmer; no idea why it's in that particular format, so I will update. Good to know about the txt file differences. But, even with the implemented code from this answer, I'm now having new issues (I think with converting the stored variables into strings). Will look at it more tomorrow. Thanks again for being so helpful! – sdm142 Jul 22 '13 at 22:07
  • New issues sounds like _progress_. TTFN – chux - Reinstate Monica Jul 22 '13 at 22:42