5

Let me start by saying this is for a course, and I assume the professor won't care that the files look the same if cmp reports something weird. I am trying to compare the output of my code, in a file named uout, to the correct output, in the file correct0. The problem is that cmp returns "cmp: EOF on uout". From a little digging I found that EOF means the files are the same up to the end of the shorter file, with the shorter file being the one named after EOF, so what I gather is that they match until uout ends short. The problem, however, is that it absolutely does NOT end short. When I opened both in a text editor and manually checked spaces, line and column numbers, etc., everything was an EXACT match.

To illustrate my point here are the files copied directly using ctrl-a + ctrl-v:

correct0 http://pastebin.com/Bx7SM7rA

uout http://pastebin.com/epMFtFpM

If anyone knows what is going wrong and can explain it simply, I would appreciate it. I have checked multiple times and can't find anything wrong. Maybe it is something simple and I just can't see it, but everything I have seen so far suggests that the files are the same up until the "shorter one" ends. Oddly, even if I switch my command from

cmp correct0 uout

to

cmp uout correct0

both instances end up returning

cmp: EOF on uout
user2763113
  • 1
    Please post the code and show what should work, what does work and what does not work. – Weather Vane Mar 09 '15 at 19:42
  • 1
    Start off with `ls -l` on the two files. If the sizes are different, you know there's a problem (and `cmp` is giving the correct answer). If they're the same size, do a checksum on the two files (`md5sum` or something like that). If the hashes are different, so are the files; if the hashes are the same, the chances are that the files are the same. But one of these tests is going to show that the files are different. – Jonathan Leffler Mar 09 '15 at 20:00
  • ls -l is showing that there is literally a 1 byte difference between them but I have no idea how that could be if the characters, spaces, everything matches up exactly in an editor. – user2763113 Mar 09 '15 at 20:11
  • 1
    One file ends with newline; the other does not. Most likely, `correct0` ends with a newline and `uout` does not. – Jonathan Leffler Mar 09 '15 at 20:15
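The size check suggested in the comments above can be tried on throwaway files. This is only a sketch: the file names `correct_demo` and `uout_demo` are made up to stand in for `correct0` and `uout`, and the exact wording of the `cmp` message varies between implementations.

```shell
# Hypothetical demo files; they stand in for correct0 and uout.
tmp=$(mktemp -d)
printf 'The matrix is a magic square\n' > "$tmp/correct_demo"   # ends with a newline
printf 'The matrix is a magic square'   > "$tmp/uout_demo"      # no final newline

# wc -c counts bytes; $(( )) strips the padding some wc versions add.
size_correct=$(( $(wc -c < "$tmp/correct_demo") ))
size_uout=$(( $(wc -c < "$tmp/uout_demo") ))
echo "sizes: $size_correct vs $size_uout"

# cmp writes the EOF message to stderr and exits non-zero when the files differ.
cmp_status=0
cmp_msg=$(cmp "$tmp/correct_demo" "$tmp/uout_demo" 2>&1) || cmp_status=$?
echo "cmp said: $cmp_msg (exit status $cmp_status)"

# Swapping the arguments still names the same file, because it is still the shorter one.
swap_msg=$(cmp "$tmp/uout_demo" "$tmp/correct_demo" 2>&1) || true
echo "swapped:  $swap_msg"
```

The two sizes differ by exactly one byte, which matches what `ls -l` showed, and both argument orders should name the file that lacks the final newline.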

2 Answers

9

The files you uploaded are the same, so this may be a line-ending problem. DOS/Windows uses "\r\n" as a line ending, while Unix/Linux uses just "\n".

A good utility on a Linux machine for checking what the problem is would be "od" (octal dump), or any other command that shows a file's raw bytes. For example:

$ od -c uout.txt 
0000000   E   n   t   e   r       t   h   e       n   u   m   b   e   r
0000020   s       f   r   o   m       1       t   o       1   6       i
0000040   n       a   n   y       o   r   d   e   r   ,       s   e   p
0000060   a   r   a   t   e   d       b   y       s   p   a   c   e   s
0000100   :  \r  \n  \r  \n       1   6           3           2       1
0000120   3  \r  \n           5       1   0       1   1           8  \r
0000140  \n           9           6           7       1   2  \r  \n    
0000160       4       1   5       1   4           1  \r  \n  \r  \n   R
0000200   o   w       s   u   m   s   :       3   4       3   4       3
0000220   4       3   4  \r  \n   C   o   l   u   m   n       s   u   m
0000240   s   :       3   4       3   4       3   4       3   4  \r  \n
0000260   D   i   a   g   o   n   a   l       s   u   m   s   :       3
0000300   4       3   4  \r  \n  \r  \n   T   h   e       m   a   t   r
0000320   i   x       i   s       a       m   a   g   i   c       s   q
0000340   u   a   r   e
0000344

As you can see, the line endings here are \r\n. Since you opened the files and copy-pasted them, this reflects your machine's preferences and not the actual files' line endings. You can also try the dos2unix utility to convert line endings.
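If dumping the whole file is too noisy, looking at just the last few bytes is usually enough. A rough sketch, assuming two made-up files (`dos_demo` and `unix_demo` are not the OP's files) and a POSIX shell:

```shell
# Hypothetical demo files, not the OP's real output.
tmp=$(mktemp -d)
printf 'line one\r\nline two\r\n' > "$tmp/dos_demo"    # CRLF endings
printf 'line one\nline two'       > "$tmp/unix_demo"   # LF endings, no final newline

# Dump only the last four bytes of each file.
tail -c 4 "$tmp/dos_demo"  | od -c
tail -c 4 "$tmp/unix_demo" | od -c

# Count the lines that contain a carriage return.
# Command substitution strips trailing newlines only, so $cr really holds \r.
cr=$(printf '\r')
dos_cr_lines=$(grep -c "$cr" "$tmp/dos_demo")
unix_cr_lines=$(grep -c "$cr" "$tmp/unix_demo") || true   # grep exits 1 on zero matches
echo "lines with CR: dos_demo=$dos_cr_lines unix_demo=$unix_cr_lines"
```

A non-zero count points at Windows line endings; a dump whose last byte is not `\n` points at a missing final newline.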

Jadi
  • @user2763113: It appears that the file doesn't end with a line ending (CRLF). You should ensure that your (text) files do always end with CRLF (if you're working on Windows; LF if you're working on Unix). – Jonathan Leffler Mar 09 '15 at 20:04
  • But what I don't get is that both files were created on a linux machine, from writing my c program to getting the output file with ./a.out < input > uout. Why would the windows endings be on there? – user2763113 Mar 09 '15 at 20:07
  • Actually I just thought of something. I compiled using gcc but maybe we need to use c99 and that might make a difference? I will try it and report back. UPDATE: It made no difference. ls -l shows a filesize off by a single byte (correct0 is 218 while uout is 217) – user2763113 Mar 09 '15 at 20:09
  • 1
    @user2763113: so you need to run `od -c` on the two files. My best guess would be that `correct0` ends with a line feed and `uout` does not -- a problem that is easy to fix. In future, make sure your `printf()` statements end with a newline unless you are consciously building up a single line with multiple operations. – Jonathan Leffler Mar 09 '15 at 20:14
  • THANK YOU SO MUCH JONATHAN, It turned out that throwing in a newline at my very last printf made them match. I suspected it was a hidden character but for some reason it seemed stupid to me at the time because I assumed it would show the new line in a text editor, which it does not. That fixed it for me. Again thank you so much, this was driving me bonkers. – user2763113 Mar 09 '15 at 20:29
  • This helped the OP, but is not generally applicable to people getting this error from `cmp`. I'm getting that same `cmp: EOF on ` when comparing two files where one is longer than the other. In other cases, it has not done this, and I would consider that an expected scenario -- the whole point of this utility is to compare files, so different lengths should be tolerated and reported, causing the program to exit with an error identifying the line and character at which the difference occurs. It should NOT cause the program to crash and fail to report any differences! – JakeRobb Jul 20 '22 at 20:00
4

If the files are human-readable, I would use the diff tool instead. It has options to ignore line endings and whitespace differences (see --ignore-space-change, --strip-trailing-cr, and --ignore-blank-lines).

diff -u --ignore-space-change --strip-trailing-cr --ignore-blank-lines test_cases/correct0 test_cases/uout0
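To see what `--strip-trailing-cr` changes, here is a small sketch on made-up files (the `*_demo` names are hypothetical). It assumes a diff that supports the option, such as GNU diff. Keep in mind that these options hide differences, so a byte-exact check like cmp can still fail afterwards.

```shell
# Hypothetical demo files with different line endings.
tmp=$(mktemp -d)
printf 'Row sums: 34 34 34 34\r\n' > "$tmp/crlf_demo"
printf 'Row sums: 34 34 34 34\n'   > "$tmp/lf_demo"
printf 'Row sums: 34 34 34 34'     > "$tmp/no_newline_demo"

# Plain diff treats the trailing \r as a real difference (exit status 1).
plain_status=0
diff "$tmp/crlf_demo" "$tmp/lf_demo" > /dev/null || plain_status=$?

# With --strip-trailing-cr the same pair compares equal (exit status 0).
relaxed_status=0
diff --strip-trailing-cr "$tmp/crlf_demo" "$tmp/lf_demo" > /dev/null || relaxed_status=$?
echo "plain: $plain_status, with --strip-trailing-cr: $relaxed_status"

# A missing final newline is reported explicitly by diff.
newline_report=$(LC_ALL=C diff "$tmp/lf_demo" "$tmp/no_newline_demo") || true
echo "$newline_report"
```

The last report is the useful one for a one-byte mismatch: diff marks the shorter file with a "No newline at end of file" line instead of leaving you to guess.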
KRoy