22

I would like to compare two unsorted files, file1 and file2, and get file2 - file1 (the lines in file2 that are not in file1), irrespective of line order. diff is not working.

Balualways

4 Answers

27

Well, you can just sort the files first, and diff the sorted files.

sort file1 > file1.sorted
sort file2 > file2.sorted
diff file1.sorted file2.sorted

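You can also filter the output to report the lines in file2 that are absent from file1. The `tail -n +3` skips the two header lines of the unified diff, since the `+++` header would otherwise match the pattern:

diff -u file1.sorted file2.sorted | tail -n +3 | grep "^+"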
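As a small worked example of this filter, the sketch below builds two throwaway files, sorts them, and keeps only the lines that diff marks as added. The file contents, the scratch directory, and the variable name `only_in_file2` are made up for illustration.

```shell
# Hypothetical sample data, created in a scratch directory.
workdir=$(mktemp -d)
printf 'banana\napple\ncherry\n' > "$workdir/file1"
printf 'cherry\ndate\napple\nelderberry\n' > "$workdir/file2"

sort "$workdir/file1" > "$workdir/file1.sorted"
sort "$workdir/file2" > "$workdir/file2.sorted"

# diff -u starts with two header lines (--- and +++); skip them,
# keep the lines marked "+", then strip the marker itself.
only_in_file2=$(diff -u "$workdir/file1.sorted" "$workdir/file2.sorted" \
  | tail -n +3 | grep '^+' | sed 's/^+//')

printf '%s\n' "$only_in_file2"
rm -r "$workdir"
```

With this sample data the script prints `date` and `elderberry`, the two lines that occur in file2 but not in file1, even though the shared lines appear in a different order in each file.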

As indicated in the comments, you do not need to create the intermediate sorted files. In bash, you can use process substitution instead:

diff <(sort file1) <(sort file2)
fedorqui
tonio
  • 5
    By the way, bash has a shortcut for the first three commands together: `diff <(sort file1) <(sort file2)`. – amalloy May 25 '13 at 03:41
27

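I got the solution by using comm:

comm -23 file1 file2

This prints the lines that appear only in file1. For file2 - file1, that is, the lines that appear only in file2, use comm -13 file1 file2 instead.

Both files need to be sorted first.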
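To make the column flags concrete, here is a minimal sketch on made-up data. The file contents and the variable names `only_in_file1` and `only_in_file2` are assumptions for illustration; the files are sorted into temporary copies so that comm sees sorted input.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf 'red\ngreen\nblue\n' > "$workdir/file1"
printf 'blue\nyellow\nred\n' > "$workdir/file2"

# comm expects sorted input, so sort both files first.
sort "$workdir/file1" > "$workdir/file1.sorted"
sort "$workdir/file2" > "$workdir/file2.sorted"

# -1 hides lines unique to the first file, -2 hides lines unique to
# the second file, -3 hides lines common to both.
only_in_file2=$(comm -13 "$workdir/file1.sorted" "$workdir/file2.sorted")
only_in_file1=$(comm -23 "$workdir/file1.sorted" "$workdir/file2.sorted")

printf 'only in file2: %s\n' "$only_in_file2"
printf 'only in file1: %s\n' "$only_in_file1"
rm -r "$workdir"
```

Here `comm -13` yields `yellow` (file2 - file1) and `comm -23` yields `green` (file1 - file2), so the choice of flags determines the direction of the difference.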

fedorqui
Balualways
  • 1
    According to [`comm` man page](http://netbsd.gw.com/cgi-bin/man-cgi?comm++NetBSD-current), it works on lexically sorted files. Thus, you will have to use `sort` on your files first. – tonio Feb 09 '11 at 14:33
  • 1
    As indicated by [amalloy's comment](http://stackoverflow.com/questions/4715885/compare-two-files-in-unix/4756123#comment24120178_4715952) for `diff`, you can also say `comm -23 <(sort file1) <(sort file2)`. – fedorqui Jul 28 '16 at 08:46
4

There are 3 basic commands to compare files in Unix:

  1. cmp : compares two files byte by byte and reports the first mismatch on the screen. If there is no mismatch, it prints nothing. Syntax: $ cmp file1 file2

  2. comm : finds the lines that are present in one file but not in the other. Both files must be sorted.

  3. diff : reports the line-by-line differences between two files.
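Since cmp works byte by byte, it is sensitive to line order, which matters for the unsorted files in the question. The sketch below illustrates this with made-up files; the names `a`, `b`, `c` and the result variables are hypothetical. It uses `cmp -s`, which prints nothing and only sets the exit status.

```shell
# Hypothetical sample data: b is identical to a, c has the same
# lines as a but in a different order.
workdir=$(mktemp -d)
printf 'one\ntwo\n' > "$workdir/a"
printf 'one\ntwo\n' > "$workdir/b"
printf 'two\none\n' > "$workdir/c"

# cmp -s exits with 0 when the files are byte-identical, 1 otherwise.
if cmp -s "$workdir/a" "$workdir/b"; then same_ab=yes; else same_ab=no; fi
if cmp -s "$workdir/a" "$workdir/c"; then same_ac=yes; else same_ac=no; fi

printf 'a vs b identical: %s\n' "$same_ab"
printf 'a vs c identical: %s\n' "$same_ac"
rm -r "$workdir"
```

The first comparison reports `yes` and the second `no`, so cmp alone cannot tell that `a` and `c` contain the same set of lines; sorting first, as in the other answers, is still needed.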

Vivek Ji
vishruti
2

The easiest way: sort the files with sort(1) and then use diff(1).

gelraen