50

Can I use the diff command to find out how many lines two files differ in?

I don't want the contextual difference, just the total number of lines that are different between two files. Best if the result is just a single integer.

scarecrow
gmmo
  • 3
    possible duplicate of [How to count differences between two files on linux?](http://stackoverflow.com/questions/1566461/how-to-count-differences-between-two-files-on-linux) – Chris Maes Dec 01 '14 at 20:45

2 Answers

95

diff can do the first part of the job, but it does not count anything; wc -l does the rest:

diff -y --suppress-common-lines file1 file2 | wc -l
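A minimal sketch of that pipeline wrapped in a function, assuming GNU diff (the `--suppress-common-lines` option is not guaranteed on other implementations). The function name `count_diff_rows` and the sample files are made up for illustration. Each row of side-by-side output is one added, removed, or changed line, so a changed line counts once.

```shell
#!/bin/sh
# Hypothetical helper: print the number of differing rows between two files.
# Assumes GNU diff, which supports -y and --suppress-common-lines.
count_diff_rows() {
    # diff exits with status 1 when the files differ; that is expected here.
    diff -y --suppress-common-lines "$1" "$2" | wc -l
}

# Illustrative sample data in a temporary directory.
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/file1"
printf 'a\nx\nc\n' > "$tmp/file2"

# One line was changed (b became x), so a single row is reported.
count_diff_rows "$tmp/file1" "$tmp/file2"

rm -r "$tmp"
```

Some `wc` implementations pad the number with leading spaces, so compare it numerically (for example with `-eq`) rather than as a string.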

digitmaster
36

Yes, you can, and in true Linux fashion you can pipe a few commands together to perform the task.

First, use the diff command to get the differences between the files.

diff file1 file2

This will give you a list of changes. The ones you're interested in are the lines prefixed with a '>' symbol, which mark lines present only in file2.

Use the grep tool to filter these out as follows:

diff file1 file2 | grep "^>"

Finally, once you have a list of the changes you're interested in, use the wc command in line mode to count them.

diff file1 file2 | grep "^>" | wc -l

And you have a perfect example of the philosophy that Linux is all about.

Community
Zhilong Jia
  • 3
    This will not get lines that are in file1, but not file2, for example if file1 is "hello", and file2 is a blank file, the diff will just be "< hello", so your script will output 0, even though the files are different. – Andrew Nguyen Nov 16 '15 at 21:57
  • @AndrewNguyen Here it's related with how to define the difference of lines. – Zhilong Jia Nov 17 '15 at 13:06
  • 2
    This approach has a few problems: First, it will only find lines added by the `file2`, not those added by `file1`. The second issue is that even if the user looked for both `<` _and_ `>`, it wouldn't provide much clarity regarding which lines were merely _changed_ (which `diff -y` models with a `|` character.) – Christian Convey Mar 27 '17 at 22:01
  • Why not just let grep do the counting, and count differences regardless of which file they come from? `diff file1 file2 | grep -c -e "^<" -e "^>"` With this, reversing the order of the files gives the same count. – Jim Aug 12 '19 at 12:09
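
The points raised in these comments can be sketched side by side. This is only an illustration: the helper names `count_added`, `count_removed`, and `count_either` are made up, and the sample data mirrors the "hello" versus empty file case mentioned above. Note that with plain `diff` output a changed line appears as one `<` line plus one `>` line, so `count_either` counts it twice.

```shell
#!/bin/sh
# Hypothetical helpers built on the default (normal) diff output format.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# helper's exit status clean while the 0 is still printed.

# Lines present only in the second file (prefixed with '>').
count_added() { diff "$1" "$2" | grep -c '^>' || true; }

# Lines present only in the first file (prefixed with '<').
count_removed() { diff "$1" "$2" | grep -c '^<' || true; }

# Lines from either side; the result is the same if the files are swapped.
count_either() { diff "$1" "$2" | grep -c '^[<>]' || true; }

# Illustrative sample data: file1 contains "hello", file2 is empty.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/file1"
: > "$tmp/file2"

count_added   "$tmp/file1" "$tmp/file2"   # prints 0
count_removed "$tmp/file1" "$tmp/file2"   # prints 1
count_either  "$tmp/file1" "$tmp/file2"   # prints 1

rm -r "$tmp"
```

Which of the three counts is "the" number of differing lines depends on how a difference is defined, as the comments point out.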