I'd like to recursively compare two directory trees that are expected to be identical, but a full comparison would take forever. I want an efficient comparison that stops as soon as one difference is detected and tells me which file differs.

What I consider to be a difference:

  • two files have different content (differing timestamps don't matter)
  • a file was found in one directory but not the other (at the exact same path, of course)

Notes:

  • I don't need to know the actual differences within the file; the file path is enough
  • I tried diff 3.7 on Ubuntu 20.04, and it doesn't have a "stop on first difference" option that I could see
  • The files are a mix of text and binary
  • "diff -qr dir1 dir2"?
      Only in /tmp/too2/ool2: new
      Only in /tmp/too1/ool2: old
      Files /tmp/too1/ool2/pp3/ff and /tmp/too2/ool2/pp3/ff differ
    – Dom Mar 04 '21 at 20:09
  • That's the diff command I tried. Perhaps my question wasn't clear enough? I don't want it to keep scanning after finding 1 difference. Here's a better way to think about it: let's say I have 1 million files. If it detects a difference on file #1000, I don't want it to scan another 999k files before exiting. – jerkstorecalled Mar 04 '21 at 20:14

1 Answer

I don't know of a program that does this, but it is easy enough to write a small script that calls diff and stops at the first difference. Like this:

#!/bin/bash

dir1="$1"
dir2="$2"

# Both directory arguments are required.
if [[ -z "$dir1" || -z "$dir2" ]]; then
    echo "Use: $0 dir1 dir2" >&2
    exit 6
fi

# Read only the first line diff reports, print it, and exit.
while IFS= read -r line; do
    echo "The directory contents are different:"
    echo "$line"
    exit 1
done < <(diff -qr "$dir1" "$dir2")
echo "The directory contents are the same."

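This feeds the output of diff into read through process substitution. As soon as read receives a line, the script exits, which closes the read end of the pipe; diff is then terminated by SIGPIPE the next time it tries to write.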
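One caveat: diff only notices that the reader has gone away when it next writes, so if the rest of the trees are identical it may keep scanning in the background. Here is a minimal sketch that kills diff explicitly after the first reported line. It assumes a bash version in which `$!` is set for process substitutions; `first_diff` is a hypothetical helper name, and errors from diff (for example a missing directory) are treated like a difference.

```bash
#!/bin/bash

# first_diff is a hypothetical helper name, not part of diff or bash.
# Prints the first line diff reports and returns 1; returns 0 if diff is silent.
first_diff() {
    local line pid status=0
    # "exec" makes diff replace the subshell, so $! is diff's own PID.
    # stderr is merged in, so diff errors also count as a difference.
    exec 3< <(exec diff -qr -- "$1" "$2" 2>&1)
    pid=$!
    if IFS= read -r line <&3; then
        printf '%s\n' "$line"
        # Stop diff right away instead of waiting for its next write.
        kill "$pid" 2>/dev/null || true
        status=1
    fi
    exec 3<&-   # close the read end of the pipe
    return "$status"
}
```

It could be used as `first_diff dir1 dir2 || echo "Trees differ."`, with the differing path printed on the line before.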

Lacek