Efficient way of determining whether 2 directories have ANY difference?

Question

I'd like to recursively compare two directory trees that are expected to be identical, but I don't want a full comparison, which would take forever. I'd like an efficient comparison that stops as soon as one difference is detected, returns, and tells me which file was different.

What I consider to be a difference:

two files have different content (different timestamps don't matter)
a file was found in one directory but not the other (at the exact same path, of course)

Notes:

I don't need to know the actual differences within the file, just the filepath is enough
I tried diff 3.7 on Ubuntu 20.04, it doesn't have a "stop on difference" option that I could see
The files are a mix of text and binary

"diff -qr dir1 dir2" ? Only in /tmp/too2/ool2: new Only in /tmp/too1/ool2: old Files /tmp/too1/ool2/pp3/ff and /tmp/too2/ool2/pp3/ff differ — Dom, Mar 04 '21 at 20:09
That's the diff command I tried. Perhaps my question wasn't clear enough? I don't want it to keep scanning after finding 1 difference. Here's a better way to think about it: let's say I have 1 million files. If it detects a difference on file #1000, I don't want it to scan another 999k files before exiting. — jerkstorecalled, Mar 04 '21 at 20:14

score 1 · Accepted Answer · answered Mar 04 '21 at 22:06

I don't know about a program which does this, but it is easy enough to write a small script calling diff and stopping upon the first difference. Like this:

#!/bin/bash

dir1="$1"
dir2="$2"

# Both arguments are required.
if [[ -z "$dir1" || -z "$dir2" ]]; then
    echo "Usage: $0 dir1 dir2" >&2
    exit 6
fi

# The loop body runs at most once: exit on the first line diff prints.
while IFS= read -r line; do
    echo "The directory contents are different:"
    echo "$line"
    exit 1
done < <(diff -qr "$dir1" "$dir2")
echo "The directory contents are the same."

This feeds the output of diff into read through process substitution. As soon as read receives a line, the script exits. That closes the reading end of the pipe, so diff is killed by SIGPIPE the next time it tries to write.
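If you would rather call this from another script, here is a rough sketch of the same idea wrapped in a function. The name first_difference is made up for illustration and is not part of diff. It assumes bash (for process substitution) and merges diff's stderr into its output, so an error such as a missing directory is also reported as a difference rather than silently ignored.

```bash
#!/bin/bash
# Sketch only: first_difference is a hypothetical helper name.
# Prints the first line reported by `diff -qr` and returns 1,
# or prints nothing and returns 0 if diff reports nothing.
first_difference() {
    local dir1="$1" dir2="$2" line
    # read succeeds only if diff emitted at least one line.
    # 2>&1 makes diff's error messages count as a difference too.
    if IFS= read -r line < <(diff -qr "$dir1" "$dir2" 2>&1); then
        printf '%s\n' "$line"
        return 1
    fi
    return 0
}
```

You could then write something like: if first_difference dir1 dir2; then echo "same"; fi. As with the script above, diff is not stopped at the exact moment the function returns; it goes away on its next write to the closed pipe, or when it finishes scanning.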
