I have to compare the checksums of all files in the /primary and /secondary folders on machineA with the files in the /bat/snap/ folder on the remote server machineB. The remote server will have lots of other files in addition to the ones we have on machineA.
- If there is any checksum mismatch, I want to report all the affected files on machineA with their full paths and exit with a non-zero status code.
- If everything matches, exit with zero.
I wrote one command (I'm not sure whether there is a better way to write it) that I am running on machineA, but it's very slow. Is there any way to make it faster?
(cd /primary && find . -type f -exec md5sum {} +; cd /secondary && find . -type f -exec md5sum {} +) | ssh machineB '(cd /bat/snap/ && md5sum -c)'
Also, it prints file names like this: ./abc_monthly_1536_proc_7.data: OK. Is there any way to make it print the full path of that file on machineA?
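To illustrate the full-path part of the question, here is a minimal sketch of a filter that rewrites the ./name: STATUS lines printed by md5sum -c so that they start with the local directory instead. The function name prefix_paths is hypothetical, and the sketch assumes each directory is checked in its own pipeline so that the right prefix is known.

```shell
#!/bin/sh
# Sketch only: prefix_paths is an illustrative helper, not an existing tool.

# prefix_paths DIR: read "md5sum -c" style lines on stdin and replace a
# leading "./" with "DIR/", so "./x.data: OK" becomes "DIR/x.data: OK".
# Lines that do not start with "./" are passed through unchanged.
prefix_paths() {
    awk -v dir="$1" '{
        if (substr($0, 1, 2) == "./")
            print dir "/" substr($0, 3)
        else
            print
    }'
}

# Possible use on machineA (not run here), one pipeline per directory:
#   (cd /primary && find . -type f -exec md5sum {} +) |
#       ssh machineB 'cd /bat/snap/ && md5sum -c' | prefix_paths /primary
```

Note that the exit status of such a pipeline is that of the last command (the filter), not of md5sum -c, so the final exit code would have to be derived separately, for example from the filtered output.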
Running ssh against the remote host for every file definitely isn't very efficient. parallel could speed it up by handling more files concurrently, but the more efficient way is likely to tweak the command so that it connects to machineB once and gets all the md5sums in one shot. Is this possible?
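The "one shot" idea could be sketched roughly as follows: fetch the remote md5sum output once into a manifest file, then compare it locally and print full local paths for anything that differs. The function names list_sums and report_mismatches and the MISSING/FAILED labels are hypothetical. The sketch assumes GNU md5sum output (a 32-character hash, two separator characters, then the name) and file names without newlines or backslashes.

```shell
#!/bin/sh
# Sketch only: function names and the manifest-file approach are assumptions.

# list_sums DIR: print md5sum lines ("<hash>  ./relative/name") for every
# regular file below DIR.
list_sums() {
    (cd "$1" && find . -type f -exec md5sum {} +)
}

# report_mismatches DIR MANIFEST: hash the files below DIR and compare them
# with the md5sum lines stored in MANIFEST. Prints the full path of each file
# that is absent from MANIFEST or has a different hash; returns 1 if any.
report_mismatches() {
    list_sums "$1" | awk -v dir="$1" -v manifest="$2" '
        BEGIN {
            bad = 0
            # Load the remote hashes, keyed by relative file name.
            while ((getline line < manifest) > 0)
                remote[substr(line, 35)] = substr(line, 1, 32)
        }
        {
            name = substr($0, 35)             # "./relative/name"
            path = dir "/" substr(name, 3)    # full local path
            if (!(name in remote)) {
                print path ": MISSING"; bad = 1
            } else if (remote[name] != substr($0, 1, 32)) {
                print path ": FAILED"; bad = 1
            }
        }
        END { exit bad }
    '
}

# Possible use on machineA (not run here); a single ssh call builds the manifest:
#   manifest=$(mktemp)
#   ssh machineB 'cd /bat/snap/ && find . -type f -exec md5sum {} +' > "$manifest"
#   status=0
#   report_mismatches /primary "$manifest" || status=1
#   report_mismatches /secondary "$manifest" || status=1
#   rm -f "$manifest"; exit "$status"
```

One trade-off of this variant is that machineB hashes everything under /bat/snap/, including the many files that do not exist on machineA, so whether it is actually faster depends on how large that extra set is.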