
I have quite a few FTP folders and add a few more each month. I'd like to leave some means of verifying their integrity in each one, for example MD5SUMS and SHA256SUMS files, which I could create using a script. Take for example:

find ./ -type f -exec md5sum $1 {} \;

This works fine, but when I run it again afterwards for each shaXXXsum tool, it also hashes the MD5SUMS file created earlier, which is not wanted.

Is there a simpler way, a script, or a common way of hashing all the files into their sums files without causing problems like that? I could really use a better option.

Kennie R.

3 Answers


Are you saying the problem is that you are re-running md5sum on the generated file? You could just skip those files. And of course, use GNU parallel to speed things up:

find . -type f -a \! -name MD5SUMS | parallel -j+0 "md5sum {} >>MD5SUMS"

I get the feeling from your description that I'm missing something, though.

EDIT: corrected redirection and added xargs info:

Note you don't have to use parallel, xargs works fine too (I just think it's fun to try parallel). Here's the equivalent xargs invocation:

find . -type f -a \! -name MD5SUMS -print0 | xargs -0 md5sum >> MD5SUMS
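
To cover the case from the question where several sums files live in the same folder, the exclusion can be widened to every file ending in SUMS. The following is only a sketch under assumptions: `make_sums` is a made-up helper name, the tool list is limited to `md5sum` and `sha256sum`, and it relies on GNU `find`, `sort -z` and `xargs -0 -r`. Note that the `*SUMS` pattern would also skip any real data file whose name happens to end in SUMS.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer): write one checksum file
# per algorithm for a directory tree, never hashing the checksum files themselves.
make_sums() {
    dir=$1
    (
        cd "$dir" || exit 1
        for tool in md5sum sha256sum; do
            # md5sum -> MD5SUMS, sha256sum -> SHA256SUMS
            out=$(printf '%s' "${tool%sum}" | tr '[:lower:]' '[:upper:]')SUMS
            # Write to a temporary name first; both the final and the temporary
            # names are excluded by the -name patterns below.
            find . -type f ! -name '*SUMS' ! -name '*SUMS.tmp' -print0 |
                sort -z |
                xargs -0 -r "$tool" > "$out.tmp" &&
                mv "$out.tmp" "$out"
        done
    )
}
```

Calling `make_sums /path/to/folder` (placeholder path) rewrites both files from scratch on every run instead of appending, so repeated runs do not pile up duplicate entries. The result can be checked later from inside the folder with `md5sum -c --quiet MD5SUMS`.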
Phil Hollenback
  • Edit: I think you are right, although it seems I don't have parallel on my Ubuntu 10.10 server. – Kennie R. Feb 14 '11 at 06:03
  • Bah, I installed parallel, but it seems to sum only one file and list the others non-summed. Strange. – Kennie R. Feb 14 '11 at 06:16
  • Sorry I typoed the command, the `>` overwrites the file every time. Needs to be `>>`. – Phil Hollenback Feb 14 '11 at 06:53
  • Ah, excellent, thank you. The xargs one barfs at filenames with spaces in them, which is unfortunate, but the parallel one works fine. Hey, I learned a neat tool, all the better! – Kennie R. Feb 14 '11 at 07:11
  • I fixed the xargs invocation to work with files with spaces in them by using `-print0/-0`. – Phil Hollenback Feb 14 '11 at 07:24
  • I think you meant: find . -type f -a \! -name MD5SUMS -print0 | xargs -0 md5sum -b >> MD5SUMS –  Apr 05 '11 at 21:29

I've had a need for verifying integrity of backups/mirrors which contain a large number of files and ended up writing a command-line program called MassHash. It's written in Python. A GTK+ Launcher is also available. You may want to check it out...

http://code.google.com/p/masshash/

Jonathan
  • Good idea, but in its current form I tried to set it up and got "OSError: [Errno 2] No such file or directory", despite being able to get the GUI up fine. Btw, better installation instructions could be: wget http://example.com/example.py; python example.py (done) – Luke Stanley Oct 23 '12 at 05:38

Try md5deep

sudo apt-get install md5deep
md5deep -rel "test_directory" > results_file.md5

"This is the command I will run against the directory to check for any changes."

md5deep -X list.txt -r Pictures/

From http://linhost.info/2010/05/compare-hashes-with-md5deep-part-2/
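
Where md5deep is not packaged, a similar "show me what no longer matches" check can be approximated with `find`, `md5sum` and `awk`. This is a rough sketch, not md5deep itself: `changed_files` is a made-up name, the list is assumed to have the hash as the first field on each line (as both `md5sum` and the md5deep command above produce), and filenames containing newlines or backslashes are not handled.

```shell
#!/bin/sh
# Hypothetical helper: print "hash  path" for every file under a directory
# whose MD5 hash does not appear in a previously saved list.
changed_files() {
    list=$1
    dir=$2
    find "$dir" -type f -print0 |
        xargs -0 -r md5sum |
        awk -v list="$list" '
            BEGIN {
                # Load the known hashes (first field of each line) from the list.
                while ((getline line < list) > 0) {
                    split(line, f, " ")
                    known[f[1]] = 1
                }
            }
            # Print only lines whose hash is not in the known set.
            !($1 in known)
        '
}
```

For example, `changed_files list.txt Pictures/` prints modified and newly added files and stays silent when every hash is already in the list. Because it compares hashes only, a file that was merely renamed or moved is not reported.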

Luke Stanley
  • Unable to install on amazon linux. `sudo yum install md5deep` gives this error - `No package md5deep available. Error: Nothing to do` – Sandeepan Nath Oct 21 '16 at 09:42