I have several exports of Telegram data and I would like to calculate the MD5 and SHA-256 hashes of all files, but my command only hashes the files in the root directory:

$ md5sum `ls` > hash.md5

md5sum: chats: Is a directory
md5sum: css: Is a directory
md5sum: images: Is a directory
md5sum: js: Is a directory
md5sum: lists: Is a directory
md5sum: profile_pictures: Is a directory

This is what ends up in the output file:

7e315ce28aa2f6474e69a7b7da2b5886  export_results.html
66281ec07a2c942f50938f93b47ad404  hash.md5
da5e2fde21c3e7bbbdb08a4686c3d936  ID.txt

Is there a way to also get entries like this for the files in subdirectories?

5750125fe13943f6b265505b25828400  js/script.js

Sorry for my English.

4 Answers

With bash:

shopt -s globstar
md5sum ** >/tmp/hash.md5

Ignore errors of the kind: md5sum: foobar: Is a directory

From man bash:

globstar: If set, the pattern ** used in a pathname expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a /, only directories and subdirectories match.
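To avoid the "Is a directory" noise altogether, the `**` matches can be filtered in a loop. This is only a minimal sketch, assuming bash 4 or later and GNU md5sum; the sample tree and the `demo` and `out` variables are made up so that it can be run as-is. Running one md5sum per file is slower than a single call, but it also sidesteps the argument-length limit. Note that `**` skips names starting with a dot unless `dotglob` is set.

```shell
#!/usr/bin/env bash
# Sketch: hash only the regular files matched by **, one md5sum call per file.
shopt -s globstar

# Hypothetical sample tree (not from the question) so the sketch is runnable.
demo=$(mktemp -d)
mkdir -p "$demo/js" "$demo/css"
printf 'a\n' > "$demo/ID.txt"
printf 'b\n' > "$demo/js/script.js"

out=$(mktemp)    # keep the output file outside the tree being hashed
cd "$demo" || exit 1

for f in **; do
    # Skip directories and anything else that is not a regular file.
    if [[ -f $f ]]; then
        md5sum -- "$f"
    fi
done > "$out"

cat "$out"
```

Writing the list outside the tree also keeps the output file from showing up in its own listing, as hash.md5 did in the question.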

Cyrus
  • Don't get to use `**` enough! But I assume that `**` is subject to the command-line limitations of the shell, and too many or too long filenames may generate the dreaded 'too many arguments' error. Just an FYI to the OP. Good luck to all! – shellter Aug 23 '20 at 15:45
  • @shellter: Yes, it may lead to "argument list too long" or something like that. – Cyrus Aug 23 '20 at 15:48

One tool that helps, though it might not be installed by default, is hashdeep. It handles this directly and has some further advantages, e.g. a binary is also available for Windows.

Your question would be answered using hashdeep with this command:

hashdeep -c md5,sha256 -r -o f -l . > hash.md5

This calculates the MD5 and SHA-256 hashes of all files in all subdirectories with one command.

Computing both hashes together might be faster due to caching effects, since each file only needs to be read once. Additionally, the command has an option to use multiple threads, which could speed up the task on multi-core CPUs with fast disks.

Marco

Alternatively, you can use find with the -exec option:

find topdir -type f -exec md5sum {} + > MD5SUMS

Replace topdir with the actual directory name, or use . to work on the current directory and its subdirectories (GNU find also defaults to the current directory if you drop the path entirely). This only computes the checksums of regular files (so no "md5sum: something: Is a directory" errors), and it doesn't suffer from the "argument list too long" problem, because find splits the file list across several md5sum invocations when needed.
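Since the question asks for both MD5 and SHA-256, the same find pattern can simply be run once per tool. The following is a rough sketch, assuming GNU coreutils (md5sum, sha256sum); the sample tree and the output names MD5SUMS and SHA256SUMS are only illustrative. The `! -name` tests keep the two output files from being hashed themselves.

```shell
#!/bin/sh
# Sketch: write MD5 and SHA-256 lists for a whole tree with find.

# Hypothetical sample tree (not from the question) so the sketch is runnable.
demo=$(mktemp -d)
mkdir -p "$demo/js"
printf 'a\n' > "$demo/ID.txt"
printf 'b\n' > "$demo/js/script.js"
cd "$demo" || exit 1

# The redirection creates each output file before find starts,
# so both names are excluded from the search.
find . -type f ! -name MD5SUMS ! -name SHA256SUMS -exec md5sum {} + > MD5SUMS
find . -type f ! -name MD5SUMS ! -name SHA256SUMS -exec sha256sum {} + > SHA256SUMS

cat MD5SUMS SHA256SUMS
```

With `.` as the starting point, the paths in the lists carry a leading `./` (for example `./js/script.js`), which the checksum tools accept as-is when verifying later.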

M. Nejat Aydin
  • `{ find . -maxdepth 1 -type f -execdir md5sum {} +;} >md5sums.txt` – Léa Gris Aug 23 '20 at 16:27
  • @LéaGris Why `-maxdepth 1`? The OP wants to compute checksums in subdirectories as well. And what's the point of the curly brackets? – M. Nejat Aydin Aug 23 '20 at 16:33
  • Right, remove `-maxdepth 1`. The `-execdir` stays, as it allows computing the md5sums of all files in the same directory at once. – Léa Gris Aug 23 '20 at 17:20
  • @LéaGris *The `-execdir` stays, as it allows computing the md5sums of all files in the same directory at once.* So does `-exec`. I didn't understand your point. – M. Nejat Aydin Aug 24 '20 at 00:47
  • You are right, it just `cd`s to the directory before feeding the files as arguments, which is a penalty here: it runs `md5sum` once per directory rather than once for all files (if they fit within the maximum argument length). – Léa Gris Aug 24 '20 at 01:22

You could use the following to achieve the task.

find . -type f -exec md5sum {} + >> log_checksum.txt

# The . (dot) can be replaced with the directory in which you want to run the command.

# The curly braces {} are a placeholder: the filenames found by find are passed to the md5sum command as arguments.

# The plus sign (+) makes find pass many files as arguments to a single md5sum command, instead of running a separate md5sum process for each file.
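Once such a list exists, it can be used to re-check the files later with `md5sum -c`. The following is only a sketch, assuming GNU md5sum; the sample tree and the `list` variable are hypothetical. It uses `>` rather than `>>`, because appending on every run would accumulate duplicate entries in the list.

```shell
#!/bin/sh
# Sketch: create a checksum list, then verify the tree against it.

# Hypothetical sample tree (not from the question) so the sketch is runnable.
demo=$(mktemp -d)
list=$(mktemp)    # checksum list, kept outside the tree being hashed
mkdir -p "$demo/chats"
printf 'hello\n' > "$demo/chats/messages.html"
cd "$demo" || exit 1

find . -type f -exec md5sum {} + > "$list"

# Re-read every listed file and compare it with the recorded digest.
md5sum -c "$list"

# Simulate a modified file: the check now fails and exits non-zero.
printf 'tampered\n' >> chats/messages.html
if md5sum -c "$list"; then
    status=0
else
    status=1
fi
echo "exit status after the change: $status"
```

The check has to be run from the same directory the list was created in, since the recorded paths are relative.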
Du-Lacoste