1

I am trying to rename all the files in a folder per their CRC32 value.

I am basing my work on this post: Rename files to md5 sum + extension (BASH)

md5sum * | sed -e 's/\([^ ]*\) \(.*\(\..*\)\)$/mv -v \2 \1\3/e'

I have a minimal understanding of sed and tried teaching myself enough regex to reverse-engineer what is going on, but I can't work it out. I am using bash and the crc32 command to achieve this.

I would appreciate help on this and would appreciate it even more if somebody had the time to break this down and help me understand.

nothing
    How about you break it down yourself, one command after another, try each command separately, check each command's manual, try to understand by this way? Read `man md5sum` and `man sed`. Also please read [how-to-ask](https://stackoverflow.com/help/how-to-ask). – Til Mar 21 '19 at 02:11
  • 1
    MD5 is a [cryptographic hash function](https://en.wikipedia.org/wiki/Cryptographic_hash_function), not a [CRC](https://en.wikipedia.org/wiki/Cyclic_redundancy_check) – jhnc Mar 21 '19 at 04:17

2 Answers

3

Here is a step-by-step explanation:

$ ls -1
abc.txt
def.txt
ghi.txt

$ crc32 *
c7e06c1a        abc.txt
042999b4        def.txt
e686c130        ghi.txt

$ crc32 * | sed -e "s/^\(\S*\)\s*\(.*\(\..*\)\)$/mv -v \2 \1\3/g"
mv -v abc.txt c7e06c1a.txt
mv -v def.txt 042999b4.txt
mv -v ghi.txt e686c130.txt

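What happens in detail:

s/                  # substitute: replace whatever matches the following pattern
^                   # beginning of line
\(\S*\)             # group \1: all characters up to the first whitespace (the checksum)
\s*                 # the whitespace between checksum and filename
\(.*                # group \2: the whole filename...
\(\..*\)            # ...whose part from the last '.' onward is also stored as group \3 (the extension)
\)                  # end of group \2
$                   # end of line
/mv -v \2 \1\3      # replacement: the mv command built from the stored groups
/g                  # replace every match on the line (optional here: ^ and $ allow only one match)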
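To see what each group captures, the same pattern can be run on sample lines without touching any files. This is only a minimal sketch: the function name show_groups and the sample input are made up for illustration, and \S and \s are spelled as POSIX character classes so that it does not depend on GNU sed.

```shell
# Hypothetical helper: prints the contents of groups \1, \2 and \3
# for each "checksum<whitespace>filename" line read from stdin.
show_groups() {
  sed -e 's/^\([^[:space:]]*\)[[:space:]]*\(.*\(\..*\)\)$/1=[\1] 2=[\2] 3=[\3]/'
}

# Sample input (assumed format: checksum, a tab, then the filename)
printf 'c7e06c1a\tabc.txt\n042999b4\tarchive.tar.gz\n' | show_groups
# prints:
# 1=[c7e06c1a] 2=[abc.txt] 3=[.txt]
# 1=[042999b4] 2=[archive.tar.gz] 3=[.gz]
```

Because .* is greedy, group \3 only gets the part after the last dot, so archive.tar.gz would be renamed to 042999b4.gz.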

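To run the generated commands directly, replace the "g" flag with "e" (a GNU sed extension that executes the result of the substitution as a shell command):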

$ crc32 * | sed -e "s/^\(\S*\)\s*\(.*\(\..*\)\)$/mv -v \2 \1\3/e"
renamed 'abc.txt' -> 'c7e06c1a.txt'
renamed 'def.txt' -> '042999b4.txt'
renamed 'ghi.txt' -> 'e686c130.txt'

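If you are not using GNU sed, drop the "e" flag and pipe the output to sh instead. \S and \s are not part of POSIX either and may not work in other sed implementations, so this version uses [^[:space:]] and [[:space:]]:

crc32 * | sed -e "s/^\([^[:space:]]*\)[[:space:]]*\(.*\(\..*\)\)$/mv -v \2 \1\3/" | sh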
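Piping generated text into a shell breaks on filenames that contain spaces or quotes. A possible alternative is to read the checksum lines in a loop and call mv directly. This is a sketch under the assumption that crc32 prints one "checksum<TAB>filename" line per file, as in the output above; the function name rename_from_crc_list is hypothetical.

```shell
# Hypothetical helper: reads "checksum<TAB>filename" lines on stdin
# and renames each file to "<checksum>.<extension>".
rename_from_crc_list() {
  tab=$(printf '\t')
  # Split each line at the tab; "name" receives the rest of the line,
  # so spaces inside the filename are preserved.
  while IFS="$tab" read -r sum name; do
    case "$name" in
      *.*) new="$sum.${name##*.}" ;;   # keep the text after the last dot
      *)   new="$sum" ;;               # no extension to keep
    esac
    mv -- "$name" "$new"
  done
}

# Usage (assumes the crc32 output format shown above):
# crc32 * | rename_from_crc_list
```

Since the names never pass through a second shell, no quoting of the generated commands is needed.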
UtLox
  • Thank you, your answer really helped me understand. I did have a few follow-up questions, though. From what I've seen while learning regex, grouping with "(" and ")" doesn't require backslashes; are they required here because the parentheses need to be escaped for bash? Why is group (\3) nested in group (\2)? Would it be possible to write them as separate groups? (I understand it works as is; I just wonder for the sake of understanding.) – nothing Mar 21 '19 at 23:15
  • 1
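    The backslashes are not there for bash. sed uses basic regular expressions (BRE) by default, in which \( and \) form a group and plain parentheses are literal; with sed -E (extended regex) you can write the parentheses unescaped. The groups do not have to be nested. You can capture them separately, but then you have to adjust the rest of the expression. Many roads lead to the goal ;-) – UtLox Mar 22 '19 at 07:09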
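On the follow-up about backslashes and nesting: a variant with extended regular expressions (sed -E) and three separate, non-nested groups could look like the sketch below. The function name build_mv_commands is made up for illustration, and the pattern assumes whitespace-separated "checksum filename" lines as input. Since group \2 no longer contains the extension, the replacement has to join \2 and \3 to get the original filename.

```shell
# Hypothetical helper: builds mv commands using extended regex (-E),
# so the grouping parentheses need no backslashes.
#   \1 = checksum, \2 = filename without extension, \3 = extension
build_mv_commands() {
  sed -E 's/^([^[:space:]]*)[[:space:]]+(.*)(\.[^.]*)$/mv -v \2\3 \1\3/'
}

printf 'c7e06c1a\tabc.txt\n' | build_mv_commands
# prints: mv -v abc.txt c7e06c1a.txt
```

This only prints the commands; nothing is renamed until the output is executed.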
1

This might work for you (GNU parallel):

crc32 * | parallel --plus -C '\t' mv -v {2} {1}.{2+.}
potong