BASH: dealing with duplicates using md5sum

Question

I managed to separate the hash from the file path while finding duplicates in my directory. My next task is to print only the duplicates (e.g. 3 identical files, 2 duplicates).

So far I have placed the output in an array like this:

arr=( $(find "$1" -type f -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate) )

I need to count the number of duplicates of each file (not the original), get the size of each, and list their paths.

I tried a for loop over arr, but I can't compare the hashes: it fails with a "value too great for base" error.

Any tips would be great.

SOLVED

for ((i = 0; i < ${#arr[@]}; i++))
do
    # ...compare here
done

What does your for loop look like? How are you comparing hashes? — Digital Trauma, Oct 28 '13 at 04:15
"value too great for base" sounds like you're trying to compare the hashes as integers, while what you should be doing is string compares. As @DigitalTrauma said, show your code. — Gordon Davisson, Oct 28 '13 at 04:37
My problem is actually solved. I had to use the C-style for loop instead of "for i in arr". Then I simply use ${arr[i]} to read each element and compare! — Atieh, Oct 28 '13 at 11:56
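
The "value too great for base" error mentioned in the comments is what bash reports when a string with a leading 0 and a non-octal digit reaches arithmetic evaluation, because a leading 0 marks an octal literal. Below is a minimal sketch of the string comparison suggested above. The helper name same_hash and the sample hash value are illustrative assumptions, not part of the original post.

```bash
#!/usr/bin/env bash
# Compare two MD5 hashes as strings, never as numbers.
# same_hash is a hypothetical helper name used only for this sketch.
same_hash() {
    # [[ ... == ... ]] with quoted operands performs a plain string comparison,
    # so no arithmetic evaluation (and no octal parsing) takes place.
    [[ "$1" == "$2" ]]
}

a="09f7e02f1290be211da707a266f153b3"   # sample value, for illustration only
b="09f7e02f1290be211da707a266f153b3"

if same_hash "$a" "$b"; then
    echo "duplicate"
else
    echo "different"
fi
```

Numeric operators such as -eq, or a comparison inside (( )), would try to parse the hash as a number, which is the likely source of the error in the question.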

score 0 · Accepted Answer · answered Dec 31 '13 at 10:16


This is the solution for looping:

for ((i = 0; i < ${#arr[@]}; i++))
do
    # ...compare here
done

Inside the loop, use ${arr[i]} to read each element. The spaces around = and < inside (( )) are optional, so either spacing style works.
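
For the original goal (count the duplicates, get each one's size, and list the paths), one possible sketch reads the sorted md5sum output line by line instead of word-splitting it into an array, so paths containing spaces survive. It assumes GNU md5sum, sort, xargs and stat, and file names without newlines or backslashes. The function name list_duplicates is hypothetical, and the "original" is simply whichever path sorts first within a group.

```bash
#!/usr/bin/env bash
# list_duplicates DIR: print size and path of every file whose content
# matches an earlier file in sorted order, then a total count.
# Sketch only: assumes GNU tools and no newlines/backslashes in file names.
list_duplicates() {
    local dir=$1 prev="" hash path line count=0
    while IFS= read -r line; do
        hash=${line%% *}     # text before the first space: the 32-char hash
        path=${line:34}      # skip the hash plus md5sum's 2-char separator
        if [[ "$hash" == "$prev" ]]; then
            # Same hash as the previous line (string compare): a duplicate.
            printf '%s bytes  %s\n' "$(stat -c %s -- "$path")" "$path"
            count=$((count + 1))
        fi
        prev=$hash
    done < <(find "$dir" -type f -print0 | xargs -0 -r md5sum | sort)
    echo "duplicates: $count"
}
```

Calling list_duplicates "$1" from a script would then report the duplicates under the directory given as the first argument. The loop reads from process substitution rather than a pipe so that count is still set when the loop ends.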


Atieh
