I have multiple directories containing multiple images, and some of these directories have duplicate images. I want to find all duplicate images within the same directory and delete them. Below is my code.

I'm having a problem deleting the duplicate images. The code can identify duplicate files, but when it tries to delete them it shows this message: "rm: cannot remove 'FILENAME': No such file or directory"

for dir in *; do
    count=1
    for file in $dir/*.*; do
        md5sum * | sort | awk 'BEGIN{lasthash = ""} $1 == lasthash {print $2} {lasthash = $1}' | xargs rm
        let count=count+1
    done
done
user2334436

1 Answer

The following excerpt from the xargs manpage might explain what you see:

find /tmp -name core -type f -print | xargs /bin/rm -f

Find files named core in or below the directory /tmp and delete them. Note that this will work incorrectly if there are any filenames containing newlines or spaces.

 find /tmp -name core -type f -print0 | xargs -0 /bin/rm -f

Find files named core in or below the directory /tmp and delete them, processing filenames in such a way that file or directory names containing spaces or newlines are correctly handled.

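If a file has a name with spaces, say my vacation in thai.jpg, xargs by default splits it at whitespace and passes each piece to rm as a separate argument. rm then tries to delete four files that do not exist, which produces the "No such file or directory" messages:

rm my vacation in thai.jpg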
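
The splitting can be observed without deleting anything by letting xargs run printf instead of rm. This is only a small illustration; the file name is the example from above, and the brackets are added just to make the argument boundaries visible.

```shell
# Show how xargs tokenizes a file name that contains spaces.
name='my vacation in thai.jpg'

# Default mode: whitespace separates arguments, so printf receives four of them.
printf '%s\n' "$name" | xargs printf '[%s]\n'
# prints:
# [my]
# [vacation]
# [in]
# [thai.jpg]

# Null-delimited mode: the whole name arrives as a single argument.
printf '%s\0' "$name" | xargs -0 printf '[%s]\n'
# prints:
# [my vacation in thai.jpg]
```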

You need to make awk print null-terminated strings and use xargs -0 to consume them. An answer to the question "How can I output null-terminated strings in Awk?" suggests this line:

  awk '{printf "%s\0", $0}'
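
Putting the pieces together, a minimal sketch of the whole pipeline might look like the following. It rests on several assumptions: GNU md5sum and an xargs that supports -0 are available, file names contain no newlines or backslashes, and the function name remove_duplicates is made up for this example. Two details differ from the code in the question. In the question, print $2 emits only the first word of a name that contains spaces, so the sketch prints everything after the hash instead. It also converts newlines to NUL bytes with tr rather than printing "\0" from awk, because awk implementations may differ in how they treat NUL bytes in strings.

```shell
#!/usr/bin/env bash
# Sketch: delete duplicate files inside one directory, keeping the first file
# (in sorted order) for every distinct MD5 hash.
# Assumptions: GNU md5sum, xargs with -0, no newlines or backslashes in names.
# remove_duplicates is a hypothetical helper name, not an existing command.

remove_duplicates() {
    local dir=$1
    (
        # Work inside the directory so md5sum sees that directory's files.
        cd "$dir" || exit 1
        # md5sum prints "<32-char hash>  <name>", so the name starts at column 35.
        md5sum -- * 2>/dev/null | LC_ALL=C sort |
            awk '{ hash = $1 "" }                          # force string comparison
                 hash == lasthash { print substr($0, 35) } # same hash as previous line
                 { lasthash = hash }' |
            tr '\n' '\0' |      # newline-terminated names -> NUL-terminated names
            xargs -0 rm -f --
    )
}

# Usage, run from the directory that holds the image directories:
#   for dir in */; do remove_duplicates "$dir"; done
```

Replacing rm -f -- with printf '%s\n' in the last stage gives a dry run that only lists the files that would be removed.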
Grigory Rechistov