
I need to create a clone of a directory tree so I can clean up duplicate files.

I don't need copies of the file contents, just the file entries themselves, so I want to create a matching tree of hard links.

I threw this together in a couple of minutes when I realized my backup was going to take hours.

It just echoes the commands, which I redirect to a file to examine before I run them.

Of course, the usual problems, like file and directory names containing quotes or commas, have not been addressed (bash scripting sucks for this, doesn't it? This, and files with leading dashes).

Isn't there some utility that already does this in a robust fashion?

BASEDIR=$1
DESTDIR=$2
for DIR in `find "$BASEDIR" -type d` 
do
   RELPATH=`echo $DIR | sed "s,$BASEDIR,,"`
   DESTPATH=${DESTDIR}/$RELPATH
   echo mkdir -p \"$DESTPATH\"
done

for FILE in `find "$BASEDIR" -type f` 
do
   RELPATH=`echo $FILE | sed "s,$BASEDIR,,"`
   DESTPATH=${DESTDIR}/$RELPATH
   echo ln \"$FILE\" \"$DESTPATH\"
done
user939857

1 Answer

Using find like that is generally a bad idea: the for loop splits find's output on whitespace, yet every form of whitespace is valid in filenames on most UNIX systems. find itself can run a command on each file it finds, which is generally the better thing to use. I would suggest doing something like this (I'd use a couple of scripts for simplicity; I'm not sure how easy it would be to do it all in one):

main.sh:

BASEDIR="$1" #I tend to quote all variables - good habit to avoid problems with spaces, etc.
DESTDIR="$2"
find "$BASEDIR" -type d -exec ./handle_file.sh \{\} "$BASEDIR" "$DESTDIR" \; # \{\} is replaced with the filename, \; tells find the command is over
find "$BASEDIR" -type f -exec ./handle_file.sh \{\} "$BASEDIR" "$DESTDIR" \;

handle_file.sh:

FILENAME="$1"
BASEDIR="$2"
DESTDIR="$3"
RELPATH="${FILENAME#"$BASEDIR"}" # strip the BASEDIR prefix; the inner double quotes stop BASEDIR being interpreted as a pattern
DESTPATH="${DESTDIR}/$RELPATH"
if [ -f "$FILENAME" ]; then
  echo ln \""$FILENAME"\" \""$DESTPATH"\"
elif [ -d "$FILENAME" ]; then
  echo mkdir -p \""$DESTPATH"\"
fi

I've tested this with a simple tree with spaces, asterisks, apostrophes and even a carriage return in filenames and it seems to work.

Obviously remove the escaped quotes and the "echo" (but leave the real quotes) to make it work for real.
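
For what it's worth, the two scripts can probably be folded into one by handing find a small inline sh -c script. The following is only a minimal sketch under a few assumptions: the function name link_tree is made up for illustration, it runs the commands directly instead of echoing them, and it relies on find supporting -exec ... {} + and on mkdir and ln accepting -- to end option parsing.

```shell
#!/bin/bash
# Hypothetical helper (the name is an assumption, not from the scripts above):
# mirror the directory tree under $1 into $2 and hard-link every regular file.
link_tree() {
  local basedir="$1"
  local destdir="$2"

  # Directories first, so each link's parent directory already exists.
  find "$basedir" -type d -exec sh -c '
    base=$1; dest=$2; shift 2
    for dir in "$@"; do
      rel=${dir#"$base"}          # path relative to the source tree
      mkdir -p -- "$dest/$rel"
    done' sh "$basedir" "$destdir" {} +

  # Then the files: ln without -s creates a hard link.
  find "$basedir" -type f -exec sh -c '
    base=$1; dest=$2; shift 2
    for file in "$@"; do
      rel=${file#"$base"}
      ln -- "$file" "$dest/$rel"
    done' sh "$basedir" "$destdir" {} +
}
```

It would be called as link_tree "$1" "$2" at the end of the script. Because the filenames arrive as separate arguments and are never re-parsed by the shell, spaces, quotes and asterisks should not need special handling. If GNU coreutils is available, cp -al "$BASEDIR" "$DESTDIR" is also worth a look, since cp's -l option makes hard links instead of copying; note that it copies into DESTDIR if that directory already exists.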

Muzer
  • Good solution, but for `find`s that support it, `-print0 .... | xargs -0 myMvCmd ...` can handle just about any filename. Good luck to all. – shellter Sep 27 '14 at 22:24
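
The -print0 / xargs -0 approach from the comment could look roughly like the sketch below. The function name link_tree_xargs is invented for illustration, and the sketch assumes both find -print0 and xargs -0 are available (they are common extensions, but not guaranteed everywhere). Filenames travel through the pipe separated by NUL bytes, which cannot occur in a filename, so even newlines in names should survive.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption): same job as the scripts above,
# but passing NUL-separated paths from find to xargs.
link_tree_xargs() {
  local basedir="$1"
  local destdir="$2"

  # Create the directory skeleton first.
  find "$basedir" -type d -print0 |
    xargs -0 sh -c '
      base=$1; dest=$2; shift 2
      for dir in "$@"; do
        rel=${dir#"$base"}
        mkdir -p -- "$dest/$rel"
      done' sh "$basedir" "$destdir"

  # Hard-link each regular file into the skeleton.
  find "$basedir" -type f -print0 |
    xargs -0 sh -c '
      base=$1; dest=$2; shift 2
      for file in "$@"; do
        rel=${file#"$base"}
        ln -- "$file" "$dest/$rel"
      done' sh "$basedir" "$destdir"
}
```

xargs appends each batch of paths after the two fixed arguments, so the inline script peels off the base and destination with shift 2 and loops over whatever remains.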