
I copied and re-sorted nearly 1TB of files on a Drobo using a `find . -name \*.%ext% -print0 | xargs -I{} -0 cp -v {} %File%s` command. I need to make sure all the files copied correctly. This is what I have so far:

#!/bin/sh
# List the basename of every file under the current directory.
find . -type f -exec basename {} \; > Filelist.txt
# Sort in place so uniq can count adjacent repeats.
sort -o Filelist.txt Filelist.txt
# Write each unique name with its occurrence count to Duplist.txt.
uniq -c Filelist.txt Duplist.txt
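For reference, Duplist.txt then holds each unique basename with an occurrence count, along these lines (illustrative filenames):

   2 clip_001.mov
   1 render_final.mov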

I need to find a way to get the checksum for each file as well as making sure all of them are duplicated. The source folder is in the same directory as the copies; it is arranged as follows:

_Stock
  _Audio
  _CG
  _Images
  _Source (the files in all other folders come from here)
  _Videos

I'm working on OSX.
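The kind of per-file check I have in mind looks something like this, only automated across the whole tree (`example.jpg` is just a placeholder name; `shasum` ships with OSX):

shasum _Source/example.jpg _Images/example.jpg

If both lines print the same hash, that copy is good; I need that comparison for every file in every folder.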

VorTechnix
  • The only way I can think of may get really messy. Have you looked into whether there's an already-built application that can do this? – jgr208 Jul 10 '17 at 14:17
  • That's a scary-looking command. Why not re-do the whole thing using `rsync`? Test just one folder until you get the right settings and can watch rsync skip over the existing files; then it should (depending on your infrastructure) fly right through the list, skipping the ones that were already copied over. (A checksum-based verification sketch follows these comments.) – hmedia1 Jul 10 '17 at 14:59
  • i.e. `find ./ -name \*.%ext% -print0 | rsync -avvhP --files-from=- --from0 ./ /destination`; see topics like this: https://unix.stackexchange.com/questions/87018/find-and-rsync – hmedia1 Jul 10 '17 at 15:03
  • @hmedia1 Thanks for the `rsync` idea, but I'm trying to re-sort the files, not back them up. I was able to clean up the code a little by using `-exec basename {} \;`, but I'm still trying to figure out how to checksum. – VorTechnix Jul 10 '17 at 15:57
  • @jgr208 My big problem is that I'm working on OSX on a tight budget, so any programs that were even remotely close to what I need are too expensive. But hey, if you know one that isn't, please let me know. – VorTechnix Jul 10 '17 at 15:58
  • @hmedia1 I looked a little deeper into `rsync` and I think it might work. Thank you. Now here's to hoping... – VorTechnix Jul 10 '17 at 16:36
  • https://sourceforge.net/projects/crcsum/ is one that comes to mind – jgr208 Jul 10 '17 at 16:55
  • @jgr208 Thanks man, that's a big help. – VorTechnix Jul 10 '17 at 23:19
  • With the help of a friend I built a `find` invocation to complete the task: `find . \( ! -regex '.*/\..*' \) -type f -exec shasum {} \; -exec basename {} \; | cut -c -40 | sed 'N;s/\n/ /' > Filelist.txt`. We'll see if it works, but I think it might be the answer. – VorTechnix Jul 12 '17 at 18:17
  • Thank you all for your help. I was able to build a script that will do the operation better next time. Final code is in the answer. – VorTechnix Jul 21 '17 at 15:43
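As the `rsync` comments above suggest, a checksum-based dry run is another way to verify the copies directly. A rough sketch, not from the thread, assuming the folder layouts mirror each other (`--existing` limits rsync to files already present on the destination, so it flags corrupted copies but not missing ones):

rsync -rcnv --existing _Source/ _Images/

Apart from the summary lines, any filename printed should be a copy in _Images whose checksum no longer matches its original in _Source; repeating the command for each sibling folder covers the whole set.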

1 Answer

#!/bin/sh
# shasum prints "<sha1>  <path>" and basename prints the bare filename on
# the next line; cut trims every line to 40 characters (the bare hash for
# shasum lines) and sed joins each hash/name pair onto a single line.
find . \( ! -regex '.*/\..*' \) -type f -exec shasum {} \; -exec basename {} \; | cut -c -40 | sed 'N;s/\n/ /' > Filelist.txt
sort -o Filelist.txt Filelist.txt
# Count identical hash+name pairs, then sort the counts for readability.
uniq -c Filelist.txt Duplist.txt
sort -o Duplist.txt Duplist.txt

The regex expression excludes hidden files. The `shasum` and `basename` actions each write their own line to the output, so the stream is piped to `cut` (which trims the `shasum` line down to the bare 40-character hash, and also truncates any basename longer than 40 characters) and then to `sed` (which merges each hash/name pair onto a single line) so that the `sort` and `uniq` commands can parse them. The script is messy, but it got the job done quite nicely.
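One follow-up check the answer doesn't spell out (a sketch, assuming each file should appear exactly twice: once in _Source and once in its sorted folder): since `uniq -c` puts the count in the first column, every hash+name pair in Duplist.txt should carry a count of 2, and any other count flags a file that failed to copy or was copied more than once.

awk '$1 != 2' Duplist.txt > Suspect.txt

An empty Suspect.txt means every source file has exactly one checksum-identical copy.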

VorTechnix