
My bash scripting knowledge is very weak, which is why I'm asking for help here. What is the most efficient bash script, performance-wise, to find and copy files from one Linux server to another according to the specifications described below?

I need a bash script which finds only new files created on server A in directories named "Z" within the last 0 to 10 minutes, then transfers them to server B. I think it can be done by building and executing a command for each new file found: "scp /X/Y.../Z/file root@hostname:/X/Y.../Z/". If the script finds no such remote path on server B, it should skip that file and continue with the next file whose directory does exist. Files should be copied with their permissions, group, owner, and creation time preserved.

X/Y... stands for various directory paths. I want to set up a cron job that executes this script every 10 minutes, so performance is very important in this case.

Thank you.

driftux

3 Answers


rsync may be suitable for your needs; check it out before you script a poor copy of it. Otherwise, the find command can be used to locate files by name and age, and scp can then be run on each file it finds, as sketched below.
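A minimal sketch of that find-plus-scp route follows. The /home search root is an assumption based on the layout mentioned in the comments below, and note that scp -p preserves times and permissions but not owner/group, which is another point in rsync's favor:

# Find every directory named Z, then copy each file modified in the
# last 10 minutes to the same path on server B.
find /home -type d -name Z -print0 | while IFS= read -r -d '' dir; do
    find "$dir" -maxdepth 1 -type f -mmin -10 -print0 | while IFS= read -r -d '' file; do
        # scp exits non-zero if the matching directory is missing on
        # server B, so just log the failure and move on to the next file.
        scp -p "$file" "root@hostname:$dir/" || echo "skipped $file" >&2
    done
done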

ramruma

find . -type d -name Z -print0 | while IFS= read -r -d '' d; do find "$d" -maxdepth 1 -type f -mmin -10 -print0 | rsync -av --files-from=- --from0 ./ root@hostname:; done
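Note that rsync's -a implies -p, -t, -g, and -o, so permissions, times, group, and owner are all preserved as the question requires; preserving ownership needs the receiving side to run as root, which root@hostname satisfies.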

quanta
  • Thank you for the script, but I logged in to server A, uploaded a new file named "testfile" into one of the Z directories (I have hundreds of them on the server), then tried to execute the command "find . -type f -amin -10 -name Z" from the shell (standing in the /root directory) and got no result :( I don't know if it helps, but all Z directories are located in similar paths with the structure /home/*/personal/ (* means a different user name) – driftux Sep 18 '12 at 18:35
  • @driftux: Sorry, I read your question too fast. Updated my answer. – quanta Sep 19 '12 at 02:31
  • Thank you quanta, I would like to upvote, but the system isn't letting me do so. I didn't try the rsync part, but finding the necessary files works well! Thank you very much; I really appreciate it. – Sep 19 '12 at 08:26
  • quanta, is it possible to improve this script a little bit more? Can the script ignore hidden system files and directories? And if the script finds files in a Z directory, can it add the prefix "old_" to each one? – Sep 19 '12 at 08:33

rsync is your best bet and will be the most efficient option. Specifically, you want to make sure you preserve times by using the -t option, but the -a option (archive, which includes -t) is usually the best place to start. (If you're on Mac OS X, it's best to also include the -E option to preserve extended attributes and resource forks.) Read the man page, perform all your testing with the -n/--dry-run option so that you aren't actually committing any changes to disk, and pay attention to how you use or omit trailing slashes on directory names, as they change rsync's behavior.
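As a sketch (the /home/user/personal/Z path is hypothetical, modeled on the layout described in the comments above):

# Dry run first: -a preserves permissions, owner, group, and times;
# the trailing slash on the source means "the contents of Z", not Z itself.
rsync -avn /home/user/personal/Z/ root@hostname:/home/user/personal/Z/
# Once the dry-run output looks right, drop -n to copy for real:
rsync -av /home/user/personal/Z/ root@hostname:/home/user/personal/Z/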

One caveat: find's -mtime test only compares times with the granularity of a day, so it can't select files created within the last 10 minutes on its own; you'll need an implementation that supports -mmin (GNU find does) to get minute-level granularity.

So, a number of rsync cron jobs, or a script which performs all the necessary rsyncs, is your best bet; see the crontab sketch below. If you have tons of small files to synchronize, 10 minutes may prove too short a period, but it all depends on your specific data set, so do some testing. Naturally, the initial sync will take longer, so perform it manually before scheduling your cron jobs.
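For the scheduling part, a crontab entry along these lines runs a wrapper script every 10 minutes; the script and log paths are hypothetical:

# m h dom mon dow command
*/10 * * * * /usr/local/bin/sync_z_dirs.sh >> /var/log/sync_z.log 2>&1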

morgant