15

Is rsync a good choice for my project?

I have to:
- copy files from a source to a destination folder via SSH,
- be sure all files are copied,
- delete the source files after the copy,
- rename files if there is a name conflict.

It looks like I can use the option --remove-source-files (to delete the source files).
But how does rsync handle conflicts? Can I add rules?

Use case in my project:

I run scientific calculations on server A and the results are written to the folder "process". Each calculation gets its own directory, like this: /process/calc1.
Now I would like to transfer the directory "calc1" to server B (so that I get /process/calc1 there), and delete "calc1" from server A.
During another calculation I get "/process/calc2" on server A. The idea is to move "calc2" into the "/process/" directory on server B as well, so that server B now has:
- /process/calc1
- /process/calc2
(and /process/ on server A is empty).

How will rsync handle the conflict on server B if a new calculation produces another "/process/calc1" on server A when "/process/calc1" already exists on server B?

Is it possible to add rules to rsync so that the new "/process/calc1" is renamed to "/process/calc1R2" on server B? And so on (e.g. calc1R3)?

Thanks.

user44782

3 Answers

11

If you really want to use rsync, it sounds like you'll need some combination of --backup, --backup-dir, and --suffix. The closest I think you could get is with something like this:

rsync -abv --suffix R1 --remove-source-files src/ dst/

This would do close to what you want, but it would not rename the files exactly the way you'd want. With --backup, the --suffix option appends text to the name of a file that already exists at the destination, but the suffix is the same on every run, so this only protects you on the first conflict. If you ran it again, it would just overwrite your first backup. You'd have to change the suffix value each time the command ran, which would work if you used something with a timestamp, such as this:

rsync -abv --suffix "$(date +%Y%m%d%H%M%S)" --remove-source-files src/ dst/

I'm not sure if this is overkill for what you're after, but it should meet your requirements.
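
If this ends up in a script, the timestamped variant could be wrapped roughly as follows. This is only a sketch: the function names backup_suffix and move_with_backup are made up for illustration, and the leading dot in the suffix is my own choice. Note that --remove-source-files removes files only, so empty directories are left behind on the source side.

```shell
#!/bin/sh
# Build a suffix that is unique per run (to the second), e.g. ".20100603165900".
# %H is the zero-padded hour, so the suffix never contains a space.
backup_suffix() {
    printf '.%s\n' "$(date +%Y%m%d%H%M%S)"
}

# Move the contents of $1 into $2. A file that already exists in $2 is
# renamed with the suffix before the new version is written.
# (Hypothetical helper; $2 may also be a remote target such as user@host:/path.)
move_with_backup() {
    src=$1
    dst=$2
    rsync -ab --suffix="$(backup_suffix)" --remove-source-files "$src"/ "$dst"/
}
```

It would be called as, for example, move_with_backup /process user@serverB:/process, where the user and host names are placeholders.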

Paul Kroon
  • I can have big files after a calculation, so it's probably better to use rsync (in case of network trouble). – user44782 Jun 03 '10 at 16:59
1

As the name implies, rsync is used for synchronizing files. "Synced" means that the files on the source and the destination are the same. That does not seem to be what you want to do.

It seems like you just want to move some files. You don't need rsync for that. It looks like you are using Linux or BSD, so you could use mv -n over ssh. The -n option does not overwrite existing files. This is not 100% automatic.

However, I don't see how the file could already exist in your case. The files will be copied from the source to the destination and then removed from the source. Do you want to run the same calculations again? Is that why you will end up with files with the same name? I'd suggest appending a run or batch number to the folder name. You'd want that to be clear anyway. Do you have any control over how the folder is named? Any more details? I'd recommend putting the commands in a bash script or similar.

d-_-b
  • In some cases I have to run the same calculation again (and that's why I end up with files with the same name). You're right, appending a run number is a good idea. I also found the command mmv -a (I hope it works over ssh; has anybody already used this command?). I have control over how the folder is named. – user44782 Jun 03 '10 at 16:24
  • It's most likely not installed on the machine. If it is your machine, you could install it though. Debian-based: sudo apt-get install mmv. You could also look into sshfs or NFS and have the files written directly to the final destination, unless there is a need for the intermediate location (inspection, modification, etc.). If the link between the two machines is unreliable, that is a bad idea of course. Paul's suggestion is good, though appending the run number from the outset would probably be more organized. – d-_-b Jun 03 '10 at 21:55
  • How do you use "mv -n" over ssh? You can use mv over sshfs. – guettli Aug 21 '12 at 13:26
  • Oops... Yes, that makes it clearer. – d-_-b Aug 22 '12 at 00:17
-1

For SSH, in summary, use this:

Access via remote shell:

Pull: rsync [OPTION...] [USER@]HOST:SRC... [DEST]

Push: rsync [OPTION...] SRC... [USER@]HOST:DEST

It's all explained in rsync(1).

As for scripting it for a cronjob, to rsync over ssh automatically without requiring a password, look into ssh-agent(1) too.