I have two remote folders containing a few files, and I am using rsync to sync them. On the first server, the files are strictly only appended to with new data every day.

When I call rsync it seems to me that it copies the entire file again.

Is my call below correct?

  rsync -rtvu src_fld/  user@myserver:/opt/dst_fldr/

My understanding is that rsync is able to compute the difference between the two files, so I was expecting a very quick update.

1 Answer

rsync does many things. In your case, it is likely building file lists on both sides, comparing them, finding the files to transfer, reading those files on both sides, calculating rolling checksums on both sides, exchanging the checksum information, and transmitting the differing blocks. This process takes time, especially if you have large files (gigabyte scale) or a large number of files (on the order of hundreds of thousands). Because of the significant computational and I/O overhead on both the sender and the receiver, it will not necessarily speed up a transmission; it is just likely to reduce the amount of data transferred over the link.
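One way to check how much data really crosses the link is rsync's `--stats` summary. With `-v`, rsync prints the name of every file it updates, even when only a delta is sent, so the file listing alone can look like a full copy. The sketch below keeps the flags from the question and adds `--stats`; the wrapper name `sync_with_stats` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper (the name is an assumption): sync_with_stats SRC DST
# Same flags as in the question, plus --stats for a transfer summary.
sync_with_stats() {
    rsync -rtvu --stats "$1" "$2"
}

# Usage with the paths from the question (commented out; needs the remote host):
# sync_with_stats src_fld/ user@myserver:/opt/dst_fldr/ \
#     | grep -E 'Literal data|Matched data|Total bytes sent'
```

In the summary, "Literal data" is what was actually sent as file content and "Matched data" is what was reconstructed from blocks already present on the receiver. A small "Literal data" figure next to a large "Matched data" figure would suggest the delta transfer is working and the time is going into reading and checksumming rather than into the network.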

If the only file changes in your case are appends (as with growing logs), consider rsync's --append-verify option, which skips the computationally intensive rolling checksum calculation entirely and just transfers the tail of the larger file to fill up the smaller one. After the transfer, it also verifies that the files on the sender and receiver are identical by running a whole-file checksum on each.
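Applied to the command from the question, this could look like the sketch below. It assumes rsync 3.0.0 or later on both ends (the version that introduced `--append-verify`); the wrapper name `sync_appended` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper (the name is an assumption): sync_appended SRC DST
# Keeps the flags from the question and adds --append-verify,
# which assumes rsync 3.0.0 or later on both ends.
sync_appended() {
    rsync -rtvu --append-verify "$1" "$2"
}

# Usage with the paths from the question (commented out; needs the remote host):
# sync_appended src_fld/ user@myserver:/opt/dst_fldr/
```

Two caveats, going by the rsync man page: the option assumes the existing destination content is a prefix of the source file, and the verification checksum covers the existing data as well. Both sides therefore still read the whole file, and a file that fails verification is resent with a normal in-place transfer. Plain `--append` skips that check, at the cost of not noticing a destination that has diverged.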

the-wabbit
  • That's a great explanation of how it works. Thanks a lot! – Abruzzo Forte e Gentile Jul 04 '13 at 10:06
  • append-verify does not appear to take any less time either... I have a 40GB file I am trying to transfer off a sick drive which periodically hangs and must be reset, at a much shorter time interval than it would take to copy the whole file off. When I restart with "--inplace --partial --append-verify --progress" it reports an ETA no less than the full transfer was estimated to take, despite half the file already having been transferred. I don't see the data going anywhere, so it must be trying to read the whole file back off the drive to check it before sending... – Michael May 21 '16 at 07:49