17

When using rsync, sometimes not all of the files get copied. Below is the code I use. Is there a way to do a checksum or some other check after the rsync run to see whether all the files have been copied and, if not, try again until they have?

TEMP="/home/user/temp"
OPTS="-rav -h"

rsync $OPTS --stats user@example.com:/home/user/Local $TEMP
Oded
Grimlockz
  • Does anyone know how to produce the situation when "`rsync` doesn't copy all the files"? – Brian Nov 14 '19 at 06:45

3 Answers

28

As hinted at by uʍop ǝpısdn's answer, `rsync -c` or `rsync --checksum` may do what you need.

-c, --checksum: skip based on checksum, not mod-time & size

This forces the sender to checksum every regular file using a 128-bit MD4 checksum. It does this during the initial file-system scan as it builds the list of all available files. The receiver then checksums its version of each file (if it exists and it has the same size as its sender-side counterpart) in order to decide which files need to be updated: files with either a changed size or a changed checksum are selected for transfer. Since this whole-file checksumming of all files on both sides of the connection occurs in addition to the automatic checksum verifications that occur during a file's transfer, this option can be quite slow.

Note that rsync always verifies that each transferred file was correctly reconstructed on the receiving side by checking its whole-file checksum, but that automatic after-the-transfer verification has nothing to do with this option's before-the-transfer "Does this file need to be updated?" check.

The concerns about this being slow are probably less relevant on modern hardware, although both sides still have to read every file in full. This seems to be a good option when you can't or don't want to rely on modification times.
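
To retry until everything is across, as the question asks, one option is to wrap the call in a loop that re-runs rsync with --checksum until it exits with status 0 (rsync reports partial transfers with a non-zero status such as 23). This is only a minimal sketch, assuming a POSIX shell; `retry_rsync`, `MAX_TRIES` and `DELAY` are names made up for illustration and are not part of rsync:

```shell
#!/bin/sh
# Hypothetical knobs for this sketch (not rsync options).
MAX_TRIES="${MAX_TRIES:-5}"   # give up after this many attempts
DELAY="${DELAY:-10}"          # seconds to wait between attempts

# Re-run rsync until it exits 0 or MAX_TRIES is reached.
# Arguments are passed straight through to rsync (source, destination, extras).
retry_rsync() {
    tries=0
    while [ "$tries" -lt "$MAX_TRIES" ]; do
        tries=$((tries + 1))
        # --checksum decides what to send by file content, not mod-time and size.
        if rsync -avh --checksum --stats "$@"; then
            echo "rsync succeeded on attempt $tries"
            return 0
        fi
        echo "rsync failed (attempt $tries of $MAX_TRIES)" >&2
        if [ "$tries" -lt "$MAX_TRIES" ]; then
            sleep "$DELAY"
        fi
    done
    return 1
}

# Example call, using the paths from the question:
#   retry_rsync user@example.com:/home/user/Local /home/user/temp
```

Each pass only transfers files that are missing or whose checksums differ, so a rerun after a network hiccup should pick up where the last one left off.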

Community
Tom
    Useful when working in Git and switching between branches with changed files, which keeps changing the update times on files that you don't intend to send from a particular branch. – OCDev Apr 20 '15 at 10:27
5

I think this is best solved by configuring rsync properly. Read the man page :) There are options for this, such as --checksum.

You can do this on your own as well:

  1. find all files in the rsync'd directory.
  2. xargs md5sum to get a checksum for all files
  3. md5sum the checksums

If you do that on both sides (local and remote), you'll have a checksum from each side to compare.
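
The three steps above can be sketched roughly as follows. This assumes GNU find, sort, xargs and md5sum (for `-print0`, `-z`, `-0` and `-r`); `tree_checksum` is a made-up helper name:

```shell
#!/bin/sh
# Print a single MD5 digest summarising every regular file under directory $1.
tree_checksum() {
    # cd first so the recorded paths are relative and comparable across hosts.
    ( cd "$1" &&
      find . -type f -print0 |   # 1. find all files
        LC_ALL=C sort -z |       #    fixed order, so both sides list alike
        xargs -0 -r md5sum |     # 2. checksum each file
        md5sum |                 # 3. checksum the list of checksums
        cut -d ' ' -f 1 )
}

# Example, assuming the paths from the question (rsync without a trailing
# slash on the source creates /home/user/temp/Local on the receiving side):
#   tree_checksum /home/user/temp/Local
```

Running the same pipeline on the remote host, for instance over ssh, and comparing the two digests tells you whether the trees match. It does not tell you which file differs, so it is more of a yes/no check.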

salezica
  • Thanks for the quick reply - I'm reading the man page for rsync now - to be fair we have a lot of options here - is it checksum I should be looking at? – Grimlockz Apr 03 '13 at 13:03
  • Why is `rsync` not copying those files in the first place? It should. I'm not withholding information here, I don't remember the flags either – salezica Apr 03 '13 at 13:30
  • Rsync is copying the files, just not always on the first attempt (might be down to network issues, etc.), but I would like to re-run the command with checksum to verify things – Grimlockz Apr 03 '13 at 13:32
  • (You can do the checksum as I described above, `find | md5sum`). Or [see here](http://blog.iangreenleaf.com/2009/03/rsync-and-retrying-until-we-get-it.html) – salezica Apr 03 '13 at 13:39
3

Use rsync -Pahn --checksum /path/to/source /path/to/destination | sed '/\/$/d' | tee migration.txt

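The -n flag makes this a dry run, so nothing is copied: rsync only lists the files whose checksums differ. sed strips directory entries (lines ending in /) from that list, and tee writes the result to the screen and to migration.txt at the same time.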

Keep in mind that this might not be a suitable method if you have very large files, as the verification will take a long time.
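
If you want a script to act on the result rather than read migration.txt by eye, the same dry-run idea can be turned into a pass/fail check. A rough sketch, assuming a POSIX shell; `verify_copy` is a hypothetical helper, and it uses `-rcn` with `--out-format='%n'` (print only item names) instead of the `-Pahn` flags above:

```shell
#!/bin/sh
# Report whether a checksum-based dry run would still transfer anything.
# $1 = source, $2 = destination. Nothing is copied because of -n.
# Returns 0 if nothing is pending, 1 if files differ, 2 if rsync failed.
verify_copy() {
    # Capture rsync's own exit status before any filtering.
    listing=$(rsync -rcn --out-format='%n' "$1" "$2") || return 2
    # Drop directory entries (trailing /) and empty lines.
    pending=$(printf '%s\n' "$listing" | sed -e '/\/$/d' -e '/^$/d')
    if [ -z "$pending" ]; then
        echo "all files match"
        return 0
    fi
    echo "files still differing:"
    printf '%s\n' "$pending"
    return 1
}

# Example:
#   verify_copy /path/to/source /path/to/destination
```

A non-zero return could then be used to trigger another real rsync run.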


Gaia