
We perform nightly backups of a Windows file server by first creating an incremental backup file on the server each night (as well as a complete backup on Thursday night), and then copying that to a backup server running Linux/Ubuntu. In order to maintain off-site redundancy, we then rsync the backup directory to an external drive which is rotated after each nightly run.

Over time, the number of incremental backup files has grown steadily, although the size of the individual files has varied.

We've also noticed that the rsync process has steadily been taking longer, even though it is only copying the latest two files to the external drive.

This is the command:

rsync -vr --delete-before --log-file=/rsync_log.csv /backup/archive/ /mnt/usbdisk/  2>&1

When we tested the command, and investigated the log file, we saw this:

...
2011/02/14 23:59:35 [14054] >f..T...... IncrementalBackup_2011_02_06_20.bkf
2011/02/15 00:00:45 [14054] >f..T...... IncrementalBackup_2011_02_08_20.bkf
2011/02/15 00:03:22 [14054] >f..T...... IncrementalBackup_2011_02_09_20.bkf
2011/02/15 00:04:36 [14054] >f..T...... IncrementalBackup_2011_02_11_20.bkf
2011/02/15 00:04:51 [14054] >f..T...... IncrementalBackup_2011_02_12_20.bkf
2011/02/15 00:05:06 [14054] >f..T...... IncrementalBackup_2011_02_13_20.bkf
2011/02/15 00:06:13 [14054] >f+++++++++ IncrementalBackup_2011_02_14_20.bkf
2011/02/15 00:54:32 [14054] >f..T...... Thursday_Full_Backup_2011_01_20.bkf
2011/02/15 03:24:41 [14054] >f..T...... Thursday_Full_Backup_2011_01_27.bkf
...

We found that the time spent on each file was related to its size, even for files that were being skipped. For example, the full backup took about 2.5 hours to process, while the incrementals took about 2-3 minutes or less.

The only file actually copied is the latest incremental file.

The only explanation we can think of is that rsync is performing a checksum of each file, even though the documentation says it does not by default and we have not specified the --checksum switch on the command. Surely it can't take 2.5 hours to determine the timestamp and file size?

After having gone over the documentation, I can't find any other explanation than the checksum is being calculated. So, is there a way to be sure that checksum is disabled?

HorusKol

2 Answers


Maybe the problem is in the timestamps of the files. If the backups are created by a Windows program, it can mess up the timestamps. If I replicate your setup with a bunch of pictures, I get this log:

2011/02/15 03:46:46 [61820] delta-transmission disabled for local transfer or --whole-file
2011/02/15 03:46:46 [61820] .d..t....... ./
2011/02/15 03:46:46 [61820] IMG_0055.JPG is uptodate
2011/02/15 03:46:46 [61820] IMG_0056.JPG is uptodate
2011/02/15 03:46:46 [61820] IMG_0057.JPG is uptodate
2011/02/15 03:46:46 [61820] IMG_0058.JPG is uptodate
2011/02/15 03:46:46 [61820] IMG_0059.JPG is uptodate
2011/02/15 03:46:46 [61820] IMG_0060.JPG is uptodate
2011/02/15 03:46:46 [61820] IMG_0061.JPG is uptodate
2011/02/15 03:46:46 [61820] >f..t....... IMG_0062.JPG
2011/02/15 03:46:46 [61820] IMG_0063.JPG is uptodate
2011/02/15 03:46:46 [61820] IMG_0064.JPG is uptodate
2011/02/15 03:46:46 [61820] IMG_0065.JPG is uptodate
2011/02/15 03:46:46 [61820] IMG_0066.JPG is uptodate
2011/02/15 03:46:46 [61820] total: matches=0  hash_hits=0  false_alarms=0 data=5911343
2011/02/15 03:46:46 [61820] sent 5912367 bytes  received 67 bytes  11824868.00 bytes/sec
2011/02/15 03:46:46 [61820] total size is 75450221  speedup is 12.76

So the files are transferred every time; otherwise you would see the log message "file xxx is uptodate". Try increasing rsync's verbosity with more than one v flag to find out what's happening.
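One way to see why rsync wants to transfer a file, without copying anything, is a dry run with itemized output. The following is only a minimal sketch: the temporary directories and `sample.bkf` are made up for the demonstration, and it assumes GNU `touch` and an installed `rsync`. The `-n` (`--dry-run`) and `-i` (`--itemize-changes`) options are standard rsync flags.

```shell
#!/bin/sh
# Sketch: preview what rsync would transfer, and why, without copying.
# The directory and file names are made up for the demonstration.
if command -v rsync >/dev/null 2>&1; then
    work=$(mktemp -d)
    mkdir "$work/src" "$work/dst"
    printf 'backup data\n' > "$work/src/sample.bkf"
    touch -d '2011-02-06 20:00:00' "$work/src/sample.bkf"

    # First run copies the file; without -t the copy gets the current time.
    rsync -r "$work/src/" "$work/dst/"

    # -n (--dry-run) changes nothing; -i (--itemize-changes) prints one
    # line per item that would be updated, with flags explaining why.
    preview=$(rsync -rni "$work/src/" "$work/dst/")
    echo "$preview"
    rm -rf "$work"
else
    echo "rsync is not installed; skipping the demonstration"
fi
```

If the file shows up in the preview even though its content has not changed, the itemized flags on that line indicate which attribute rsync considers different.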

Fabio
  • Spurred me to check the filetimes on the external drive - realised that they are set to the time when the file was copied over, not the local modified time. Further, realised that we were missing `-t` - d'oh – HorusKol Feb 15 '11 at 03:32
  • Glad to hear that. Anyway, when using rsync to back something up I always use the -a switch to preserve file metadata. You should really replace your -r and -t with -a, which implies both of them plus many other useful flags. – Fabio Feb 15 '11 at 12:02
  • yeah, we looked over the other flags and determined that they weren't all necessary for our requirements - so we're sticking with: `rsync -vrt --delete-before --log-file=/rsync_log.csv /backup/archive/ /mnt/usbdisk/ 2>&1` – HorusKol Feb 16 '11 at 01:51
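
The effect of the missing `-t` can be illustrated without rsync at all. This is a rough sketch of the default "quick check" (skip a file only when size and modification time both match), not rsync's actual code. It uses plain `cp` as a stand-in for a copy that does not preserve times and `cp -p` for one that does; the file names are hypothetical and GNU `stat` and `touch` are assumed, as on Ubuntu.

```shell
#!/bin/sh
# Sketch of rsync's default "quick check": a file is skipped only when
# size and modification time both match. Uses GNU stat (as on Ubuntu).
work=$(mktemp -d)
mkdir "$work/src" "$work/no_t" "$work/with_t"

# Hypothetical backup file with an old modification time.
printf 'backup data\n' > "$work/src/sample.bkf"
touch -d '2011-02-06 20:00:00' "$work/src/sample.bkf"

# cp without -p stamps the copy with the current time (like rsync -r);
# cp -p keeps the source mtime (like rsync -rt).
cp "$work/src/sample.bkf" "$work/no_t/sample.bkf"
cp -p "$work/src/sample.bkf" "$work/with_t/sample.bkf"

# Prints "skip" when size and mtime match, "transfer" otherwise.
quick_check() {
    if [ "$(stat -c '%s %Y' "$1")" = "$(stat -c '%s %Y' "$2")" ]; then
        echo skip
    else
        echo transfer
    fi
}

result_no_t=$(quick_check "$work/src/sample.bkf" "$work/no_t/sample.bkf")
result_with_t=$(quick_check "$work/src/sample.bkf" "$work/with_t/sample.bkf")
echo "without preserved times: $result_no_t"
echo "with preserved times:    $result_with_t"
rm -rf "$work"
```

Because the copy made without preserved times never matches the source's modification time, every file fails the quick check on every run and is copied again in full, which would explain why the time per file tracked its size.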

What filesystem is the usb drive?

There is a known issue with FAT and timestamps. See the rsync man page:

   --modify-window
          When comparing two timestamps, rsync treats the timestamps as being equal if they differ by no more
          than the modify-window value. This is normally 0 (for an exact match), but you may find it useful to
          set this to a larger value in some situations. In particular, when transferring to or from an MS
          Windows FAT filesystem (which represents times with a 2-second resolution), --modify-window=1 is
          useful (allowing times to differ by up to 1 second).
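
To make the comparison concrete, here is a small sketch of the tolerance described in the man page excerpt. The `times_match` helper and both timestamp values are hypothetical, chosen only to show a one-second difference of the kind a 2-second-resolution filesystem could introduce; it is not how rsync is implemented.

```shell
#!/bin/sh
# Sketch of the --modify-window comparison: two mtimes (seconds since
# the epoch) count as equal when they differ by no more than the window.
times_match() {
    diff=$(( $1 - $2 ))
    [ "$diff" -lt 0 ] && diff=$(( -diff ))
    [ "$diff" -le "$3" ]
}

src_mtime=1297713575   # hypothetical source mtime (odd second)
fat_mtime=1297713576   # hypothetical value stored one second off

# Window 0 is the default exact match; window 1 tolerates the difference.
if times_match "$src_mtime" "$fat_mtime" 0; then exact=match; else exact=differ; fi
if times_match "$src_mtime" "$fat_mtime" 1; then windowed=match; else windowed=differ; fi

echo "window 0: $exact"
echo "window 1: $windowed"

# On the real transfer this would correspond to something like:
#   rsync -vrt --modify-window=1 /backup/archive/ /mnt/usbdisk/
```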
Steven