17

Is it possible to specify a time range so that rsync only operates on recently changed files?

I'm writing a script to back up recently added files over SSH, and rsync seems like an efficient solution. My problem is that my source directories contain a huge backlog of older files which I have no interest in backing up.

The only solution I've come across so far is running find with -ctime to generate a --files-from file. This works, but I have to deal with some old installations whose versions of rsync don't support --files-from. I'm considering generating --include-from patterns in the same way, but would love to find something more elegant.
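For reference, a minimal sketch of the two approaches described above, assuming the source directory is local. The function names and the SRC/DEST arguments are hypothetical. The include-pattern variant treats wildcard characters in file names as patterns and also creates empty directories at the destination, so take it as a starting point rather than a finished script.

```shell
#!/bin/sh
# Hypothetical helpers; function names and arguments are illustrative only.

# Print regular files under $1 whose status changed within the last $2 days,
# one path per line, relative to $1 (for example ./sub/file.log).
list_recent() {
    (cd "$1" && find . -type f -ctime "-$2")
}

# For rsync versions that have --files-from: read the list from stdin.
# Arguments: SRC DEST [DAYS]
backup_with_files_from() {
    list_recent "$1" "${3:-1}" | rsync -av --files-from=- "$1"/ "$2"
}

# Turn each relative path into an include pattern anchored at the
# transfer root (./sub/file.log becomes /sub/file.log).
recent_as_patterns() {
    list_recent "$1" "$2" | sed 's|^\.||'
}

# For older rsync: include every directory so rsync can descend into it,
# include the listed files, and exclude everything else.
# Arguments: SRC DEST [DAYS]
backup_with_includes() {
    patterns=$(mktemp) || return 1
    recent_as_patterns "$1" "${3:-1}" > "$patterns"
    rsync -av --include='*/' --include-from="$patterns" --exclude='*' "$1"/ "$2"
    status=$?
    rm -f "$patterns"
    return $status
}
```

A call such as `backup_with_includes /var/logs backup_host:/backups/logs 1` (placeholder host and paths) would then transfer only files changed in the last day.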

Ken
  • After your initial rsync, the next time you call it, it will only transfer the new or modified files. That's the purpose of rsync. – lothar Jun 03 '09 at 15:24
  • Another option would be to reorganize your directory layout so that the files you don't want to back up are in a different path, which you can then put on rsync's ignore list. – lothar Jun 03 '09 at 15:26
  • I was going to suggest use of rsync's `-t` option, but that doesn't exactly do what was asked – Hasturkun Jun 03 '09 at 15:28
  • Thanks lothar - but my problem is that there are a huge number of historical files that I'm not interested in (but can't delete since it may be useful to other people). I'm hoping for a solution that will let me completely ignore the old material. – Ken Jun 03 '09 at 15:30
  • @lothar - and I can't delete or rearrange the historical material – Ken Jun 03 '09 at 15:31
  • @Ken Today disk space is cheap. Once you've done your initial backup (including the historical files), rsync will never touch them again. It may just not be worth the trouble to exclude them. Just my 2c ;-) – lothar Jun 03 '09 at 16:55

4 Answers

26

It looks like you can specify shell commands in the arguments to rsync (see "Remote rsync executes arbitrary shell commands"), so I have been able to limit the files that rsync looks at by using:

rsync -av remote_host:'$(find logs -type f -ctime -1)' local_dir

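Because the argument is single-quoted, the local shell leaves it alone and the $(find ...) is expanded by the shell on remote_host. It finds any files whose status changed within the last 24 hours (-ctime -1) and then rsyncs those into local_dir.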

I'm not sure whether this feature is by design; I'm still digging into the documentation.
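Where rsync on both ends is recent enough to support --files-from, a variant of the same idea can be sketched as follows. It passes the file list through a pipe rather than the command line and keeps the paths relative to the remote directory. The function name, host, and paths are hypothetical, and the sketch assumes no file name contains a newline.

```shell
# pull_recent HOST DIR DEST [DAYS]: hypothetical helper, names are illustrative.
pull_recent() {
    host=$1
    dir=$2
    dest=$3
    days=${4:-1}
    # find runs on the remote host and prints paths relative to DIR.
    # The list travels over the ssh pipe, not as command-line arguments.
    ssh "$host" "cd '$dir' && find . -type f -ctime -$days" |
        # rsync reads the relative paths from stdin and recreates the
        # same subdirectories under DEST.
        rsync -av --files-from=- "$host:$dir/" "$dest"
}

# Usage (placeholder host and paths):
#   pull_recent remote_host logs local_dir 1
```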

Ken
  • Just wanted to drop a line that this just came in super useful for a data import script I'm using. Thank you! – Matthew Jun 18 '12 at 15:22
  • Watch out for this though; if you have a lot of files that match, the results of the embedded find can run into the shell's command-line length limit. Happened to me. – GaryO Apr 28 '15 at 17:27
  • This won't preserve directory structure – carlosvini May 21 '15 at 21:36
2

Why not just take the heat on backing up the whole directory once, and then take advantage of the incremental backups provided by rsync, rdiff, and their cousins? You won't waste disk space at the backup destination, because the old files will remain perpetually unchanged.

Backing up the whole thing is simpler and carries substantially less risk of errors. Trying to selectively back up some files and not others is a recipe for not backing up what you need without realizing it, then getting burned when you can't restore a critical file.

Otherwise you should reorganize your source directory so there is less 'decision making' in your backup script.

whatsisname
  • I would normally agree about the risk of errors, but I'll never have any use for the older files (logs and other records which will never change). I would just take the heat but the thought of having to download and regularly reprocess several gigabytes of unwanted bloat is what prompted this question in the first place. Reorganisation is probably the solution - I can't change the existing structure but I can set up a temporary directory as Hasturkun suggested. – Ken Jun 03 '09 at 16:07
  • For me, I want to transfer from remote to local, process the files, then delete the old ones (mtime +30) from local to save space. Naive rsync re-downloads the old ones next time because they're now missing on local. – GaryO Apr 28 '15 at 17:28
1

How about creating a temporary directory, symlinking or hardlinking the files in, then rsyncing that?
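A minimal sketch of the symlink variant, assuming a POSIX shell and file names without newlines. The function name and the paths in the usage comment are hypothetical. Symbolic links are used here because creating a hard link updates the ctime of the original file, which would interfere with a -ctime based selection.

```shell
# stage_recent SRC STAGE [DAYS]: hypothetical helper, names are illustrative.
# Symlinks files under SRC whose status changed within the last DAYS days
# into STAGE, keeping their relative paths.
stage_recent() {
    src=$(cd "$1" && pwd) || return 1   # absolute path so the links resolve
    stage=$2
    days=${3:-1}
    (cd "$src" && find . -type f -ctime "-$days") |
    while IFS= read -r f; do
        f=${f#./}                        # strip the leading ./ from find output
        mkdir -p "$stage/$(dirname "$f")" &&
        ln -sf "$src/$f" "$stage/$f"
    done
}

# Usage (placeholder host and paths). -L makes rsync copy the files the
# symlinks point to rather than the links themselves:
#   stage_recent /var/logs /var/tmp/recent 1
#   rsync -avL /var/tmp/recent/ backup_host:/backups/logs/
```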

Hasturkun
1

May I suggest you drop rsync and look at rdiff-backup?

I GIVE CRAP ANSWERS
  • Thanks I'll take a look - I looked at it previously but the CIFS compatibility issue put me off. ( http://rdiff-backup.nongnu.org/FAQ.html#cifs ) – Ken Jun 04 '09 at 12:42