I am using rsnapshot to back up my server files to a remote server. To minimize the size of the backups, I want to tar the backup files, but I don't know how to do that with rsnapshot.

Can anyone help me with this? Any help would be appreciated. Thank you.

technoob
  • rsnapshot already reduces disk usage by having rsync hardlink anything unchanged to the previous revision. Also, if anything, `tar` alone will *increase* the size of the files, because it needs to add some metadata that normally exists outside of the files. Bottom line, this question doesn't make a lot of sense; **what are you *really* trying to do,** for which you believe what you are describing in your question is the solution? See [What is the XY problem?](http://meta.stackexchange.com/q/66377/157730). – user Apr 07 '15 at 08:52
  • OK, I will work on asking better questions. So, based on your description, there is no need to tar the backup files? – technoob Apr 07 '15 at 08:58

1 Answer

You can use the cmd_postexec configuration setting to run a custom script after rsnapshot completes. This script can then do anything you want.
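As a rough sketch of what such a script could look like, the Python below packs one snapshot directory into a gzip-compressed tarball. The script name, the function `archive_snapshot`, and the paths in the comments are hypothetical and not part of rsnapshot; the `cmd_postexec` line in the comment assumes the usual tab-separated rsnapshot.conf layout and that arguments are allowed after the script path, so check your rsnapshot man page first.

```python
#!/usr/bin/env python3
"""Hypothetical post-run hook: pack one rsnapshot snapshot into a .tar.gz.

Assumed rsnapshot.conf entry (fields separated by tabs, paths made up):
cmd_postexec	/usr/local/bin/tar_snapshot.py /backup/snapshots/daily.0 /backup/tarballs
"""
import os
import sys
import tarfile


def archive_snapshot(snapshot_dir, dest_dir):
    """Write <dest_dir>/<snapshot name>.tar.gz and return the archive path."""
    name = os.path.basename(os.path.normpath(snapshot_dir))
    os.makedirs(dest_dir, exist_ok=True)
    archive = os.path.join(dest_dir, name + ".tar.gz")
    # "w:gz" creates the tar and compresses it with gzip in one pass.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(snapshot_dir, arcname=name)
    return archive


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: tar_snapshot.py <snapshot directory> <destination directory>
    print(archive_snapshot(sys.argv[1], sys.argv[2]))
```

The destination directory should live outside snapshot_root, otherwise the tarballs themselves end up being rotated along with the snapshots.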

However, tarring the files after the fact will, if anything, increase the disk usage, probably by a lot. By default, rsnapshot passes options to rsync such that rsync hardlinks any unchanged files to the previous revision of the same file, so an unchanged file takes up no additional space in the new snapshot. Compare that with what tar does:

  • tar adds some metadata, which normally exists separately (for example, file path/name, permissions, etc.), leading to a marginal increase in disk space usage compared to just keeping the files raw.
  • tar by itself does not do any compression (although you can run the resultant .tar file through a compressor like gzip, bzip2, xz, etc.).
  • By tarring the tree after rsync finishes, you lose the ability to hardlink between revisions, so each backup tree tar file must contain the full content rather than just the difference compared to the previous revision.
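To make the third point concrete, here is a small self-contained experiment, not anything rsnapshot itself does. It builds two fake snapshot directories (the names `daily.0` and `daily.1` and the helper functions are made up for illustration) that share one unchanged file through a hardlink, then tars each directory separately with Python's `tarfile` module and compares the sizes.

```python
import os
import tarfile
import tempfile


def unique_bytes(root):
    """Sum file sizes under root, counting each inode only once."""
    seen = set()
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key not in seen:  # hardlinked copies share one inode
                seen.add(key)
                total += st.st_size
    return total


def compare(payload_size=1_000_000):
    """Return (bytes used by the hardlinked tree, bytes used by per-snapshot tars)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "snapshots")
        os.makedirs(os.path.join(root, "daily.0"))
        os.makedirs(os.path.join(root, "daily.1"))
        older = os.path.join(root, "daily.1", "data.bin")
        with open(older, "wb") as fh:
            fh.write(os.urandom(payload_size))
        # The file did not change between runs, so the newer snapshot
        # holds a hardlink to it rather than a second copy.
        os.link(older, os.path.join(root, "daily.0", "data.bin"))

        tar_total = 0
        for name in ("daily.0", "daily.1"):
            archive = os.path.join(tmp, name + ".tar")
            with tarfile.open(archive, "w") as tar:  # plain tar, no compression
                tar.add(os.path.join(root, name), arcname=name)
            tar_total += os.path.getsize(archive)
        return unique_bytes(root), tar_total


if __name__ == "__main__":
    hardlinked, tarred = compare()
    print("hardlinked tree:", hardlinked, "bytes")
    print("two tar files:  ", tarred, "bytes")
```

Each archive has to carry its own full copy of `data.bin` plus tar headers and padding, so the two tar files together come out at more than twice the size of the hardlinked tree. The payload here is random data, so compression would not change the picture much; with compressible data the gap would shrink, but each tarball would still contain the unchanged file again.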

The only way around the third point above that I can see would be to tar the entire target directory structure (the one named by snapshot_root in rsnapshot's configuration). You would then need to untar it before rsnapshot starts running rsync (this can be done through a script executed through cmd_preexec if you really want to do it), and re-tar it afterwards. Because the tar file and the unpacked tree exist side by side during each run, this approximately doubles the peak disk space usage, and it significantly increases the time needed for an rsnapshot run, all for no real benefit.

Just use rsnapshot the way it was intended. It's easier, and I don't see it having any downsides that would be solved by religiously running tar on the backups.

If you have a specific issue with how rsnapshot keeps the copies, then focus on that, not on tar (which may or may not be a solution to whatever issue you might be having).

user
  • +1 for this because it answers the question really nicely - however I do have a reason for using this. I am syncing my entire rsnapshot tree onto my google drive account and it takes _forever_. By creating a tarball (a compressed one in this case) I can do it with a single sync instead of pushing the thousands of files that are involved. (Gdrive is limited to only 6 transfers simultaneously! :/ ) – djsmiley2kStaysInside Sep 26 '16 at 13:00
  • May I ask what solution you finally went with? I am doing the same thing right now, just uploading a tarball to gdrive. A more efficient solution would be nice. – ahron Jul 13 '20 at 15:18