
I currently run an OwnCloud server with about 25 accounts and 2.6 TB of data, growing moderately. As the data will be stored for the next several decades, the OwnCloud data sits on a mirrored ZFS file system to preserve data integrity. I use rsnapshot to retain nightly, weekly, and monthly snapshots on an 8 TB drive (ext filesystem), which is periodically swapped with another 8 TB drive kept off-site.
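
For reference, the retention side of my setup looks roughly like the sketch below (paths, counts, and times are illustrative rather than my exact configuration, and rsnapshot wants tab-separated fields in the real file):

# /etc/rsnapshot.conf (illustrative excerpt)
snapshot_root   /mnt/backup8tb/rsnapshot/    # the 8 TB ext-formatted drive
retain  nightly 7
retain  weekly  4
retain  monthly 12

# crontab: larger intervals rotate just before smaller ones
10 23 1 * *     /usr/bin/rsnapshot monthly
20 23 * * 1     /usr/bin/rsnapshot weekly
30 23 * * *     /usr/bin/rsnapshot nightly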

The simplicity of attaching the 8 TB drive to any Linux box is appealing for file or system recovery. This has been working well for 15 months. I have not yet needed to restore from backup, but two failing drives have been swapped out of the ZFS mirror.

Is there a significant advantage to using ZFS snapshots and/or ZFS on the backup drives for improved file integrity? What would be “best practice”, or should my current system suffice for now and for the future?

andrew512
  • I suppose you could use zfs send and recv to speed up your offsite backups. Probably significantly. – Michael Hampton Jan 26 '19 at 03:14
  • I was hoping for a simple solution without having to read all the docs on send/recv and snapshot management. But I see educating myself may be worthwhile to get a better backup strategy and probably a reduced recovery time. – andrew512 Jan 28 '19 at 15:04

2 Answers


ZFS send/recv is "change-aware": only changed blocks are transferred on subsequent backups. Compared to something like rsnapshot, which needs to walk all metadata to discover potentially changed files and then read each modified file to extract its changes, send/recv is clearly much faster. Rather than reinventing the wheel, I suggest you take a look at syncoid to schedule regular, incremental backups.
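
As a rough sketch (pool, dataset, and host names below are placeholders, not taken from your setup):

# initial full replication of the dataset to a backup pool on another host
zfs snapshot tank/owncloud@backup-2019-01-26
zfs send tank/owncloud@backup-2019-01-26 | ssh backuphost zfs recv -u backup/owncloud

# subsequent runs only ship the blocks changed since the previous snapshot
zfs snapshot tank/owncloud@backup-2019-02-02
zfs send -i @backup-2019-01-26 tank/owncloud@backup-2019-02-02 \
    | ssh backuphost zfs recv -u backup/owncloud

# syncoid wraps the same snapshot + incremental send logic in one command
syncoid tank/owncloud root@backuphost:backup/owncloud

Run regularly (e.g. from cron), each transfer only costs you the changed blocks plus a round of snapshot bookkeeping.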

That said, rsnapshot is a wonderful piece of software which I use extensively when send/recv is not applicable (i.e., when the destination runs something other than ZFS and/or I need to attach it to non-ZFS-capable systems).

shodanshok

I am using rsnapshot to keep the hourlies and dailies, and have delegated the monthly (and longer) retention to ZFS snapshots instead. This is accomplished with a cron job that simply has ZFS snapshot the rsnapshot dataset weekly.

I find a significant advantage in this: the weeklies are kept under control by ZFS, the data is readily accessible and covered by ZFS's integrity checking, and I can reduce the number of intervals kept with rsnapshot.

Here is what I have set up.

 # weekly ZFS snapshot; runs 2:35 Friday.  Make sure this runs after
 #  rsnapshot (particularly the hourly, which runs frequently).  The
 #  benefit of using ZFS snapshots here is that you can list the
 #  snapshots to see the space consumed by each weekly, and you can
 #  easily remove the 0B weeklies as clutter.
 35 2 * * 5   /bin/nice -17 /usr/local/sbin/zfs-rsnapshot

 # WEEKLY (rsnapshot); runs 4:03 Mondays
 03 04 * * 1   /bin/nice -17 /bin/rsnapshot weekly
 
 # DAILY; runs 5:03 daily
 03 05 * * *   /bin/nice -17 /bin/rsnapshot daily
 
 # HOURLY; run sync first.  Runs at minute 03 of hours 06, 12, 18, and 00.
 #       (sync has taken up to 45 mins to complete so far, though other
 #        processes may have been running at the time; the hourly copy took
 #        just under 10 minutes, for a total of about 52m runtime.)
 03 06,12,18,00 * * *    /bin/nice -17 /bin/rsnapshot sync && /bin/nice -17 /bin/rsnapshot hourly


The problem I'm running into is that the snapshots seem to be growing (by 3, 4, 5 GB each time) for some reason, and I'm having a hard time figuring out why, assuming I'm reading the output below correctly. (A zfs diff sketch follows the snapshot listing.)

zfs list
NAME                                    USED  AVAIL     REFER  MOUNTPOINT  
...
nas/live/rsnapshot                      279G  1.16T     84.2G  /nas/live/rsnapshot
~$ zfs list -o space
NAME                                   AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
nas                                    1.16T  12.0T        0B   59.6K             0B      12.0T
nas/live                               1.16T   775G     45.9G   72.0K             0B       729G
nas/live/rsnapshot                     1.16T   279G      195G   84.2G             0B         0B

zfs list -t snap | grep rsnap
NAME                                                                                     USED  AVAIL     REFER  MOUNTPOINT
nas/live/rsnapshot@rsnap-weekly-2021-1001                                                449M      -     96.6G  -
nas/live/rsnapshot@rsnap-weekly-2021-1008                                                171K      -     96.6G  -
nas/live/rsnapshot@rsnap-weekly-2021-1009                                                171K      -     96.6G  -
nas/live/rsnapshot@rsnap-weekly-2021-1015                                               3.57G      -     93.0G  -
nas/live/rsnapshot@rsnap-weekly-2021-1022                                               5.01G      -     96.3G  -
nas/live/rsnapshot@rsnap-weekly-2021-1029                                               4.27G      -     96.0G  -
nas/live/rsnapshot@rsnap-weekly-2021-1105                                               4.55G      -     96.8G  -
nas/live/rsnapshot@rsnap-weekly-2021-1111                                                590M      -     97.5G  -
nas/live/rsnapshot@rsnap-weekly-2021-1112                                                712M      -     97.6G  -
nas/live/rsnapshot@rsnap-weekly-2022-0401                                               3.95G      -     95.6G  -
nas/live/rsnapshot@rsnap-weekly-2022-0408                                               2.92G      -     95.6G  -
nas/live/rsnapshot@rsnap-weekly-2022-0415                                               5.02G      -     95.8G  -
nas/live/rsnapshot@rsnap-weekly-2022-0422                                               4.26G      -     95.9G  -
nas/live/rsnapshot@rsnap-weekly-2022-0429                                               2.29G      -     96.1G  -
nas/live/rsnapshot@rsnap-weekly-2022-0506                                               2.26G      -     96.5G  -
nas/live/rsnapshot@rsnap-weekly-2022-0513                                               2.23G      -     96.3G  -
nas/live/rsnapshot@rsnap-weekly-2022-0520                                               3.09G      -     96.1G  -
nas/live/rsnapshot@rsnap-weekly-2022-0527                                               4.67G      -      103G  -
nas/live/rsnapshot@rsnap-weekly-2022-0603                                               4.45G      -      102G  -
nas/live/rsnapshot@rsnap-weekly-2022-0610                                               4.26G      -      116G  -
nas/live/rsnapshot@rsnap-weekly-2022-0617                                               3.94G      -      118G  -
nas/live/rsnapshot@rsnap-weekly-2022-0624                                               4.40G      -     84.4G  -
nas/live/rsnapshot@rsnap-weekly-2022-0701                                               3.08G      -     84.4G  -
nas/live/rsnapshot@rsnap-weekly-2022-0722                                               2.16G      -     84.2G  -
nas/live/rsnapshot@rsnap-weekly-2022-0729                                               2.97G      -     85.0G  -
nas/live/rsnapshot@rsnap-weekly-2022-0805                                               2.71G      -     85.3G  -
nas/live/rsnapshot@rsnap-weekly-2022-0812                                               2.13G      -     84.4G  -
nas/live/rsnapshot@rsnap-weekly-2022-0819                                               2.76G      -     84.4G  -
nas/live/rsnapshot@rsnap-weekly-2022-0826                                               2.16G      -     83.9G  -
nas/live/rsnapshot@rsnap-weekly-2022-0902                                                790M      -     84.6G  -
nas/live/rsnapshot@2022-1105_before_move_live-to-condor                                  798M      -     83.8G  -
nas/live/rsnapshot@rsnap-weekly-2022-1111                                               3.71G      -     86.1G  -
nas/live/rsnapshot@rsnap-weekly-2022-1118                                               3.72G      -     84.6G  -
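
One thing that should show what is actually changing between weeklies (dataset and snapshot names taken from the listing above) is zfs diff, for example:

# files created (+), removed (-), modified (M), or renamed (R)
# between two adjacent weekly snapshots
zfs diff nas/live/rsnapshot@rsnap-weekly-2022-1111 \
         nas/live/rsnapshot@rsnap-weekly-2022-1118 | less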

zfs-rsnapshot

#!/bin/bash
###
## Run from cron to snapshot the last week of rsnapshot files.
#  This is done weekly to limit the number of snapshots, which
#  means rsnapshot has to keep at least 7 dailies plus the
#  hourlies.  If the cron job doesn't run one week, you miss one
#  day of data for every day it is late: e.g. if cron stalls for
#  a week, that's 7 dailies rotated away by rsnapshot in the
#  meantime.  To protect this data, keep extra days in rsnapshot
#  to cover anticipated cron lag.
##

DATE=$(date +%Y-%m%d)

#going weekly, so adding that to the name
SNAPNAME="rsnap-weekly-${DATE}"

#dataset name to snapshot
DATASET="nas/live/rsnapshot"

#run it
/sbin/zfs snapshot -r "${DATASET}@${SNAPNAME}"
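
The cron comment above mentions removing the 0B weeklies; a rough sketch of how that could be scripted (same dataset name as above, with the destroy left commented out as a safety):

#!/bin/bash
# List weekly snapshots of the rsnapshot dataset with exact byte counts
# and print the ones that reference no unique data (USED == 0).
DATASET="nas/live/rsnapshot"

/sbin/zfs list -H -p -t snapshot -o name,used -r "$DATASET" \
  | awk -F'\t' '/@rsnap-weekly-/ && $2 == 0 { print $1 }' \
  | while read -r SNAP; do
        echo "would destroy empty snapshot: $SNAP"
        # /sbin/zfs destroy "$SNAP"   # uncomment once happy with the list
    done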

rsnapshot.conf

# SNAPSHOT ROOT DIRECTORY #
snapshot_root   /nas/live/rsnapshot/
no_create_root  1
cmd_cp          /bin/cp
cmd_rm          /bin/rm
cmd_rsync       /bin/rsync
#cmd_ssh        /bin/ssh
cmd_logger      /bin/logger
cmd_du          /bin/du
cmd_rsnapshot_diff      /bin/rsnapshot-diff
#cmd_preexec    /path/to/preexec/script
#cmd_postexec   /path/to/postexec/script


#                       BACKUP LEVELS / INTERVALS                                       
#NOTE: these aren't used automatically, only for manual runs
#retain hourly  6
#retain daily   7
#retain weekly  7
#retain monthly 4

# Incrementing only 4 times daily, be sure to sync first.
interval        hourly  4
#don't need a 7th daily, since it's in the weekly.
interval        daily   6
#######
# NEW #
#######
# MAKING CHANGES HERE FOR THE ZFS RSNAPSHOT script, which moves the monthly, quarterly, and annual
#               copies over to ZFS to omit all the extra redundant rsnapshot copies (they were
#               hardlinked anyway, weren't they; oh well, maybe this will end up being cleaner)
# Only two weeks are needed in total, so that's 1 weekly which, combined with the dailies, covers
#  two weeks of cron-job redundancy, giving cron a week of error stalling before the first
#  rsnapshot daily entries start being lost (one per day gets eaten by rsnapshot if the cron
#  job doesn't get it to ZFS).  So only 1 is needed: the redundant week of extra data.
interval        weekly  1
# NO MORE ENTRIES ARE NEEDED FOR THE ZFS RSNAPSHOT SCRIPT
#
#######
# OLD #
#######
## don't need a 4th week, since it's in the monthly.
#interval               weekly  3
## only 5 (to cover 6 months; the 1st quarterly contains the 6th month), then move to quarterly
#interval               monthly 5
## only 3, since the yearly contains the 4th
#interval               quarterly               3
## the monthly and quarterly cover the first year; add 1 additional year.
#interval               yearly  1

verbose         2
# Log level, same as verbose, but these get written to the logs
loglevel        3
logfile         /var/log/rsnapshot
lockfile        /var/run/rsnapshot.pid
#stop_on_stale_lockfile         0
#rsync_short_args       -a
#rsync_long_args        --delete --numeric-ids --relative --delete-excluded
#ssh_args       -p 22
#du_args        -csh
#one_fs         0
#include        ???
#exclude        ???
#include_file   /path/to/include/file
#exclude_file   /path/to/exclude/file
#link_dest      0

# using sync_first (1), so as a general rule run the sync immediately before the hourly.
sync_first      1

#use_lazy_deletes       0
#rsync_numtries 0
backup_exec     /bin/date "+ backup ended at %c"
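
After editing rsnapshot.conf it's worth validating it before the next cron run; rsnapshot has a built-in syntax check and a dry-run mode:

# check the config for syntax errors (tabs vs. spaces, bad paths, etc.)
rsnapshot configtest

# print the commands a run would execute without actually running them
rsnapshot -t hourly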

I use zfs send to move these offsite as well.

( set -o pipefail && zfs send -Ri @rsnap-weekly-2022-0902 nas/live/rsnapshot@rsnap-weekly-2022-1118 | pv | ssh <ip-of-upstream> zfs recv -Fvs int/live/rsnapshot )
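
Since the receive side uses -s, an interrupted transfer can be resumed from a token rather than restarted (host and dataset names as above; the token value is a placeholder):

# on the receiving side, grab the resume token left by the interrupted recv
ssh <ip-of-upstream> zfs get -H -o value receive_resume_token int/live/rsnapshot

# feed that token to zfs send on the sending side to pick up where it stopped
zfs send -t <receive_resume_token> | pv | ssh <ip-of-upstream> zfs recv -vs int/live/rsnapshot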

I'm open to areas of improvement, and to knowing why the backups are consuming more space each time. Could this actually be changing logs and other files on the Linux system that I'm rsnapshotting, or could I be missing something? I'm not sure. Comments encouraged.

Brian Thomas