
I am trying to implement a custom backup function using rsync. For this, I modified the existing code from https://linuxconfig.org/how-to-create-incremental-backups-using-rsync-on-linux as follows:

#!/bin/zsh
#https://linuxconfig.org/how-to-create-incremental-backups-using-rsync-on-linux

# A script to perform incremental backups using rsync

set -o errexit
set -o nounset
set -o pipefail

incremental_bckp_rsync_dir(){
  readonly SOURCE_DIR="/$1"
  readonly BACKUP_DIR="/$2/$1"
  readonly DATETIME="$(date '+%Y-%m-%d_%H:%M:%S')"
  readonly BACKUP_PATH="${BACKUP_DIR}/${DATETIME}"
  readonly LATEST_LINK="${BACKUP_DIR}/latest"
  
  mkdir -p "${BACKUP_DIR}"
  
  rsync -av --delete \
    "${SOURCE_DIR}/" \
    --link-dest "${LATEST_LINK}" \
    --exclude=".cache" \
    "${BACKUP_PATH}"
  
  rm -rf "${LATEST_LINK}"
  ln -s "${BACKUP_PATH}" "${LATEST_LINK}"
}


incremental_bckp_rsync_dir path/to/dir path/to/backup

It does successfully back up the directory. However, the size of the backup dir (obtained with the command du -h path/to/backup) seems to double every time I run the script, which, from what I understand, means that the backup is not incremental. Is there a way to fix it?

ecjb
  • @jhnc. the size of the backup dir is measured with `du -h path/to/backup` – ecjb Apr 10 '23 at 14:55
  • does your filesystem support hard links? – jhnc Apr 10 '23 at 17:46
  • @jhnc. I think but I am not sure. How can I know if it does? – ecjb Apr 11 '23 at 04:04
  • eg. `find "$LATEST_LINK/" -type f -links +1 -ls | head` – jhnc Apr 11 '23 at 12:31
  • Is this MacOS? Does its `du` default to using `-l` maybe? – jhnc Apr 11 '23 at 12:34
  • `cd "$BACKUP_PATH"; mkdir test; cd test; dd if=/dev/zero bs=10M count=1 > f1; du -h .; ln f1 f2; du -h .; cp f2 f3; du -h .; ls -l` – jhnc Apr 11 '23 at 12:40
  • so @jhnc for `find "$LATEST_LINK/" -type f -links +1 -ls | head` I get `9288501280 8 -rw-r--r-- 7 User01 staff 7 Apr 11 18:34 path/to/dir/latest//backedupfile.txt` – ecjb Apr 11 '23 at 16:40
  • `"$BACKUP_PATH"; mkdir test; cd test; dd if=/dev/zero bs=10M count=1 > f1; du -h .; ln f1 f2; du -h .; cp f2 f3; du -h .; ls -l` gives me `dd: bs: illegal numeric value` – ecjb Apr 11 '23 at 16:42
  • Well, you definitely have hard-links (`... 8 rw-r--r-- ...`). The `dd` command is just trying to create a large file, then hard-linking it (and checking if `du` changes - it shouldn't), and then copying (and checking if `du` changes - it should). `dd if=/dev/zero bs=200b count=100` might work, or just copy in a biggish file from somewhere. – jhnc Apr 11 '23 at 17:17
  • I'm suspicious that it is `du` that is reporting unexpected values, rather than that `rsync` is not correctly hard-linking. You could try `df` instead, and see if it also reports unexpected increase. – jhnc Apr 11 '23 at 17:19
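
The check suggested in the comments above can be sketched as a small script with plain numeric `dd` arguments (which sidesteps the `bs=10M` error). This is only a minimal sketch: the scratch directory and the names `f1`, `f2`, `f3` are arbitrary, and `/dev/urandom` is used instead of `/dev/zero` on the assumption that random data avoids surprises on filesystems that compress or skip zero blocks. Run it on the filesystem that holds the backups if that is what you want to test (for example by pointing `TMPDIR` there).

```shell
#!/bin/sh
# Sketch: does `du` count a hard-linked file once and a real copy twice?
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# 1 MiB of random data.
dd if=/dev/urandom of=f1 bs=1024 count=1024 2>/dev/null
du_base=$(du -sk . | awk '{print $1}')

# Hard link: a second name for the same inode, no new data blocks.
ln f1 f2
du_linked=$(du -sk . | awk '{print $1}')

# Real copy: a new inode with its own data blocks.
cp f1 f3
du_copied=$(du -sk . | awk '{print $1}')

inode_f1=$(ls -i f1 | awk '{print $1}')
inode_f2=$(ls -i f2 | awk '{print $1}')
inode_f3=$(ls -i f3 | awk '{print $1}')

echo "du (KiB): base=$du_base linked=$du_linked copied=$du_copied"
echo "inodes:   f1=$inode_f1 f2=$inode_f2 f3=$inode_f3"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

If hard links work as expected, `f1` and `f2` share an inode number and the `linked` figure stays close to `base`, while `copied` is roughly twice as large. Keep in mind that `du` counts a hard-linked file once per invocation, so running it separately on each snapshot directory reports the full size every time.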

2 Answers


There is a tool that uses rsync and organizes the backup folder into a catalog for easy future use: Butterfly Backup. I found this article (and also its documentation) that explains how it works: https://fedoramagazine.org/butterfly-backup/

Docs: https://github.com/MatteoGuadrini/Butterfly-Backup

Simple usage:

bb backup --computer pc1 --destination /nas/mybackup --data User Config --type MacOS --mode Full

And the catalog looks like this:

bb list --catalog /nas/mybackup
...
BUTTERFLY BACKUP CATALOG

Backup id: f65e5afe-9734-11e8-b0bb-005056a664e0
Hostname or ip: pc1
Timestamp: 2018-08-03 17:50:36

Backup id: 4f2b5f6e-9939-11e8-9ab6-005056a664e0
Hostname or ip: pc1
Timestamp: 2018-08-06 07:26:46

Backup id: cc6e2744-9944-11e8-b82a-005056a664e0
Hostname or ip: pc1
Timestamp: 2018-08-06 08:49:00

I've never tried this but you could use the -n | --dry-run option to have rsync just list all files it would transfer (so all changed, deleted and new files), e.g.:

rsync -an --delete --out-format="%f" src_path/ dst_path

and store its output in a file, then start another rsync with the --files-from option, feeding it this list so that only these files are transferred to a different destination (with a timestamp in its path, etc.).
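
A rough, untested-in-production sketch of that two-step idea follows. The function name `changed_only_backup` and its three arguments are made up for illustration. It deviates from the command above in two ways, both of which are my assumptions about what works best here: it uses `%n` (the path relative to the source) because `--files-from` expects paths relative to the source directory, and it leaves out `--delete` in the dry run because a file that no longer exists in the source cannot be copied anyway.

```shell
#!/bin/sh
# Hypothetical helper: copy only new/changed files from SRC into DEST,
# using MIRROR as the reference for what counts as "unchanged".
changed_only_backup() {
  src=$1
  mirror=$2
  dest=$3
  list=$(mktemp) || return 1

  # Dry run (-n): nothing is written. %n prints each path that would be
  # transferred, relative to the source directory.
  if ! rsync -an --out-format='%n' "$src/" "$mirror/" > "$list"; then
    rm -f "$list"
    return 1
  fi

  # Real run: transfer only the listed paths into the snapshot directory.
  mkdir -p "$dest"
  rsync -a --files-from="$list" "$src/" "$dest/"
  status=$?

  rm -f "$list"
  return "$status"
}
```

It could be called as `changed_only_backup "$SRC" "$MIRROR" "$DEST/$(date '+%Y-%m-%d_%H%M%S')"`, where the three variables are placeholders. File names containing newlines would break the list, so treat this as a starting point only.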

Another approach (the one I mostly use) is to have rsync create a backup (mirror) and store a copy of modified/deleted files in a separate path with the -b | --backup option. E.g.

TIMESTAMP=$(date '+%Y-%m-%d_%H%M%S')
rsync -av --delete --backup --backup-dir="$DEST/DELETED/$TIMESTAMP" "$SRC/" "$DEST"

This keeps your backup in sync but also creates an extra backup of the old files in a separate folder under ./DELETED/ on your destination at each run. Usually you want to keep only the backups from the last X days, so you have to create a cleanup job that purges older folders under $DEST/DELETED/.
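
Such a cleanup job could look roughly like the sketch below. The function name `purge_old_backups` is hypothetical, and it assumes that the modification time of each timestamped folder is a good enough proxy for its age. `-mindepth` and `-maxdepth` are not in POSIX but are available in both GNU and BSD `find`.

```shell
#!/bin/sh
# Hypothetical helper: remove the directories directly under "$1" whose
# modification time is more than "$2" days in the past.
purge_old_backups() {
  deleted_root=$1
  keep_days=$2

  # Nothing to do if the folder does not exist yet.
  [ -d "$deleted_root" ] || return 0

  # Only look at the first level (the timestamped folders) and remove
  # each matching folder together with its contents.
  find "$deleted_root" -mindepth 1 -maxdepth 1 -type d \
    -mtime +"$keep_days" -exec rm -rf {} +
}
```

For example, `purge_old_backups "$DEST/DELETED" 30` would keep about a month of old versions. It is worth replacing `-exec rm -rf {} +` with `-print` on the first run to see what would be removed.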

Peter Kuilboer