3

I'm looking for a way to implement something close to the following backup scheme:

  1. Initially, a full image is copied to the backup target.
  2. Periodically (e.g. nightly), only blocks that have changed since the last backup are copied to the backup target.
  3. Ideally, it should be possible to mount snapshots from any point in time, or delete (flatten) some snapshots selectively.

Can this be implemented using LVM (or some other way)? It requires tracking which blocks have become dirty since the last backup, which I'm not sure LVM can do. I'd also rather avoid the permanent performance cost of running on an LVM snapshot at all times.
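To make the scheme above concrete, here is a minimal Python sketch of steps 1 and 2, assuming the image fits in memory and never shrinks. The names `block_hashes`, `changed_blocks`, `apply_delta` and the 4096-byte `BLOCK_SIZE` are hypothetical, chosen only for illustration. Note that it finds changed blocks by hashing the whole image, so it still reads every block; it is not the kernel-side dirty tracking asked about above.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, for illustration only


def block_hashes(image: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return one SHA-256 digest per fixed-size block of the image."""
    return [
        hashlib.sha256(image[off:off + block_size]).hexdigest()
        for off in range(0, len(image), block_size)
    ]


def changed_blocks(old_hashes: list, image: bytes,
                   block_size: int = BLOCK_SIZE) -> dict:
    """Map block index -> new data for every block whose digest differs."""
    delta = {}
    for index, digest in enumerate(block_hashes(image, block_size)):
        if index >= len(old_hashes) or old_hashes[index] != digest:
            off = index * block_size
            delta[index] = image[off:off + block_size]
    return delta


def apply_delta(base: bytes, delta: dict,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the newer image from an older image plus one incremental."""
    out = bytearray(base)
    for index, data in delta.items():
        off = index * block_size
        out[off:off + len(data)] = data
    return bytes(out)
```

A nightly run would keep the list from `block_hashes` next to the backup, call `changed_blocks` against the current image, and store only the returned blocks. Applying the stored deltas in order with `apply_delta` reproduces the image at any point in time.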

Vladimir Panteleev
  • Is there a reason why you're doing this using LVM snapshots? To me this sounds like something that might be more suitable for taking backups from files instead. –  Apr 01 '15 at 09:16
  • This is a large volume with a lot of files, so scanning the entire filesystem for changed files is going to take a lot of time. Plus, I don't know of any file-based backup solutions that allow mounting snapshots like LVM does. – Vladimir Panteleev Apr 01 '15 at 16:07
  • What OS - distribution/version are you looking to protect? –  Apr 02 '15 at 12:52
  • Recent Linux kernel. Userspace shouldn't matter. – Vladimir Panteleev Apr 02 '15 at 20:34
  • Bear in mind that using LVM snapshots can reduce performance. – neutrinus Jul 15 '15 at 20:26

3 Answers

5

A newcomer to the scene is Attic (https://attic-backup.org/).

We used rdiff-backup for a few years as our primary backup method. It was great for what it did, but it created tens or hundreds of thousands of small diff files over the course of a year. Most file systems and disks are going to struggle with a million-plus file count. Backing up our 90GB Maildir-based IMAP store would take a few hours, and I had to constantly lower the number of weeks/diffs that we kept for history.

In comparison, once we switched to Attic, nightly backups ran in only 15-20 minutes. That means it's much more viable to keep a year's worth of incremental backups to let you go back to any day within the past year.

Main features that drew me to Attic:

  • It doesn't create thousands of files on the destination server
  • Deduplication using variable block sizes
  • Has built-in compression
  • Effective at backing up virtual machine image files
  • Efficient over WAN connections
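To illustrate what "deduplication using variable block sizes" means, here is a toy Python sketch of content-defined chunking. It is not Attic's actual implementation; the rolling sum, the constants `WINDOW`, `MASK`, `MIN_CHUNK`, and the functions `chunks`, `store`, `restore` are all assumptions made for the example. Chunk boundaries are picked from the data itself, so inserting a byte near the start of a file shifts only nearby boundaries and later chunks still deduplicate.

```python
import hashlib

WINDOW = 16          # bytes covered by the rolling sum (hypothetical value)
MASK = (1 << 6) - 1  # cut where the low 6 bits of the sum are zero
MIN_CHUNK = 32       # never emit chunks shorter than this, except the last


def chunks(data: bytes):
    """Yield variable-sized chunks whose boundaries depend on content."""
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte                  # byte entering the window
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # byte leaving the window
        if i + 1 - start >= MIN_CHUNK and (rolling & MASK) == 0:
            yield data[start:i + 1]
            start = i + 1
    if start < len(data):
        yield data[start:]               # trailing partial chunk


def store(data: bytes, repo: dict) -> list:
    """Add data to repo (digest -> chunk); return the digests in order."""
    recipe = []
    for chunk in chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        repo.setdefault(digest, chunk)   # identical chunks are stored once
        recipe.append(digest)
    return recipe


def restore(recipe: list, repo: dict) -> bytes:
    """Reassemble the original data from its list of digests."""
    return b"".join(repo[digest] for digest in recipe)
```

Storing the same data twice adds no new chunks to `repo`, and storing a slightly modified copy adds only the chunks around the modification. A real tool would use a stronger rolling hash and much larger chunks.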

After using it for 6-9 months, I'm fairly confident that it's as stable as rdiff-backup. I still make multi-generation copies of the Attic directories on removable media, but each removable medium holds a full copy of the Attic repository.

tgharold
2

Rsync / Rsnapshot are way better tools for this kind of work, especially considering that they give you a "live" snapshot directory where any inconsistency is limited to a few files at most and can't bring down the entire backup. Moreover, by using hard links, you can have incremental backups without the inconveniences usually associated with them.

I used this solution in a production system with millions of files and tens of snapshots, with great satisfaction.
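The hard-link idea can be sketched in a few lines of Python. This is only an illustration of the technique that `rsync --link-dest` and rsnapshot rely on, not their code; the function `snapshot` and its arguments are hypothetical. Unlike rsync, which by default compares size and modification time, this sketch compares file contents.

```python
import filecmp
import os
import shutil
from typing import Optional


def snapshot(source: str, previous: Optional[str], target: str) -> None:
    """Create `target` as a complete-looking copy of `source`.

    Files identical to the copy in `previous` are hard-linked instead of
    copied, so unchanged files take no extra space on the backup disk.
    """
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.normpath(os.path.join(target, rel, name))
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)        # unchanged: share the same inode
            else:
                shutil.copy2(src, dst)   # new or changed: store a fresh copy
```

Each snapshot directory can be browsed or deleted on its own, because a file's data is freed only when the last hard link to it is removed. The previous snapshot and the target must be on the same file system for `os.link` to work.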

shodanshok
  • Uhh, how long did a backup run take? A filesystem scan is going to be WAY WAY less optimal than a block-level diff. I can't imagine the impact on the inode cache. Plus hard links mean that you can't do block-level dedup, e.g. with VM images. I'm looking into BTRFS snapshot+send at the moment, I think it might be able to do this nicely. – Vladimir Panteleev Apr 02 '15 at 20:34
  • The inode scan time is generally under 15 minutes. For backing up "binary blob" files (such as VM images), you are right: on a classical file system, you will end up with no deduplication. But if you back up the files inside your VMs, this approach works very well. Another possibility is to use a CoW file system (btrfs, zfs) + clone() call + rsync --inplace, which together rewrite only the changed blocks. – shodanshok Apr 02 '15 at 21:20
2

An alternative to LVM snapshots is the Datto block driver (a.k.a. dattobd).

From the dattobd GitHub page:

The Datto Block Driver (dattobd) solves the above problems and brings functionality similar to VSS on Windows to a broad range of Linux kernels. Dattobd is an open source Linux kernel module for point-in-time live snapshotting. Dattobd can be loaded onto a running Linux machine (without a reboot) and creates a COW file on the original volume representing any block device at the instant the snapshot is taken. After the first snapshot, the driver tracks incremental changes to the block device and therefore can be used to efficiently update existing backups by copying only the blocks that have changed. Dattobd is a true live-snapshotting system that will leave your root volume running and available, without requiring a reboot.

I tried it and it works as expected on an ext4 file system. There is also a working example (with scripts) in the wiki.

Finally, note that UrBackup has built-in support for snapshot backups on Linux using either LVM or dattobd.

fuujuhi