
First off, I know this topic easily gets subjective, but I'm trying to avoid that: there should be at least one good answer to this in a sea of bad answers, and it's hard to find.

At first sight my question seems simple: how do you store virtual machine disks on hard disks while making sure data integrity isn't compromised and performance isn't horrible?

But it's actually harder than it appears:

  • ZFS and BTRFS are not an option: copy-on-write filesystems are notoriously poor at handling large files, especially when those files contain another copy-on-write filesystem themselves! You CAN turn off COW on BTRFS (see the sketch after this list), but that also turns off checksumming (and compression, deduplication, etc.).
  • EXT2/3/4, XFS, ReiserFS, NTFS, etc. do not do full data checksumming like BTRFS/ZFS do, so they are not an option either.
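For reference, this is roughly what turning off COW on BTRFS looks like in practice; the directory path is just an example, and the attribute only affects files created after it is set:

```
# Disable COW for new files in a VM image directory (path is hypothetical).
# Note: NOCOW files get no checksumming, compression, or deduplication.
mkdir -p /var/lib/libvirt/images
chattr +C /var/lib/libvirt/images
lsattr -d /var/lib/libvirt/images   # should show the 'C' attribute
```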

So is it checkmate then? You can't have full data integrity beyond simplistic RAID setups, which come with their own issues, such as the write hole (RAID5) and generally poor handling of corrupt files where it's unclear which of two copies is the correct one. Those issues are avoided by higher-level systems that checksum and verify the integrity of files before returning them to the operating system or user.

The only option I can think of is using BTRFS/ZFS inside the VMs rather than on the host, and scheduling snapshots and backups appropriately on each machine, even though that's a lot more cumbersome than doing it on the host.
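If you go that route, scheduling snapshots inside each guest can be as simple as a cron entry; the pool and dataset names here are hypothetical:

```
# /etc/crontab inside each guest: hourly ZFS snapshot of a hypothetical dataset.
# The % signs must be escaped, since cron treats a bare % as a newline.
0 * * * *  root  /sbin/zfs snapshot tank/data@auto-$(date +\%Y\%m\%d-\%H\%M)
```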

Does anyone know any other way to achieve my goal?

Alex
  • *ZFS and BTRFS are no option: Copy On Write filesystems are notoriously poor at handling large files* Oh? Why do you think that? – Andrew Henle Feb 06 '17 at 11:45

1 Answer


Let's start from a simple observation: greater data resiliency and integrity features generally come with a performance penalty. From there, we can make some further points:

  • ZFS has much better performance than BTRFS when used as a VM-backing filesystem, at least on RHEL/CentOS hosts. While it's true that on purely mechanical HDDs it remains slower than more traditional filesystems, using even a relatively small SSD as a SLOG device will noticeably increase its performance (see the first sketch after this list). In other words, VMs on ZFS are a perfectly reasonable use case;

  • even when using traditional filesystems without full data checksumming, such as XFS and EXT4, the odds of data corruption on a healthy system are very small. BER/UBER/URE ratings are often cited out of context and without taking regular scrubs into account (see the second sketch below);

  • hardware RAID5/6 cards with a power-loss-protected cache are immune to the write hole. Moreover, RAID6 parity can also act as a form of data checksumming (note: this depends on the specific controller/implementation). So a RAID6 array with an appropriately sized write-back cache is a reasonable solution (see the third sketch below);

  • finally, as you suggested, you can use ZFS inside the VMs. For such a setup I would export raw LVM volumes to the guests, formatting the data container as ZFS. However, I would take snapshots of the LVM volumes themselves, rather than from inside the individual guest VMs. For better performance I would use RAID10 on the host as the base for LVM (see the last sketch below).
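Regarding the first point, a minimal sketch of adding an SSD SLOG to a pool; all pool and device names are placeholders:

```
# Create a mirrored pool on two HDDs (device names are examples).
zpool create tank mirror /dev/sdb /dev/sdc
# Attach a small SSD partition as a separate intent log (SLOG)
# to speed up synchronous writes, such as those issued by VMs.
zpool add tank log /dev/nvme0n1p1
zpool status tank   # verify the log vdev is present
```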
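On the second point, "regular scrubs" on Linux software RAID means periodically triggering a check pass over the array; Debian/Ubuntu already ship a monthly checkarray cron job that does exactly this:

```
# Manually start a scrub of a hypothetical md array and watch its progress.
echo check > /sys/block/md0/md/sync_action
cat /proc/mdstat
# Mismatches found during the check are counted here:
cat /sys/block/md0/md/mismatch_cnt
```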
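For the third point, the exact commands are vendor-specific; as one example, on a Broadcom/LSI controller managed with storcli, the write-back cache policy might be set roughly like this (the controller and volume IDs are assumptions):

```
# Enable write-back caching on virtual drive 0 of controller 0;
# the controller will honor this only with a healthy BBU/flash module.
storcli /c0/v0 set wrcache=wb
storcli /c0/v0 show all   # verify the resulting cache policy
```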
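And for the last point, a sketch of the host-side layout; all names (md device, volume group, LV sizes) are made up for illustration:

```
# RAID10 from four disks as the LVM base (device names are examples).
mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[b-e]
pvcreate /dev/md0
vgcreate vg0 /dev/md0
# One raw LV per guest, exported as its virtual disk and formatted as ZFS inside.
lvcreate -L 100G -n vm1-disk vg0
# Host-side snapshot of the guest's volume (10G of COW space for changes).
lvcreate -s -L 10G -n vm1-disk-snap /dev/vg0/vm1-disk
```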

shodanshok