
I run a KVM-based virtualization server (namely, Proxmox) on which several Debian-based machines run as KVM VMs. Proxmox can create backups of VMs, and it also compresses the VM disk images.

As I understand it, backup sizes grow over time: more data is stored on each VM disk, and more 'clean' blocks of the VM disk become 'dirty' (that is, they contain remnants of old files). So even if I delete all files on such a virtual disk with rm -rf, the backup will stay the same size, because deleting files does not clear the blocks of the VM disk.

I can 'clear' the VM disk by running something like dd if=/dev/zero of=/BIG.txt and then rm -f /BIG.txt. This creates a big file full of zeros that uses up all of the disk space; after I delete it, its former blocks contain zeros. The downside is that for a moment the disk is full, which affects every program that wants to write anything.

But maybe there is some other way to fill unused disk blocks with zeros, so that the backup compresses such a disk at a better rate? Some Windows-based programs offer options to 'clear unused disk space' (e.g. CCleaner), but I need that for Linux.

Alexander
  • `dd` has `bs` and `count` parameters. As a quick workaround you may fill not the whole disk but its larger part. – HUB Jun 30 '16 at 11:42
  • I do know that, but that is a workaround, not a solution. I would like to add the cleaner to cron, and I am not sure the dd trick is suitable for such unattended use. – Alexander Jun 30 '16 at 11:46

1 Answer


Recent libvirt/KVM versions support the discard vdisk option (for the SCSI vdisk type only). With this option enabled, you can issue fstrim / on the guest filesystem, and unused blocks will be immediately discarded from the VM image on the host, compacting/reducing it via hole punching.

See here (driver section, search for 'discard') and here for more information.
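Before relying on fstrim, it can help to confirm inside the guest that the virtual disk actually advertises discard support. The following is a minimal sketch, assuming the disk appears as a device such as sda and that the kernel exposes `queue/discard_max_bytes` in sysfs (a value of 0 means discards are not supported). The function name `supports_discard` and the `SYSFS_ROOT` override are hypothetical and exist only for illustration and testing.

```shell
#!/bin/sh
# Return success if block device $1 (e.g. "sda") advertises discard support.
# SYSFS_ROOT is a hypothetical override; by default the real /sys is used.
supports_discard() {
    f="${SYSFS_ROOT:-/sys}/block/$1/queue/discard_max_bytes"
    # Missing file or a value of 0 both mean "no discard".
    [ -r "$f" ] && [ "$(cat "$f")" -gt 0 ]
}

# Example (inside the guest, as root):
#   supports_discard sda && fstrim -v /
```

If the check fails, the disk is probably not attached with discard enabled on the host side, and fstrim will likely report that the operation is not supported.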

If you can't use the trim/discard method, you can continue using your current zeroing method (dd from /dev/zero), with a twist: run two dd passes, each writing only a little more than half of the free disk space, separated by a sync; rm /BIG.txt command. This should be enough to recover almost all of the free space without filling the disk completely at any point.
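The two-pass idea could be scripted roughly as follows. This is only a sketch under stated assumptions: the 55% fraction stands in for "a little more than half", the function names are hypothetical, and whether the second pass really lands on different blocks depends on the filesystem's allocator. It relies on GNU dd accepting `bs=1M`.

```shell
#!/bin/sh
# Sketch: zero free space in two passes so the disk is never completely full.

# Print the free space of the filesystem holding directory $1, in MiB.
free_mib() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Write $2 MiB of zeros into $1/BIG.txt, flush to disk, then delete the file.
zero_pass() {
    dd if=/dev/zero of="$1/BIG.txt" bs=1M count="$2" 2>/dev/null
    sync
    rm -f "$1/BIG.txt"
}

# Two passes, each sized at 55% of the space free at that moment (assumed value).
zero_free_space() {
    dir=$1
    for pass in 1 2; do
        mib=$(( $(free_mib "$dir") * 55 / 100 ))
        zero_pass "$dir" "$mib"
    done
}
```

Calling `zero_free_space /` as root would run both passes on the root filesystem. Each pass leaves roughly 45% of the free space available to other programs while it runs.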

shodanshok
  • Just in case: Proxmox offers two SCSI controller types, `virtio-scsi` and `virtio-scsi (single)`. Do you know what the difference between them is? I also wonder whether I need the raw disk format, or whether I can keep using qcow2 (as I do now) with that `trim` trick. – Alexander Jun 30 '16 at 13:28
  • I don't know what the difference between the two SCSI controller types is, sorry. I tried the `discard` magic with a RAW disk file only, so I don't know whether it works with QCOW2 files. You will have to try it yourself ;) – shodanshok Jun 30 '16 at 13:45
  • Exactly what I am doing now ) – Alexander Jun 30 '16 at 14:16
  • Yes, it works! Once I set my virtual disks to `discard=on`, the disk type to `scsi`, and the controller to `virtio-scsi`, trim started to do its magic. The disks are qcow2. THANK YOU! – Alexander Jun 30 '16 at 14:32
  • Just in case: I have enabled `discard` for the virtual disk itself, but do I really need to add `discard` as a mount option for the partition(s) on that virtual disk inside the VM? My idea is: the OS in the VM will then "inform" the underlying virtual disk which blocks are discarded, and the virtual disk will in turn discard blocks on the underlying physical HDD or SSD as well, right? Note that I don't care about HDD discard, only about clearing the unused space of my virtual disk. – Alexander Jul 01 '16 at 08:59
  • To instruct the guest OS to generate trim/discard requests, you have two choices: a) mount the guest filesystem with the `discard` option or b) schedule a periodic discard via the `fstrim` command. The first option is generally avoided: enabling `discard` via mount options is not good practice, because the filesystem will generate many small discards for each remove operation, which is sub-optimal. The scheduled `fstrim` is the better approach: you can run it during low-usage hours, and it tries hard to generate a few big discards rather than many small ones. – shodanshok Jul 01 '16 at 16:50
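The scheduled fstrim approach from the last comment might look like the sketch below, meant to live in a script that cron runs during low-usage hours. The script location (for example something under /etc/cron.weekly), the function name `trim_mounts`, and the `DRY_RUN` switch are all assumptions made for illustration; only `fstrim -v <mountpoint>` itself comes from util-linux.

```shell
#!/bin/sh
# Sketch of a periodic trim script for the guest; run as root from cron.
# With DRY_RUN=1 the commands are printed instead of executed (hypothetical switch).
trim_mounts() {
    for mp in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "fstrim -v $mp"
        else
            # -v makes fstrim report how many bytes were discarded.
            fstrim -v "$mp"
        fi
    done
}

# In the real cron script, end with a call such as:
#   trim_mounts /
```

Listing mount points explicitly keeps the job predictable. Some distributions also ship a ready-made periodic fstrim job, which may make a custom script unnecessary.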