I have a btrfs volume on top of LVM. Now I want to take a snapshot at the LVM level (NOT at the btrfs level). But every time I create the LVM snapshot, btrfs switches the mounted block device to the snapshot, as if I were using some kind of --bind mount option.

Situation:

# mount | grep libvirt
/dev/dm-4 on /var/lib/libvirt/images type btrfs (rw,relatime,space_cache)
# ls -l /dev/mapper | grep dm-4
lrwxrwxrwx 1 root root       7 Mär 17 01:18 system-vm_disks -> ../dm-4
# lvcreate -s /dev/system/vm_disks -n vm_backup -L 32G
  Logical volume "vm_backup" created
# mount | grep libvirt
/dev/dm-5 on /var/lib/libvirt/images type btrfs (rw,relatime,space_cache)
# ls -l /dev/mapper | grep dm-5
lrwxrwxrwx 1 root root       7 Mär 17 01:18 system-vm_backup -> ../dm-5
# mount /dev/system/vm_backup /mnt/test
# touch /mnt/test/touchME
# ls -l /var/lib/libvirt/images/touchME
-rw-r--r-- 1 root root 0 Mär 17 01:26 /var/lib/libvirt/images/touchME

When I remove the snapshot:

# umount /mnt/test
# lvremove /dev/system/vm_backup 
Do you really want to remove active logical volume vm_backup? [y/n]: y
  Logical volume "vm_backup" successfully removed
# mount | grep libvirt
/dev/dm-4 on /var/lib/libvirt/images type btrfs (rw,relatime,space_cache)
# ls -l /dev/mapper | grep dm-4
lrwxrwxrwx 1 root root       7 Mär 17 01:21 system-vm_disks -> ../dm-4
# ls -l /var/lib/libvirt/images/touchME
-rw-r--r-- 1 root root 0 Mär 17 01:26 /var/lib/libvirt/images/touchME

I expect my snapshot to be a real snapshot, not something like a --bind mount. I'm using the LVM snapshots to back up a consistent system state via rsync to our backup server. And I don't want to use btrfs snapshots for various reasons:

  • I want to back up every subvolume and every btrfs snapshot inside the vm_disks LV (and I don't know how many and which snapshots/subvolumes exist)
  • My backup strategy should not be filesystem dependent. Ideally it should not be necessary to change anything else when changing the filesystem at /var/lib/libvirt/images
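
The flow I have in mind is roughly the following. This is only a sketch: `snapshot_backup` and `run` are hypothetical helper names, the rsync target is a placeholder, and the snapshot name is simply the LV name with `_backup` appended. With btrfs on the LV, the mount step is exactly where the behavior described above gets in the way.

```shell
#!/bin/sh
# Sketch: back up a consistent state of an LV via a temporary LVM snapshot.
# DRY_RUN=1 prints the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: snapshot_backup VG LV SNAP_SIZE MOUNTPOINT RSYNC_TARGET
snapshot_backup() {
    vg=$1
    lv=$2
    size=$3
    mnt=$4
    target=$5
    snap="${lv}_backup"   # hypothetical naming scheme

    # Create the snapshot; give up if that already fails.
    run lvcreate -s "/dev/$vg/$lv" -n "$snap" -L "$size" || return 1

    # Mount the snapshot read-only and copy its contents away.
    if run mount -o ro "/dev/$vg/$snap" "$mnt"; then
        run rsync -aHAX --delete "$mnt/" "$target"
        rc=$?
        run umount "$mnt"
    else
        rc=1
    fi

    # Always drop the snapshot again, whatever happened before.
    run lvremove -f "/dev/$vg/$snap"
    return "$rc"
}
```

For example, `DRY_RUN=1 snapshot_backup system vm_disks 32G /mnt/test backuphost:/srv/backup/` only prints the five commands (host and path are placeholders); without `DRY_RUN` it needs root.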

My System:

# uname -a
Linux laptop 3.12-1-amd64 #1 SMP Debian 3.12.9-1 (2014-02-01) x86_64 GNU/Linux
# lvm version
  LVM version:     2.02.104(2) (2013-11-13)
  Library version: 1.02.83 (2013-11-13)
  Driver version:  4.26.0
# btrfs --version
Btrfs v3.12

I have to use kernel 3.9 or newer since I use the IPv6 NAT features introduced in Linux 3.9 (yes, I know you should not use NAT with IPv6, but that's another story).

Thanks for your help!

Edit:

I did some experiments using dd and loop devices. This behavior is not specific to LVM at all.

Tests:

# mkfs.btrfs /dev/loop0
[...]
# mount /dev/loop0 original
# touch original/original_file
# ls -l original
-rw-r--r-- 1 root root 0 Mar 28 21:42 original_file
# dd if=/dev/loop0 of=/dev/loop1
509312+0 records in
509312+0 records out
260767744 bytes (261 MB) copied, 1.76431 s, 148 MB/s
# mount /dev/loop1 clone
# touch clone/expected_clone_file
# ls -l clone
-rw-r--r-- 1 root root 0 Mar 28 21:44 expected_clone_file
-rw-r--r-- 1 root root 0 Mar 28 21:42 original_file
# ls -l original
-rw-r--r-- 1 root root 0 Mar 28 21:44 expected_clone_file
-rw-r--r-- 1 root root 0 Mar 28 21:42 original_file
# umount clone
# umount original
# mount /dev/loop1 clone
# ls -l clone
-rw-r--r-- 1 root root 0 Mar 28 21:42 original_file
# umount clone
# mount /dev/loop0 original
# ls -l original
-rw-r--r-- 1 root root 0 Mar 28 21:44 expected_clone_file
-rw-r--r-- 1 root root 0 Mar 28 21:42 original_file

So whenever you try to mount a new device with a cloned btrfs filesystem inside you end up using the old already mounted device (but nothing in the output of mount is properly indicating this, as you can see in the LVM experiment above). All requests thus end up using the old device. You are not able to mount the cloned fs until you unmount the original fs (and you cannot mount the original fs while the cloned one is mounted).
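
To confirm that a clone really carries the same filesystem UUID (fsid) as the original, the fsid can be read straight from the superblock. A minimal sketch, assuming the primary btrfs superblock sits at byte offset 65536 with the 16-byte fsid at offset 32 inside it; `btrfs_fsid` and `same_fsid` are hypothetical helper names, not btrfs-progs commands:

```shell
#!/bin/sh
# Print the fsid stored in the primary btrfs superblock of a device or image
# as 32 hex digits (no dashes).
# Layout assumption: superblock at byte 65536, fsid = 16 bytes at offset 32
# within it (directly after the 32-byte checksum), i.e. absolute byte 65568.
btrfs_fsid() {
    dd if="$1" bs=1 skip=65568 count=16 2>/dev/null | od -An -v -tx1 | tr -d ' \n'
}

# Succeed if both devices or images carry the same (non-empty) fsid.
same_fsid() {
    a=$(btrfs_fsid "$1")
    b=$(btrfs_fsid "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

After the dd in the test above, `same_fsid /dev/loop0 /dev/loop1` would be expected to succeed (reading block devices needs root). The helpers do not check the btrfs magic, so they should only be pointed at devices known to contain btrfs.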

My question now is: how can I change the UUID of the cloned btrfs to some new, unused UUID? After that I would be able to mount the cloned device alongside the original one, I suspect.
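
For what it's worth, later btrfs-progs releases (4.1 and newer, as far as I know; not the v3.12 used here) ship a `btrfstune -u` option that rewrites the fsid of an unmounted filesystem with a random one. A hedged sketch follows; `regenerate_fsid` is a hypothetical wrapper name, and I have not verified how this behaves while the original with the same fsid is still mounted. It also rewrites metadata blocks, so on an LVM snapshot it will consume copy-on-write space.

```shell
#!/bin/sh
# Sketch: give an UNMOUNTED btrfs clone a fresh random fsid.
# Assumes a btrfs-progs version that supports `btrfstune -u`.
# -f skips the interactive confirmation btrfstune asks for.
# DRY_RUN=1 prints the command instead of running it.
regenerate_fsid() {
    dev=$1
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "btrfstune -f -u $dev"
    else
        btrfstune -f -u "$dev"
    fi
}
```

Usage would be something like `regenerate_fsid /dev/system/vm_backup` as root, before mounting the snapshot.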

Thilo
  • How/why did /dev/dm-5 get mounted and unmounted at /var/lib/libvirt/images? – Zabuzzman Mar 25 '14 at 15:04
  • @Zabuzzman That is my question. I issued only the commands listed in my question, nothing more. But after invoking the lvcreate command the mounted device for /var/lib/libvirt/images changes magically instead of just providing me a new (NOT mounted) device (/dev/dm-5). And lvremove magically changes the mounted device back to dm-4, too. – Thilo Mar 27 '14 at 01:11
  • Out of curiosity I tried the same thing by snapshotting a btrfs on top of lvm and it reproduces the exact same behavior (Fedora 20)... while it doesn't for an ext4 volume. I need some time to figure out why this is normal ;-) I'll get back to you when I do. – Zabuzzman Mar 27 '14 at 23:11
  • @Zabuzzman I'm glad you managed to reproduce it. Now I'm not the only one with vision disorders / magic hands ;) Thanks for your help btw :) – Thilo Mar 28 '14 at 03:04
  • For your loop example, can you provide the losetup commands used? – Zabuzzman Mar 31 '14 at 10:53
  • @Zabuzzman I used no special commands. I created a backing file (ca. 240MiB) named original.img and a copy named clone.img. Then I used losetup like this: losetup /dev/loop0 original.img and losetup /dev/loop1 clone.img – Thilo Apr 01 '14 at 11:19
  • See my answer below. udev is the cause, which explains both the snapshot and the loop-device scenario. – Zabuzzman Apr 01 '14 at 13:14
  • @Zabuzzman Not exactly. I discovered the other codepath to btrfs_scan_one_device(). In the comment of this function it is stated: "This may be called out of the mount path[...]". Seems like you cannot disable this behaviour via udev. Maybe there are some mount options to suppress this?? – Thilo Apr 01 '14 at 13:33
  • I don't agree. Something has to call that code... mount points don't just spontaneously execute a system call. udev does. I successfully tested as described below. – Zabuzzman Apr 01 '14 at 15:57
  • No, I meant that you enter this code when you mount the second loop device with the cloned fs inside (having the same UUID as the original). At least this is how I would interpret the comment in the kernel code. – Thilo Apr 01 '14 at 20:27
  • You're right, the mount command calls the same code. Well, as you can't seem to easily change UUIDs for btrfs, this is leading us nowhere. Hey, what about using ext4 then? :) – Zabuzzman Apr 01 '14 at 20:43
  • @Zabuzzman Well... it's sad, but you're right. I found some other forum posts stating the UUID is not changeable, too. I used btrfs because I could easily take snapshots of individual VMs and boot the snapshots etc. Well, now I have to consider what's more important, a clean backup or the ability to snapshot individual VMs. Or I could write a UUID changer for btrfs myself ;) Anyway, huge thanks for your help!! But I'll mark Matthew Ife's answer as correct because it leads more directly to this UUID issue. – Thilo Apr 01 '14 at 21:54

2 Answers

It seems that udev is causing this behavior.

Performing the lvcreate (or losetup) causes udev "change" actions on the "block" system:

# udevadm monitor
...
UDEV  [62084.032411] change   /devices/virtual/block/dm-7 (block)
UDEV  [62084.469374] change   /devices/virtual/block/dm-6 (block)
UDEV  [62084.582549] change   /devices/virtual/block/dm-6 (block)
UDEV  [62084.606191] change   /devices/virtual/block/dm-5 (block)
...

which triggers (in my case) the rules from

/lib/udev/rules.d/64-btrfs.rules

and calls the builtin udev command:

IMPORT{builtin}="btrfs ready $devnode"

which passes through src/udev/udev-builtin-btrfs.c:52

err = ioctl(fd, BTRFS_IOC_DEVICES_READY, &args);

and lands in the kernel at http://lxr.free-electrons.com/source/fs/btrfs/volumes.c#L848, producing dmesg output like:

...
[62030.117248] btrfs: device label label devid 1 transid 13 /dev/dm-6
[62030.141242] btrfs: device label label devid 1 transid 13 /dev/dm-5
...

It is unclear exactly what is causing the "remount" or why it is needed. But the remarks that the duplicate UUID is responsible do not seem far-fetched.

I'm not even sure that this kind of remount (changing the device of existing mount-point) is wanted or useful behavior...

If you want to change the behavior, you could modify or remove the btrfs udev rules, at the cost of losing functionality: no more auto-mounts after hot-plugging btrfs USB disks.
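
To find the rule files involved, something like the following could be used. It is a small sketch: `list_btrfs_rules` is a hypothetical helper name, and the default directories are simply the usual udev rule locations on my system, which may differ on yours.

```shell
#!/bin/sh
# List udev rule files that mention btrfs.
# With no arguments, search the usual rule directories (an assumption;
# adjust for your distribution).
list_btrfs_rules() {
    [ "$#" -gt 0 ] || set -- /lib/udev/rules.d /etc/udev/rules.d
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # grep -l prints only the names of matching files;
        # "|| true" keeps a directory without matches from being an error.
        grep -l btrfs "$dir"/*.rules 2>/dev/null || true
    done
}
```

Move the listed files aside (keeping a backup) rather than deleting them, then restart udev for the change to take effect.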

Zabuzzman
  • Well, I commented out everything in the udev file you mentioned. On my system there are two more udev rule files which call "/sbin/btrfs device scan $env{DEVNAME}" (/lib/udev/rules.d/80-btrfs-lvm.rules and /lib/udev/rules.d/70-btrfs.rules). I commented them out, too and restarted udev. Now I'm getting two "change" events for each loop device I create. When mounting the first one I get one "add /devices/virtual/bdi/btrfs-2 (bdi)". When I mount the second one I get no more udev reactions in udevadm monitor. And the behavior is still the same. The second mount mirrors the first one. – Thilo Apr 01 '14 at 13:20
  • My kernel output on first mount: "btrfs: device fsid b034f327-805c-4399-b694-b9c993640fbd devid 1 transid 16 /dev/loop0". And on second mount: "btrfs: device fsid b034f327-805c-4399-b694-b9c993640fbd devid 1 transid 13 /dev/loop1". Seems like we are entering "btrfs_scan_one_device()" using some other codepath. – Thilo Apr 01 '14 at 13:23
  • As a test, when I removed the `64-btrfs.rules` file and restarted systemd-udevd service the phenomenon stops. Try a `grep btrfs` in `/lib/udev/rules.d/` and remove/backup all hits? – Zabuzzman Apr 01 '14 at 15:52
  • I used exactly this grep to find the two other rules ;) And there are no other btrfs rules in /etc/udev/rules.d either. Could you mount both loop devices having the same FS UUID inside and use both mounts independently? Do `touch original/testfile` and `ls -l clone/testfile` behave like you would expect distinct mounts to behave? – Thilo Apr 01 '14 at 20:25

I have not looked deeply into this, but btrfs as a filesystem works on groups of disks, not individual devices.

I suspect that there is no way for btrfs to distinguish between the mounted snapshot and the real mounted filesystem when a mount occurs. It may in fact see the UUID of the underlying subvolume, assume it's a mirror of the original volume and write to both volumes at the same time.

I would be surprised if this ever gets fixed seeing as for most intents and purposes btrfs snapshots supersede LVM snapshots.

Matthew Ife
  • Well, that would also imply the following behavior (and that would be REALLY bad): You clone a disk via dd or some similar tool. Then you mount both disks, your original disk and the clone, to modify some files and use the original file as a reference. And instead of the cloned disk, the original disk would be changed! – Thilo Mar 28 '14 at 20:20
  • Well, it's irrelevant since you'd use a btrfs snapshot to do the entire thing in much less time. – Matthew Ife Mar 28 '14 at 20:23
  • Well, no. You could assume that I want to clone the entire disk in this scenario, including GPT/MBR and maybe other partitions. This would result in two disks with identical UUIDs, and if you were right, writing to one of them would change the other. Very bad in my opinion. – Thilo Mar 28 '14 at 20:32
  • If you wanted to clone the entire disk, you could dd the boot partition and partition layout, create a new btrfs volume on the new disk, then use btrfs send/btrfs receive to make a (much more efficient) copy of the data. – Matthew Ife Mar 28 '14 at 20:39
  • I know that this is possible, but it would not be intuitive, as ext3/ext4 and other filesystems allow this type of dd cloning without problems. But I think you are right with this UUID thing. I edited my question and added some tests I did. You can only mount the cloned device when the original device is unmounted, otherwise the new mount turns out to be something like a --bind mount of the original. My question now is: how can I change the UUID of the cloned btrfs to some new, unused UUID? After that I would be able to mount the cloned device alongside the original one, I suspect. – Thilo Mar 28 '14 at 20:57