
Context:

Active/passive DRBD setup. KVM VMs on LVM volumes. LVM using /dev/drbd0 for physical volume and volume group.

While testing the setup, I:

  1. Shut down Secondary node.
  2. Rebooted Primary node.

However, a VM that was autostarted after the Primary rebooted somehow accessed the underlying disk (/dev/sda4) instead of /dev/drbd0:

WARNING: Device mismatch detected for vgr0/r0_wphp which is accessing /dev/sda4 instead of /dev/drbd0.

/dev/sda4 is the backing device for /dev/drbd0 on the NormallyPrimary node:

resource r0 {
        protocol C;
        startup {
                wfc-timeout  15;
                degr-wfc-timeout 60;
        }
        disk {
                on-io-error     detach;
                c-fill-target   10M;
                c-max-rate      700M;
                c-plan-ahead    7;
                c-min-rate      4M;
        }
        net {
                # max-epoch-size  20000;
                max-buffers       36k;
                sndbuf-size       1024k;
                rcvbuf-size       2048k;
                after-sb-0pri    discard-zero-changes;
                after-sb-1pri    discard-secondary;
                after-sb-2pri    disconnect;
                rr-conflict      disconnect;                
        }
        syncer {
                rate                    400M;
                al-extents              6433;
        }
        on NormallySecondary {
                device /dev/drbd0;
                disk /dev/sdc;
                address 10.0.0.1:7788;
                meta-disk internal;
        }
        on NormallyPrimary {
                device /dev/drbd0;
                disk /dev/sda4;
                address 10.0.0.2:7788;
                meta-disk internal;
        }
}

After the unintended start of the r0_wphp VM, I now have this:

% lsblk
NAME               MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                  8:0    0  3.7T  0 disk 
├─sda1               8:1    0 46.6G  0 part /
├─sda2               8:2    0 46.6G  0 part [SWAP]
├─sda3               8:3    0  1.7T  0 part 
...
└─sda4               8:4    0  1.8T  0 part 
  └─vgr0-r0_wphp   254:0    0   40G  0 lvm

vgr0-r0_wphp (the LVM volume used by the r0_wphp VM) should sit on volume group vgr0, whose physical volume is /dev/drbd0, not directly on /dev/sda4.
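
The mismatch visible in the lsblk tree can also be checked mechanically. The following is a minimal sketch, not something that was run on this system: the function name `lvs_on_backing_dev` is hypothetical, and it assumes an lsblk that supports the `PKNAME` (parent kernel name) column.

```shell
#!/bin/sh
# Print LVM logical volumes whose direct parent is the given backing device.
# Reads lines of "NAME TYPE PKNAME" on stdin, for example the output of:
#   lsblk -nro NAME,TYPE,PKNAME
# lvs_on_backing_dev is a hypothetical helper name, not part of LVM or DRBD.
lvs_on_backing_dev() {
    backing="$1"
    # Field 2 is the device type, field 3 the parent kernel device name.
    awk -v dev="$backing" '$2 == "lvm" && $3 == dev { print $1 }'
}

# Usage on a live system:
#   lsblk -nro NAME,TYPE,PKNAME | lvs_on_backing_dev sda4
```

Any name printed for sda4 is a logical volume that was activated on the raw partition rather than on /dev/drbd0.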

I think this is why DRBD now refuses to start r0 on my NormallyPrimary node:

% drbdadm create-md r0
open(/dev/sda4) failed: Device or resource busy

Exclusive open failed. Do it anyways?
[need to type 'yes' to confirm]

lsof /dev/sda4 does not show anything.
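
An empty lsof result is expected here: lsof lists files opened by processes, whereas a device-mapper mapping holds the partition inside the kernel. A rough sketch of how the holder could be found through sysfs follows. The helper name `holders_of` and its optional second argument (an alternative sysfs root, used only for testing) are assumptions for illustration.

```shell
#!/bin/sh
# List kernel-level holders of a block device by reading sysfs.
# holders_of is a hypothetical helper. $1 is the kernel device name (e.g. sda4),
# $2 is an optional sysfs root (defaults to /sys).
holders_of() {
    part="$1"
    sysroot="${2:-/sys}"
    for h in "$sysroot"/class/block/"$part"/holders/*; do
        [ -e "$h" ] || continue          # glob did not match: no holders
        name=$(basename "$h")
        if [ -r "$h/dm/name" ]; then
            # Device-mapper devices expose their mapping name in dm/name.
            echo "$name $(cat "$h/dm/name")"
        else
            echo "$name"
        fi
    done
}

# Usage on a live system:
#   holders_of sda4
```

If this prints a dm device with the name vgr0-r0_wphp, the "Device or resource busy" error comes from that mapping and not from any process.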

The drbd service was not started automatically when the NormallyPrimary node booted.

The weird thing is that before the tests I had configured LVM to ignore /dev/sda4:

 % egrep '^\s*filter =' /etc/lvm/lvm.conf 
    filter = [ "r|/dev/sda4|" ]
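
For comparison, a stricter variant of this filter is sketched below. It is an assumption, not the configuration that was tested: it accepts only DRBD devices as physical volumes and rejects everything else, which only makes sense if no other PVs exist on the host. It also sets `global_filter`, because with lvmetad enabled the plain `filter` is not applied when devices are scanned into the cache.

```
devices {
    # Sketch only: accept DRBD devices, reject all other block devices.
    # Assumes vgr0 is the only volume group and lives on /dev/drbd0.
    filter        = [ "a|^/dev/drbd.*|", "r|.*|" ]
    global_filter = [ "a|^/dev/drbd.*|", "r|.*|" ]
}
```

On Debian the initramfs carries its own copy of lvm.conf, so a filter changed only in /etc/lvm/lvm.conf may not be in effect when volumes are activated early during boot.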

After I defined this LVM filter, drbdadm created the r0 resource without complaints and the DRBD setup worked.

Questions:

  1. How can I stop the VM in question from using /dev/sda4? I need to re-attach my DRBD resource (r0).

  2. How can I prevent any LVM-backed VMs from trying to access devices underlying /dev/drbd* devices even when unintentionally started?
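
For question 1, one possible sequence is sketched below as a dry-run script. It is a hedged outline, not a verified procedure: it assumes the VM is a libvirt domain named r0_wphp, that vgr0 contains nothing that must stay active, and that the LVM filter is already effective so that re-activation picks /dev/drbd0. The `run` and `release_and_reattach` helpers are hypothetical names.

```shell
#!/bin/sh
# Sketch: release /dev/sda4 from device-mapper, then bring DRBD up again.
# DRY_RUN=1 (the default) only prints the commands. Set DRY_RUN=0 to execute
# them as root.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = libvirt domain (assumed), $2 = volume group, $3 = DRBD resource
release_and_reattach() {
    vm="$1"; vg="$2"; res="$3"
    run virsh destroy "$vm"       # hard-stop the VM that holds the LV open
    run vgchange -an "$vg"        # deactivate the LVs, removing the dm mapping on sda4
    run drbdadm up "$res"         # attach the backing disk and connect to the peer
    run drbdadm primary "$res"    # promote this node
    run vgchange -ay "$vg"        # re-activate the LVs, which should now sit on /dev/drbd0
}

# Usage (prints the plan without touching anything):
#   release_and_reattach r0_wphp vgr0 r0
```

Anything the VM wrote while it ran directly on /dev/sda4 bypassed DRBD, so the peer cannot be assumed to hold identical data afterwards. Running drbdadm create-md again on a resource that already has metadata would not help with that.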

OS and DRBD:

% lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 9.6 (stretch)
Release:    9.6
Codename:   stretch

% dpkg -l | grep drbd
ii  drbd-utils                                    8.9.10-2                                    amd64        RAID 1 over TCP/IP for Linux (user utilities)
LetMeSOThat4U
    After changing lvm.conf you must [`update-initramfs`](https://manpages.debian.org/stretch/initramfs-tools/update-initramfs.8.en.html) and reboot. – Michael Hampton Jan 09 '19 at 22:20
