
I am not a sysadmin, but I inherited some Linux servers that were set up with no documentation. Today one server became unresponsive, taking down the VMs running on it. After a good few hours the server rebooted itself and I could SSH to it again, but I realized that what used to show up in /dev as

/dev/drbd0 /dev/drbd1 etc etc

are no longer there at all. I am guessing a drive or a series of drives went kaput. The command

cli64 vsf info

shows that my Areca disk array is checking three of them (volumes? block devices?), and doing it very slowly:

  # Name             Raid Name       Level   Capacity Ch/Id/Lun  State
===============================================================================
  1 ARC-1883-VOL#000 vm-cache        Raid3    300.0GB 00/00/00   Normal
  2 ARC-1883-VOL#001 data            Raid6   12000.0GB 00/00/01   Checking(50.4%)
  3 ARC-1883-VOL#002 apogee          Raid6   9000.0GB 00/00/02   Checking(50.6%)
  4 ARC-1883-VOL#004 database        Raid1+0 3000.0GB 00/00/03   Normal
  5 ARC-1883-VOL#005 system          Raid1+0 3000.0GB 00/00/04   Normal
  6 ARC-1883-VOL#006 archive         Raid6   6000.0GB 00/00/05   Checking(74.3%)
  7 VM-Cache Backup  VM-Cache Backup Raid1+0 2000.0GB 00/00/06   Normal
  8 VS Apogee Backup RS Apogee BackupRaid0   3000.0GB 00/00/07   Normal
  9 ARC-1883-VOL#008 TPM             Raid1+0 1500.0GB 00/01/00   Normal
 10 SDSS-BACKUP-VOLU SDSS-BACKUP-RAIDRaid0   1000.0GB 00/01/01   Normal
===============================================================================
GuiErrMsg<0x00>: Success.
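
To watch only the volumes that are still being checked, the output above can be filtered. A minimal sketch, assuming the State column is always the last field and that data rows start with a number; the function name `non_normal_volumes` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print only the volume rows whose State column
# (the last field) is not "Normal". Data rows are recognised by a purely
# numeric first field, which skips the header, the ===== rulers and the
# trailing GuiErrMsg line.
non_normal_volumes() {
    awk '$1 ~ /^[0-9]+$/ && $NF != "Normal"'
}

# Demo on rows copied from the output above. On the server itself the
# equivalent would be:  cli64 vsf info | non_normal_volumes
non_normal_volumes <<'EOF'
  # Name             Raid Name       Level   Capacity Ch/Id/Lun  State
===============================================================================
  1 ARC-1883-VOL#000 vm-cache        Raid3    300.0GB 00/00/00   Normal
  2 ARC-1883-VOL#001 data            Raid6   12000.0GB 00/00/01   Checking(50.4%)
  6 ARC-1883-VOL#006 archive         Raid6   6000.0GB 00/00/05   Checking(74.3%)
===============================================================================
GuiErrMsg<0x00>: Success.
EOF
```

Matching on the last field avoids having to split the Name and Raid Name columns, which can contain spaces.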

It is my hope that once the checks are done I will once again see the /dev/drbd devices, so I can mount them and get my VM image files off of them, though I think that is wishful thinking. I am not sure what else to poke at to get drbd to exist in my /dev directory again.

Normally the commands to get the VMs set up and ready to use are:

drbdadm primary --force all

mount -o noatime /dev/drbd/by-res/vm-cache /vm-cache

Then, lo and behold, /vm-cache has all the .img files. With /dev/drbd missing, though, this mount is of course failing.

Codejoy
  • Any *drbd* logs, `/etc/drbd.d/*` configuration, `drbdadm status` output, current `mount`s? (Assumption: you are storing VM volumes in DRBD on RAID-backed volumes, and up to the RAID-volume level no data has been lost yet.) – anx Oct 14 '20 at 00:32
  • So we found out that there was a secondary server that was supposed to be in the DRBD cluster. It must not have been operating, though, because when we brought it up all the VMs on it were about a year old (from before I even started at the gig here). That means DRBD stopped working somewhere, somehow, and nobody had any idea. Also, drbdadm status says unknown command status :/ The current mounts don't show anything from drbd mounted. In general, yes, the VMs are stored in /dev/drbd/by-res/vm-cache, which is mounted to /vm-cache, and then KVM can see all the images and virsh start guest_name can happen. – Codejoy Oct 14 '20 at 01:28
  • cli64 vsf info still shows two of them (are they volumes at this point?), data and apogee, being checked, at 75.3%. Once this finishes I might reboot the server and see if /dev/drbd comes back. – Codejoy Oct 14 '20 at 01:33
  • I wonder if the drbd service has to be running for a drbd device to show up in /dev. – Codejoy Oct 14 '20 at 04:08
  • DRBD is a kernel module; starting the service simply inserts the drbd module if it's not already inserted (`modprobe drbd`). `drbdadm status` is a newer sub-command in the DRBD utils, so you probably want to use `cat /proc/drbd` to check the status of your devices. Use `drbdadm up all` to bring up your devices. It's likely you'll have a split-brain situation to resolve. I would recommend joining the Slack channel of LINBIT (the creators of DRBD) to see if someone (or me) can help you get things going again there, as there's probably a lot going on here (https://bit.ly/3jUM9J7). – Matt Kereczman Oct 14 '20 at 23:01
  • Thank you for this, it is great info. I did finally get it into a usable state. It turns out we missed the most basic of commands: drbd start. The devices then showed up under /dev/drbd and we were in business. There are a ton of 'startup scripts' on this older machine, but it was hard to determine which ones were meant to be used and which were not. It turns out we had no split-brain situation here; in fact, when we brought up the backup machine it is supposed to talk to, we found that the VMs on it were stale by a year. – Codejoy Oct 14 '20 at 23:40
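
The steps from the comments can be strung together roughly as follows. This is a sketch under assumptions, not a verified procedure: the helper names `run` and `bring_up_drbd` and the `DRY_RUN` variable are hypothetical, the commands are only the ones quoted in this thread, and by default nothing is executed; each step is just printed.

```shell
#!/bin/sh
# Dry-run by default: each step is only printed. Set DRY_RUN=0 (as root)
# to actually execute the commands. DRY_RUN, run and bring_up_drbd are
# illustrative names, not part of DRBD.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

bring_up_drbd() {
    run modprobe drbd      || return 1   # load the DRBD kernel module
    run cat /proc/drbd     || return 1   # inspect device status before going further
    run drbdadm up all     || return 1   # bring up every configured resource
    # --force overrides DRBD's safety checks. Use it only once /proc/drbd
    # confirms this node holds the data you want to keep.
    run drbdadm primary --force all || return 1
    run mount -o noatime /dev/drbd/by-res/vm-cache /vm-cache || return 1
}

bring_up_drbd
```

Running it with `DRY_RUN=0` as root executes the steps for real and stops at the first one that fails.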
