
I need to replace a DRBD backing disk because it has worn out, but I'm unsure how to proceed. The setup is as follows:

server0 <----> server1

Server0 is the affected node; the DRBD process has been stopped on it. Server1 is currently the Primary, and its DRBD status looks like this:

cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
srcversion: F937DCB2E5D83C6CCE4A6C9
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/Outdated C r-----
    ns:4 nr:12 dw:16 dr:937 al:0 bm:2 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/Inconsistent C r-----
    ns:10167368 nr:1357185492 dw:2024894776 dr:67769600 al:326677858 bm:1111517 lo:2 pe:0 ua:0 ap:1 ep:1 wo:f oos:305611780

The worn-out disk has already been replaced in server0, and DRBD is configured with its metadata on a dedicated device (meta-disk /dev/fioa1[0]).
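
Since the config below references the partitions /dev/fioa1 (metadata) and /dev/fioa2 (data), I assume the new card needs the same partition layout recreated before DRBD can attach to it; something like this, where the sizes are only placeholders for whatever the old layout was:

# parted /dev/fioa mklabel gpt
# parted /dev/fioa mkpart primary 1MiB 1GiB   # becomes /dev/fioa1 (DRBD metadata)
# parted /dev/fioa mkpart primary 1GiB 100%   # becomes /dev/fioa2 (DRBD data)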

DRBD config on server0:

resource r0 {
    on server0 {
        device     /dev/drbd0;
        disk       /dev/fioa2;
        address    10.10.10.3:7788;
        meta-disk  /dev/fioa1[0];
    }

    on server1 {
        device     /dev/drbd0;
        disk       /dev/fioa2;
        address    10.10.10.4:7788;
        meta-disk  /dev/fioa1[0];
    }
}

resource r1 {
    device     /dev/drbd1;
    disk       /dev/fiob2;
    meta-disk  /dev/fiob1[0];

    on server0 {
        address    10.10.10.3:7789;
    }

    on server1 {
        address    10.10.10.4:7789;
    }
}

What would be the procedure to initialise the new disk? My main concern is not losing or corrupting any data currently on server1.

UPDATE: the new disk inserted into server0 has a bigger capacity than the old one; I'm not sure if that affects the process.

ptrh
    There is a whole chapter for this in the [docs](http://docs.linbit.com/docs/users-guide-8.3/p-work/#s-hard-drive-failure) – Lenniey Aug 17 '17 at 09:53
  • yes I know and I read that. The things I'm unsure of are the current state of the cluster, specifically WFConnection, and whether I need to partition the new disk into two partitions, one for metadata and one for the resource. With WFConnection, if I start DRBD on server0 it will automatically connect to server1 – if that's the case I just follow the docs. Still unsure about partitions though – ptrh Aug 17 '17 at 09:59
  • You don't have to stop the DRBD service on server0. It should be configured to automatically detach the failed disk. This is how I'd do it: remove the new disk in server0, start the DRBD service, check the output of `drbdadm dstate <resource>`, insert the new disk, execute `drbdadm create-md <resource>` and finally `drbdadm attach <resource>`, then watch for split brains and resolve them manually if needed (see the shell sketch after these comments). As DRBD uses block-level replication, there is no need to create partitions. – Lenniey Aug 17 '17 at 10:07
  • It all makes sense. The only thing is that the old card has already been removed and the new one inserted. It is recognized under the same ID by the OS, so the DRBD config remains unchanged. As the device has already been replaced, how would you proceed in that scenario? I'm comfortable resolving split brains if they appear – ptrh Aug 17 '17 at 10:55
  • Well, then I'd just start the DRBD service and observe what happens, then recreate the MD and so on :) Manually intervening if necessary, of course. – Lenniey Aug 17 '17 at 11:11
  • Last question, really: should I be concerned about the disk size difference? The new card is bigger than the old one (server0 and server1 initially had identical hardware); will that cause any issues? – ptrh Aug 17 '17 at 12:41
  • Nope, that usually won't be a problem. If your new disk were smaller, you might have problems with the filesystem on top of DRBD, but this way you'll only waste some space on the new disk. – Lenniey Aug 17 '17 at 12:43
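
Putting the comments together, a minimal sketch of the sequence on server0 (assuming the resource name r0 from the config above and the old disk already swapped for the new one; repeat for r1, and adjust the service command to your distribution):

# service drbd start      # or /etc/init.d/drbd start
# drbdadm dstate r0       # expect Diskless/... while no backing disk is attached
# drbdadm create-md r0    # write fresh metadata on /dev/fioa1[0]
# drbdadm attach r0       # attach the new backing disk; a full resync follows
# cat /proc/drbd          # verify connection state and resync progress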

1 Answer


Simply recreate the metadata for the new devices on server0, and bring them up:

# drbdadm create-md all
# drbdadm up all

You should then see your devices connect and start syncing from server1 -> server0.
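
If you want to watch the resync, something like this works (drbd-overview is only available if your drbd-utils package ships it):

# watch -n1 cat /proc/drbd    # per-device resync progress and ETA
# drbd-overview               # condensed per-resource summary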

They will both agree upon a size when they first connect, which will be the size of the smaller disk.
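
The extra space on server0's new card will simply sit unused. Only if server1's backing disk is later replaced or grown too could you enlarge the DRBD device online; roughly like this (hypothetical follow-up, ext filesystem assumed):

# drbdadm resize r0       # only after BOTH backing devices are larger
# resize2fs /dev/drbd0    # then grow the filesystem on the Primary (ext3/ext4 example)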

Hope that helps.

Matt Kereczman