I have a two-node high-availability cluster with a DRBD resource, a virtual IP, and the MariaDB files stored on the DRBD partition.

Everything seems to work, but DRBD is not syncing the latest files I have created, even though drbdadm status reports the disk as UpToDate.

sudo drbdadm status 
iba role:Primary
  disk:UpToDate

pcs does not show any errors either:

sudo pcs status 
Cluster name: cluster_iba
Cluster Summary:
  * Stack: corosync
  * Current DC: iba2-ip192 (version 2.0.3-4b1f869f0f) - partition with quorum
  * Last updated: Tue Feb 22 18:16:20 2022
  * Last change:  Mon Feb 21 16:19:38 2022 by root via cibadmin on iba1-ip192
  * 2 nodes configured
  * 6 resource instances configured

Node List:
  * Online: [ iba1-ip192 iba2-ip192 ]

Full List of Resources:
  * virtual_ip  (ocf::heartbeat:IPaddr2):    Started iba2-ip192
  * Clone Set: DrbdData-clone [DrbdData] (promotable):
    * Masters: [ iba2-ip192 ]
    * Slaves: [ iba1-ip192 ]
  * DrbdFS  (ocf::heartbeat:Filesystem):     Started iba2-ip192
  * WebServer   (ocf::heartbeat:apache):     Started iba2-ip192
  * Maria   (ocf::heartbeat:mysql):  Started iba2-ip192

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

All constraints:

sudo pcs constraint list --full
Location Constraints:
Ordering Constraints:
  promote DrbdData-clone then start DrbdFS (kind:Mandatory) (id:order-DrbdData-clone-DrbdFS-mandatory)
  start DrbdFS then start virtual_ip (kind:Mandatory) (id:order-DrbdFS-virtual_ip-mandatory)
  start virtual_ip then start WebServer (kind:Mandatory) (id:order-virtual_ip-WebServer-mandatory)
  start DrbdFS then start Maria (kind:Mandatory) (id:order-DrbdFS-Maria-mandatory)
Colocation Constraints:
  DrbdFS with DrbdData-clone (score:INFINITY) (with-rsc-role:Master) (id:colocation-DrbdFS-DrbdData-clone-INFINITY)
  virtual_ip with DrbdFS (score:INFINITY) (id:colocation-virtual_ip-DrbdFS-INFINITY)
  WebServer with virtual_ip (score:INFINITY) (id:colocation-WebServer-virtual_ip-INFINITY)
  Maria with DrbdFS (score:INFINITY) (id:colocation-Maria-DrbdFS-INFINITY)
Ticket Constraints:

The files in /mnt/datosDRBD on node iba2-ip192 (when it is the master):

/mnt/datosDRBD$ ls -l
total 80
-rw-r--r-- 1 root  root   5801 feb 21 12:16 drbd_cfg
-rw-r--r-- 1 root  root  10494 feb 21 12:18 fs_cfg
drwx------ 2 root  root  16384 feb 21 10:12 lost+found
drwxr-xr-x 4 mysql mysql  4096 feb 22 18:00 mariaDB
-rw-r--r-- 1 root  root  17942 feb 21 12:39 MariaDB_cfg
-rw-r--r-- 1 root  root      5 feb 21 10:13 testMParicio.txt
-rw-r--r-- 1 root  root  13578 feb 21 12:21 WebServer_cfg

And the files in /mnt/datosDRBD on node iba1-ip192 (when it is the master):

ls -l
total 92
-rw-r--r-- 1 root     root      5801 feb 21 12:16 drbd_cfg
drwxrwxrwx 5 www-data www-data  4096 feb 22 13:41 FilesSGITV
-rw-r--r-- 1 root     root     10494 feb 21 12:18 fs_cfg
drwx------ 2 root     root     16384 feb 21 10:12 lost+found
drwxr-xr-x 7 mysql    mysql     4096 feb 22 17:55 mariaDB
-rw-r--r-- 1 root     root     17942 feb 21 12:39 MariaDB_cfg
-rw-r--r-- 1 root     root         5 feb 22 17:58 testMParicio2.txt
-rw-r--r-- 1 www-data www-data     9 feb 22 17:58 testMParicio3.txt
-rw-r--r-- 1 root     root         5 feb 21 10:13 testMParicio.txt
-rw-r--r-- 1 root     root     13578 feb 21 12:21 WebServer_cfg

All the new files (testMParicio2.txt and testMParicio3.txt) and the folder FilesSGITV are missing on iba2-ip192.

I do not know what to do. I am very lost.

I appreciate any help, thanks.

(EDIT)

My DRBD config, identical on both nodes:

cat /etc/drbd.conf 
# You can find an example in  /usr/share/doc/drbd.../drbd.conf.example

include "drbd.d/global_common.conf";
include "drbd.d/*.res";

And my *.res config, also identical on both nodes:

resource iba {
        device /dev/drbd0;
        disk /dev/md3;
        meta-disk internal;
        on iba1 {
                address 10.0.0.248:7789;
        }
        on iba2 {
                address 10.0.0.249:7789;
        }
}

DRBD uses the hostnames iba1 and iba2, with IPs 10.0.0.248 and 10.0.0.249.

Corosync uses iba1-ip192 and iba2-ip192, with IPs 192.168.1.248 and 192.168.1.249.

cat /etc/hosts
127.0.0.1 localhost
#127.0.1.1 iba1
10.0.0.248  iba1
10.0.0.249  iba2
192.168.1.248 iba1-ip192
192.168.1.249 iba2-ip192

cat /etc/drbd.d/global_common.conf


global {
    usage-count yes;
    
    udev-always-use-vnr; # treat implicit the same as explicit volumes

}

common {
    handlers {
    }

    startup {
    }

    options {
    }

    disk {
    }

    net {
        protocol C;
    }
}

(EDIT 2)

I have found a problem in /proc/drbd: both nodes report cs:StandAlone, so they are not connected to each other.

On the primary node:

cat /proc/drbd 
version: 8.4.11 (api:1/proto:86-101)
srcversion: FC3433D849E3B88C1E7B55C 
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:0 dw:2284 dr:11625 al:6 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:42364728

On the secondary node:

cat /proc/drbd 
version: 8.4.11 (api:1/proto:86-101)
srcversion: FC3433D849E3B88C1E7B55C 
 0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:36538580
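
For anyone reading these outputs: `cs` is the connection state and `oos` is the amount of out-of-sync data, which DRBD 8.4 reports in KiB. A minimal sketch for pulling both fields out of /proc/drbd-style text follows; the function name `drbd_summary` is made up for illustration, and the parsing assumes the 8.4 layout shown above.

```shell
#!/bin/sh
# Print "minor cs oos_kib" for each DRBD device in /proc/drbd-style text.
# Reads the file given as $1 (defaults to /proc/drbd).
# drbd_summary is a hypothetical helper name, not part of drbd-utils.
drbd_summary() {
    awk '
        # Device line, e.g. " 0: cs:StandAlone ro:Primary/Unknown ..."
        /^ *[0-9]+: cs:/ {
            minor = $1; sub(/:$/, "", minor)
            cs = $2;    sub(/^cs:/, "", cs)
        }
        # Counter line, e.g. "    ns:0 nr:0 ... oos:42364728"
        /oos:[0-9]+/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^oos:/) { oos = $i; sub(/^oos:/, "", oos) }
            print minor, cs, oos
        }
    ' "${1:-/proc/drbd}"
}
```

Calling `drbd_summary` with no argument on a node reads the live /proc/drbd. Any connection state other than Connected (or a sync state such as SyncSource or SyncTarget) together with a non-zero `oos` means the peers have diverged.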

The secondary node had lost its SSH key setup for the peer, which I fixed with:

ssh-keygen  -R 10.0.0.248
ssh-copy-id iba@iba1

But DRBD is still in StandAlone status on both nodes, and I don't know how to continue.

  • What does your DRBD configuration look like (`/etc/drbd.conf` or `/etc/drbd.d/*.res`)? The output from `drbdadm status` does not show the peer's status in your outputs, only the local node is shown. That leads me to believe DRBD is configured on both nodes, but doesn't include both nodes. – Matt Kereczman Feb 22 '22 at 23:25
  • Hi, thanks for your response. The `drbdadm status` output changed when I configured Corosync and Pacemaker. I will update my first post with the status you mention. – Maria Paricio Blasco Feb 23 '22 at 07:49

1 Answer

I found a split-brain that pcs status did not report. It only shows up in the kernel log:

sudo journalctl | grep Split-Brain
feb 21 13:00:10 ibatec1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
feb 21 13:21:40 ibatec1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
feb 21 13:27:54 ibatec1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
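
As a small convenience, the same check can be wrapped so it is easy to run on both nodes. This is only a sketch: `split_brain_count` is a hypothetical helper name, and it assumes the kernel message wording shown above. It counts matching lines on stdin, case-insensitively.

```shell
#!/bin/sh
# Count "Split-Brain detected" kernel messages read from stdin.
# split_brain_count is a hypothetical helper name for illustration.
split_brain_count() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the function usable under "set -e".
    grep -ci 'split-brain detected' || true
}

# Usage on a systemd host (-k limits the journal to kernel messages):
#   sudo journalctl -k --no-pager | split_brain_count
```

A non-zero count on either node is a hint to look at /proc/drbd for a StandAlone connection state.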

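I stopped the cluster (using --force on the master) and then resolved the split-brain manually. Note that --discard-my-data throws away every change made on the victim since the split, so copy anything you need off that node first. On the split-brain victim (assuming the DRBD resource is iba):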

drbdadm disconnect iba
drbdadm secondary iba
drbdadm connect --discard-my-data iba

On split-brain survivor:

drbdadm primary iba
drbdadm connect iba
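
After reconnecting, it is worth confirming that the resync actually finished before starting the cluster again. The following is a minimal sketch, assuming the DRBD 8.4 /proc/drbd layout from the question; `drbd_is_healthy` is a made-up helper name. It succeeds only if every device reports cs:Connected and ds:UpToDate/UpToDate.

```shell
#!/bin/sh
# Return 0 if every DRBD device in /proc/drbd-style text is connected and
# fully synced, non-zero otherwise. Reads $1 (defaults to /proc/drbd).
# drbd_is_healthy is a hypothetical helper name for illustration.
drbd_is_healthy() {
    awk '
        # Device line fields: $1 minor, $2 cs:..., $3 ro:..., $4 ds:...
        /^ *[0-9]+: cs:/ {
            seen = 1
            if ($2 != "cs:Connected" || $4 != "ds:UpToDate/UpToDate")
                bad = 1
        }
        END {
            if (seen && !bad) exit 0
            exit 1
        }
    ' "${1:-/proc/drbd}"
}

# Usage: poll until the resync has finished.
#   until drbd_is_healthy; do sleep 5; done
```

While the resync is running the connection state is SyncSource or SyncTarget rather than Connected, so the check keeps failing until both disks are UpToDate.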
  • I think you may have also found a little bug in the DRBD utils. You should have seen the connection to the peer with your `drbdadm status` command. Do you mind commenting with your `drbdadm --version` output, please and thanks!? – Matt Kereczman Feb 23 '22 at 18:53
  • I don't know; it's strange that pcs doesn't show the split-brain in its status. `DRBDADM_BUILDTAG=GIT-hash:\ 63092751e76e1fba397e53df4be5c1161b83a223\ reproducible\ build\,\ 2020-03-21\ 12:27:02` `DRBDADM_API_VERSION=1` `DRBD_KERNEL_VERSION_CODE=0x08040b` `DRBDADM_VERSION_CODE=0x090b00` `DRBDADM_VERSION=9.11.0` – Maria Paricio Blasco Feb 24 '22 at 19:33