I'm pretty new to DRBD and NFS, and am in the process of testing a DRBD server with Heartbeat to use for our company's NFS share.
The entire setup is running fine, with the NFS state directory living on the DRBD share alongside the actual exported data.
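For context, the state directory ends up on the DRBD device roughly like this (the /mnt/drbd0 mount point and paths are simplified placeholders for this post, and the symlink is just the usual approach, not copied verbatim from my scripts):

    # Runs on whichever node is currently DRBD primary (Heartbeat manages this):
    drbdadm primary r0                  # promote the DRBD resource
    mount /dev/drbd0 /mnt/drbd0         # mount the replicated device
    # /var/lib/nfs (rmtab, statd state, etc.) is a symlink onto the DRBD mount,
    # so the NFS state travels with the data on failover:
    #   /var/lib/nfs -> /mnt/drbd0/nfs_state
    /etc/init.d/nfs-kernel-server start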
The problem I'm having comes from the failover scenario I'm testing: I simulate a failure by disabling the network connection (ifconfig eth0 down) on node1. The failover itself works great and does its job in about 5 seconds, but when I bring node1 back up (ifconfig eth0 up, and service heartbeat start if it has stopped), it takes upward of 3-5 minutes before the share is available again, during which the NFS share is down.
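To be concrete, the test sequence on node1 is:

    # Simulate the failure on node1:
    ifconfig eth0 down        # node2 takes over the NFS share in ~5 seconds

    # Bring node1 back:
    ifconfig eth0 up
    service heartbeat start   # only needed if heartbeat stopped itself
    # ...and then the share is unavailable for 3-5 minutes while things settle.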
In a web environment, 3-5 minutes of downtime is pretty significant. Is this normal? What am I doing wrong?
I pasted our drbd.conf file below.
resource r0 {
    protocol C;
    startup {
        degr-wfc-timeout 60;    # 1 minute.
    }
    disk {
        on-io-error detach;
    }
    net {
    }
    syncer {
        rate 10M;
        al-extents 257;
    }
    on tsa-dev-nfstest1 {                  # ** EDIT ** the hostname of server 1 (uname -n)
        device    /dev/drbd0;
        disk      /dev/sdc1;               # ** EDIT ** data partition on server 1
        address   10.61.2.176:7788;        # ** EDIT ** IP address on server 1
        meta-disk /dev/sdb1[0];            # ** EDIT ** 128MB partition for DRBD on server 1
    }
    on tsa-dev-nfstest2 {                  # ** EDIT ** the hostname of server 2 (uname -n)
        device    /dev/drbd0;
        disk      /dev/sdc1;               # ** EDIT ** data partition on server 2
        address   10.61.2.177:7788;        # ** EDIT ** IP address on server 2
        meta-disk /dev/sdb1[0];            # ** EDIT ** 128MB partition for DRBD on server 2
    }
}