While we were replacing a disk in a zpool on a FreeBSD 10.3-RELEASE-p20 system yesterday, the ZFS filesystems became unresponsive after we issued the zpool detach srv gpt/d0 command. The server acts as an NFS server, WebDAV server and iSCSI target, and after zpool detach was executed, all iSCSI clients started experiencing timeouts.
This apparently caused the entire ZFS subsystem to lock up: zpool status or any other command would just hang and produce no output. Nothing showed up in dmesg, and top didn't show any process consuming a large amount of CPU. In the end we could not find a solution and had to reboot the system to get the iSCSI targets back online. A soft restart failed to complete after all services had been stopped, so we had to resort to a hard reboot.
What causes this situation and how can we avoid it? How can we prevent zpool detach from hanging when replacing a device in a ZFS pool under FreeBSD?