
I created a raidz1 array of five drives on CentOS 8. After a reboot the zpool would not load.

I ran zpool import -a to import the zpool and its datasets, but one drive was missing, which bugs me to no end; I've been using ZFS for a while and always use /dev/disk/by-id names. I tried replacing the drive with the same drive using zpool replace and got "the drive is in use and contains unknown filesystem". The drive is visible under both its /dev/disk/by-id name and its /dev name.

I wiped the drive using wipefs --all and removed the partitions with gdisk, but still received the error.
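For reference, the wipe attempt looked roughly like this (/dev/sdd is only an example device node; substitute the actual disk):

# clear all filesystem/RAID signatures from the disk
wipefs --all /dev/sdd
# then remove the partition table interactively in gdisk (expert menu: x, then z to zap)
gdisk /dev/sdd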

I even physically replaced the drive with another drive and still received the error.

How can I get this drive back in the zpool?

I have attached the output of zpool status and lsblk.

[root@beast by-id]# lsblk

NAME                                          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                                             8:0    0 931.5G  0 disk  
└─ST91000640NS_00AJ142_00AJ145IBM_9XG9WE5R    253:2    0 931.5G  0 mpath 
  ├─ST91000640NS_00AJ142_00AJ145IBM_9XG9WE5R1 253:5    0 931.5G  0 part  
  └─ST91000640NS_00AJ142_00AJ145IBM_9XG9WE5R9 253:7    0     8M  0 part  
sdb                                             8:16   0 931.5G  0 disk  
└─ST91000640NS_9XG6KFS5                       253:3    0 931.5G  0 mpath 
  ├─ST91000640NS_9XG6KFS5p1                   253:11   0 931.5G  0 part  
  └─ST91000640NS_9XG6KFS5p9                   253:12   0     8M  0 part  
sdc                                             8:32   0 931.5G  0 disk  
└─ST1000NX0423_S4702HQ1                       253:4    0 931.5G  0 mpath 
  ├─ST1000NX0423_S4702HQ1p1                   253:8    0 931.5G  0 part  
  └─ST1000NX0423_S4702HQ1p9                   253:9    0     8M  0 part  
sdd                                             8:48   0 931.5G  0 disk  
└─ST91000640NS_9XG916G1                       253:6    0 931.5G  0 mpath 
sde                                             8:64   0 931.5G  0 disk  
└─ST1000NX0313_S4714LHN                       253:10   0 931.5G  0 mpath 
  ├─ST1000NX0313_S4714LHN1                    253:13   0 931.5G  0 part  
  └─ST1000NX0313_S4714LHN9                    253:14   0     8M  0 part  
sdf                                             8:80   0 931.5G  0 disk  
├─sdf1                                          8:81   0     4G  0 part  /boot
└─sdf2                                          8:82   0   916G  0 part  
  ├─cl-root                                   253:0    0   900G  0 lvm   /
  └─cl-swap                                   253:1    0    16G  0 lvm   [SWAP]
[root@beast by-id]# 

[root@beast by-id]# zpool status
  pool: data
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
    Sufficient replicas exist for the pool to continue functioning in a
    degraded state.
action: Online the device using 'zpool online' or replace the device with
    'zpool replace'.
  scan: none requested
config:

    NAME                                           STATE     READ WRITE CKSUM
    data                                           DEGRADED     0     0     0
      raidz1-0                                     DEGRADED     0     0     0
        ST91000640NS_00AJ142_00AJ145IBM_9XG9WE5R1  ONLINE       0     0     0
        ST91000640NS_9XG6KFS5                      ONLINE       0     0     0
        ST1000NX0423_S4702HQ1                      ONLINE       0     0     0
        7469149506902765747                        OFFLINE      0     0     0  was /dev/disk/by-id/ata-ST91000640NS_9XG4167A-part1
        ST1000NX0313_S4714LHN1                     ONLINE       0     0     0

errors: No known data errors
[root@beast by-id]# 


DaBus

2 Answers

I was able to add the drive back by removing its multipath mapping.

I did a "multipath -ll" to get a listing of all the signatures, then did a multipath -f {MULTIPATH NAME}" and then was able to executed zpool replace to add drive back to zpool.

DaBus

Late answer, but maybe it can be useful to others...

When ZFS reports "drive is in use", it means that ZFS cannot exclusively open the needed block device. This generally happens because the kernel has already opened the drive/blockdev on behalf of another storage component, such as (but not limited to):

  • mdraid
  • multipath
  • lvm

As these built-in components are activated early during the boot phase (often in the pre-root environment, e.g. dracut and the like), by the time the ZFS kernel module is loaded and the pools are imported, the drives/blockdevs are already busy, hence the error message.

To solve it, one generally has to deactivate the offending service or blacklist the device from it, as sketched below. Sometimes removing any old on-disk signatures is also necessary (e.g. so that mdraid auto-activation skips the disk).
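As a rough sketch for the multipath case (assuming a dracut-based distribution such as CentOS 8; the WWID below is a placeholder): blacklist the ZFS disks in /etc/multipath.conf, restart the daemon, and rebuild the initramfs so the exclusion also applies in the early-boot environment:

# /etc/multipath.conf
blacklist {
    # WWID of a disk that belongs to the ZFS pool (placeholder value)
    wwid "WWID_OF_ZFS_DISK"
}

# apply the new configuration and rebuild the early-boot image
systemctl restart multipathd
dracut -f

# if stale mdraid metadata is the culprit instead, zero the superblock
mdadm --zero-superblock /dev/sdX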

shodanshok