
I have a ZFS pool in the current state:

[root@zfs01 ~]# zpool status
  pool: zdata
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub repaired 0 in 186h53m with 0 errors on Sun Jan 27 20:53:44 2019
config:

        NAME                                                  STATE     READ WRITE CKSUM
        zdata                                                 DEGRADED     0     0     0
          raidz3-0                                            DEGRADED     0     0     0
            ata-HGST_HUH728080ALE604_2EGWK97X                 ONLINE       0     0     0
            spare-1                                           DEGRADED     0     0     2
              ata-HGST_HUH728080ALE604_2EGWHSGX               UNAVAIL      0     0     0
              ata-HGST_HUH728080ALE604_2EGWD3WX               ONLINE       0     0     0
            ata-HGST_HUH728080ALE604_2EGGVUTX                 ONLINE       0     0     0
            ata-HGST_HUH728080ALE604_2EHG14TX                 ONLINE       0     0     0
            ata-HGST_HUH728080ALE604_2EGWW4XX                 ONLINE       0     0     0
            ata-HGST_HUH728080ALE604_2EGW5A5X                 ONLINE       0     0     0
            ata-HGST_HUH728080ALE604_2EGWTPYX                 ONLINE       0     0     0
            ata-HGST_HUH728080ALE604_2EGWALNX                 ONLINE       0     0     0
            ata-HGST_HUH728080ALE604_2EGWNN1X                 ONLINE       0     0     0
            ata-HGST_HUH728080ALE604_2EGWG0BX                 ONLINE       0     0     0
            ata-HGST_HUH728080ALE604_2EGWWTGX                 ONLINE       0     0     0
        logs
          mirror-1                                            ONLINE       0     0     0
            ata-INTEL_SSDSC2BA100G3_BTTV5435005Y100FGN-part1  ONLINE       0     0     0
            ata-INTEL_SSDSC2BA100G3_BTTV54350016100FGN-part1  ONLINE       0     0     0
        cache
          ata-INTEL_SSDSC2BA100G3_BTTV5435005Y100FGN-part2    ONLINE       0     0     0
          ata-INTEL_SSDSC2BA100G3_BTTV54350016100FGN-part2    ONLINE       0     0     0
        spares
          ata-HGST_HUH728080ALE604_2EGWD3WX                   INUSE     currently in use

errors: No known data errors

As you can see, I have swapped in the spare disk `ata-HGST_HUH728080ALE604_2EGWD3WX` via the `zpool replace` command shown below, which has created a spare-1 device containing both disks. (I was expecting the spare 2EGWD3WX drive to simply replace the dead 2EGWHSGX one.)
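Here is the exact command that was run, wrapped for readability (line continuations added; the paths are unchanged):

[root@zfs01 ~]# zpool replace zdata \
    /dev/disk/by-id/ata-HGST_HUH728080ALE604_2EGWHSGX \
    /dev/disk/by-id/ata-HGST_HUH728080ALE604_2EGWD3WX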

How do I remove the dead drive ata-HGST_HUH728080ALE604_2EGWHSGX now?

Will Dennis

1 Answer


To replace a failed disk with a hot spare, you do not need to `zpool replace` at all (and in fact doing so may cause you all sorts of grief later; I've never done it). Instead, you simply `zpool detach` the failed disk and the hot spare automatically replaces it.
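A minimal sketch of that detach, using the pool and device names from the question (as the comments below confirm, the short by-id basename shown in `zpool status` is sufficient; verify the name matches the `UNAVAIL` disk before running it):

[root@zfs01 ~]# zpool detach zdata ata-HGST_HUH728080ALE604_2EGWHSGX

Once the failed disk is detached, the spare-1 grouping should collapse: 2EGWD3WX should remain as a permanent member of raidz3-0 and drop off the spares list.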

Michael Hampton
  • Good to know for the next time... (but doesn't really answer my question) – Will Dennis Feb 02 '19 at 20:09
  • So what happened when you did it? – Michael Hampton Feb 02 '19 at 20:47
  • As you can see above, it made a "spare-1" device containing both the old failed and spare disks. The spare resilvered, and has "taken the place" of the old failed disk, except that now I need to remove the old failed disk from the pool. I'm just needing the correct command to do that. – Will Dennis Feb 02 '19 at 20:55
  • I already gave you the command to do that! – Michael Hampton Feb 02 '19 at 20:55
  • 1
    Sorry, that wasn't clear to me... So `zpool detach zdata /dev/disk/by-id/ata-HGST_HUH728080ALE604_2EGWHSGX` then? – Will Dennis Feb 02 '19 at 21:29
  • @WillDennis Just double check that's the `UNAVAIL` disk. It looks right to me, but never hurts to be sure. BTW, you don't have to use the full path; the basename shown in `zpool status` is sufficient. You only need the full path to add devices to a zpool. – Michael Hampton Feb 02 '19 at 21:49
  • Ran `sudo zpool detach `. Online spare with `AVAIL` status did *not* take over. Physical device detached no longer shows up under `/dev` (i.e., `/dev/sdb` is gone now). Running into other issues that I'm now trying to troubleshoot. The worst part is that this answer *seemed* to make logical sense. It actually appears to be wrong. – code_dredd Apr 11 '22 at 21:38
  • This doesn't seem to work anyway when trying to remove a disk from a raidz3 vdev: `cannot detach gptid/ba0ac1e5-16e2-11eb-b014-00155d00e20e: only applicable to mirror and replacing vdevs` – nijave Nov 12 '22 at 01:11
  • Sometimes you can let ZFS decide what it needs to do by exporting your pool, then reimporting it. But only do that if you have enough redundancy. E.g., on a raidz2/3, only do that if one drive is acting up and status is good on all the rest. Otherwise you don't want to leave the pool or reboot until things are fixed. For my faulted drive, it started resilvering when I exported and reimported. For you, I think it will notice your drive is missing and enable the hot spare, if it didn't already do it automatically when you ran your initial detach command. – Brian Thomas Aug 16 '23 at 17:01
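A minimal sketch of the export/reimport cycle described in the last comment, assuming the pool name `zdata` from the question (only attempt this with sufficient redundancy, as the comment warns):

[root@zfs01 ~]# zpool export zdata
[root@zfs01 ~]# zpool import zdata
[root@zfs01 ~]# zpool status zdata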