I'm pretty new to zfsonlinux. I've just succeeded in setting up a brand new server, with a Debian ROOT on ZFS. All is working fine, but I've got an issue with hot spare and replacing disks.
Here is my pool:
NAME STATE READ WRITE CKSUM
mpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-ST1XXXXXXXXXXA-part1 ONLINE 0 0 0
ata-ST1XXXXXXXXXXB-part1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
ata-ST1XXXXXXXXXXC-part1 ONLINE 0 0 0
ata-ST1XXXXXXXXXXD-part1 ONLINE 0 0 0
spares
ata-ST1XXXXXXXXXXE-part1 AVAIL
ata-ST1XXXXXXXXXXF-part1 AVAIL
Now, I'm able to start with the real fun. Disk pulling! I'm now unplugging disk C. I got a working pool, but DEGRADED (as expected):
NAME STATE READ WRITE CKSUM
mpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-ST1XXXXXXXXXXA-part1 ONLINE 0 0 0
ata-ST1XXXXXXXXXXB-part1 ONLINE 0 0 0
mirror-1 DEGRADED 0 0 0
ata-ST1XXXXXXXXXXC-part1 UNAVAIL 0 0 0
ata-ST1XXXXXXXXXXD-part1 ONLINE 0 0 0
spares
ata-ST1XXXXXXXXXXE-part1 AVAIL
ata-ST1XXXXXXXXXXF-part1 AVAIL
So far, so good. But, when I try to replace disk C with, let's say, disk E, I'm stuck with a DEGRADED pool anyway.
# zpool replace mpool ata-ST1XXXXXXXXXXC-part1 ata-ST1XXXXXXXXXXE-part1
cannot open '/dev/disk/by-id/ata-ST1XXXXXXXXXXE-part1': Device or ressource busy
(and after a few sec)
Make sure to wait until resilver is done before rebooting.
So I'm waiting some secs to let resilvering (with 0 errors), then I've got:
NAME STATE READ WRITE CKSUM
mpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-ST1XXXXXXXXXXA-part1 ONLINE 0 0 0
ata-ST1XXXXXXXXXXB-part1 ONLINE 0 0 0
mirror-1 DEGRADED 0 0 0
spare-0 UNAVAIL
ata-ST1XXXXXXXXXXC-part1 UNAVAIL 0 0 0
ata-ST1XXXXXXXXXXE-part1 ONLINE 0 0 0
ata-ST1XXXXXXXXXXD-part1 ONLINE 0 0 0
spares
ata-ST1XXXXXXXXXXE-part1 INUSE currently in use
ata-ST1XXXXXXXXXXF-part1 AVAIL
Then if I zpool detach
the C disk (as explained here), my pool is getting ONLINE again, and all is working fine (with a pool with only 5 HDD)
So here are my questions:
- Why replacing the C disk is not enough to rebuild a full pool? As explained on the oracle blog and here too I was expecting that I do not have to detach the disk for zfs to rebuild the pool properly (and it's far better to keep in the zpool status traces of the unplugged disk, for maintening convenience)
- Why zpool keep telling me that spares disks are "busy" (they are truly not)?
- See below: how can I automatically get my spare disk back?
EDIT: Even worst for question1 => When I plug back in disk C, zfs don't manage my spare back! So I'm left with one less disk
NAME STATE READ WRITE CKSUM
mpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-ST1XXXXXXXXXXA-part1 ONLINE 0 0 0
ata-ST1XXXXXXXXXXB-part1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
ata-ST1XXXXXXXXXXE-part1 ONLINE 0 0 0
ata-ST1XXXXXXXXXXD-part1 ONLINE 0 0 0
spares
ata-ST1XXXXXXXXXXF-part1 AVAIL