
Two disks in my zpool just failed with read/write errors, so I took them out, inserted them in another host, created a zpool from just those two disks, and filled the file system with

dd if=/dev/zero of=/crashpool/zero bs=1M count=1000000000000

When it had maxed out the file system I expected to see the same errors in zpool status, but the disks had not failed.

Question

Why can I not reproduce the errors on another host, when ZFS has just reported read/write errors on these disks?

ewwhite
Sandra
  • Have you tried a `zpool scrub` on the new pool? ZFS will only report errors if it comes across them during normal disk activity, or when checking the disks with a scrub. Normal disk activity may not touch the part of the disk that is bad, but a scrub reads all data within the pool and may find the bad sectors. Normally, given redundancy, ZFS will reconstruct the affected data from a good copy, rewrite it, and move on. – Stefan Lasiewski Jan 19 '14 at 23:20
  • Did you check the SMART statistics? The disks might have reallocated faulty sectors when you overwrote them with zeros. – the-wabbit Jan 21 '14 at 06:46

2 Answers


Just because you can doesn't mean you should. ZFS is not lying to you. If it's reporting the disks as bad in the context of a pool, I would likely not use those drives elsewhere.

ewwhite
  • I don't doubt ZFS's error reports, I just find it very strange that I can't get ZFS to report the errors again in another pool. The disks are going into the pile of RMA disks =) – Sandra Jan 19 '14 at 22:40

If I had to guess, I'd say that something might be wrong with either the cabling or the controller. It can be something as simple as a loose data cable, or there might be something wrong with the controller itself. If you have added replacement drives on the original host and performed the resilver, I'd say that you should check the output of zpool status and/or dmesg for any errors.
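To make that check concrete, here is a minimal sketch of how the error counters could be pulled out of zpool status on the original host. The helper name `nonzero_errors` and the pool name `tank` are made up for illustration, and the filter assumes the usual NAME / STATE / READ / WRITE / CKSUM column layout.

```shell
#!/bin/sh
# Sketch only: nonzero_errors is a hypothetical helper, not part of ZFS.
# It reads `zpool status` output on stdin and prints each pool, vdev or
# disk line whose READ, WRITE or CKSUM counter is not zero.
nonzero_errors() {
    awk 'NF >= 5 &&
         $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ &&
         ($3 != "0" || $4 != "0" || $5 != "0") {
             print $1, "read=" $3, "write=" $4, "cksum=" $5
         }'
}

# Possible usage on the original host ("tank" is a placeholder pool name):
#   zpool status tank | nonzero_errors
#   dmesg | grep -i -E 'ata[0-9]+|i/o error'
```

If the replacement drives in the same slots start accumulating counters as well, that would point toward the cabling or controller rather than the disks.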

Of course, syneticon-dj has a point in his comment: the disks might have remapped any faulty sectors as a result of the dd command, in which case I doubt that you will be able to reproduce the errors easily.
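One way to test the remapping theory is to look at the SMART counters for reallocated and pending sectors. The following is a rough sketch, assuming ATA disks and the attribute table printed by `smartctl -A` from smartmontools; the helper name `remap_counters` and the device path are placeholders, and SAS disks report this information differently.

```shell
#!/bin/sh
# Sketch only: remap_counters is a hypothetical helper, not part of smartmontools.
# It reads `smartctl -A` output for an ATA disk on stdin and prints the name
# and raw value of attributes 5 (Reallocated_Sector_Ct),
# 197 (Current_Pending_Sector) and 198 (Offline_Uncorrectable).
remap_counters() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 { print $2, $10 }'
}

# Possible usage (/dev/sdX is a placeholder; usually needs root):
#   smartctl -A /dev/sdX | remap_counters
```

A non-zero raw value for Reallocated_Sector_Ct after the dd run would be consistent with the disk having quietly replaced bad sectors while being overwritten.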

user76776