I'm using ZFS on a Debian 9 machine. This machine has been working for years without any problem until today.

The ZFS pool sits on top of a hardware RAID array, so only one device is exposed to Linux, as sda. The output of "zpool status" is shown below.

Before continuing, I should mention that I checked the consistency of the RAID array, and everything is fine.

Suddenly, any access to the filesystem makes the command hang (even a simple ls), and eventually I have to reboot the machine manually.

When running zpool status -v, the output is:

# /sbin/zpool status -v
  pool: export
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 53h4m with 0 errors on Tue Mar 15 05:28:38 2022
config:

        NAME        STATE     READ WRITE CKSUM
        export      ONLINE       0     0     0
          sda       ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        export/home:<0x0>
        export/home:<0x2b2ed23>
        export/home:<0x2e1183b>
        export/home:<0x2b2e849>
        export/home:<0x1d0b5b1>

So, the main question is: what do those entries mean, and how do I fix this problem?

Thank you in advance!

marolafm

2 Answers

Run a zpool clear and two scrubs if you can, then see the result.
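
A minimal sketch of that sequence, assuming the pool name `export` from the question. The helper names `wait_for_scrub` and `clear_and_scrub_twice` and the `POLL_SECONDS` variable are made up for illustration. It polls `zpool status` for the "scrub in progress" line because older ZFS on Linux releases may not have a built-in way to wait for a scrub to finish.

```shell
#!/bin/sh
# Sketch: clear the pool's error counters, then run two scrubs back to back.
# POLL_SECONDS and both function names are illustrative assumptions.

POLL_SECONDS="${POLL_SECONDS:-60}"

# Block until `zpool status` no longer reports a running scrub.
wait_for_scrub() {
    pool="$1"
    while zpool status "$pool" | grep -q 'scrub in progress'; do
        sleep "$POLL_SECONDS"
    done
}

clear_and_scrub_twice() {
    pool="$1"
    zpool clear "$pool" || return 1
    for pass in 1 2; do
        zpool scrub "$pool" || return 1
        wait_for_scrub "$pool"
    done
    # Show which errors, if any, are still listed after both passes.
    zpool status -v "$pool"
}

# Usage (as root): clear_and_scrub_twice export
```

With a previous scrub taking about 53 hours on this pool, expect the two passes to take several days. If the pool hangs on access, the scrub may not complete, which is why this only applies "if you can".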

ewwhite

Those were corrupted files, and now only metadata remains:

export/home:<0x0>
export/home:<0x2b2ed23>
export/home:<0x2e1183b>
export/home:<0x2b2e849>
export/home:<0x1d0b5b1>
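
Each entry is a dataset name followed by an object number in hexadecimal, which is what `zpool status -v` prints when it cannot show a file path. As a rough sketch, the following lists the entries as "dataset object" pairs with the number in decimal; `list_error_objects` is a hypothetical helper name, not a ZFS tool.

```shell
#!/bin/sh
# Sketch: list "dataset object-number" pairs from the errors section of
# `zpool status -v`, converting the hexadecimal IDs to decimal.
# list_error_objects is a hypothetical helper name.

list_error_objects() {
    # Reads `zpool status -v` output on stdin and keeps only lines of the
    # form "dataset:<0xHEX>".
    sed -n 's/^[[:space:]]*\([^:[:space:]]*\):<0x\([0-9a-fA-F]*\)>[[:space:]]*$/\1 \2/p' |
    while read -r dataset hex; do
        printf '%s %d\n' "$dataset" "0x$hex"
    done
}

# Usage: zpool status -v export | list_error_objects
```

For example, `export/home:<0x2b2ed23>` becomes `export/home 45280547`. Assuming `zdb` is installed and can still read the pool, `zdb -dddd export/home 45280547` may show what ZFS still knows about that object.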

The cause is probably a hardware failure, but you need more information to pinpoint the root cause, and your RAID card will probably get in the way.

Running ZFS on top of a hardware RAID device is not recommended, precisely to avoid the situation you are in: issues that are hard to diagnose.

My two cents:

  • let ZFS manage your disks (it is made for it)
  • use the most recent ZFS version (and an adequate OS)
freezed
  • Hardware RAID controllers don't fall over at an excessive rate for other filesystems. It's misleading to state that they're unacceptable for ZFS, or that single-LUN/single-device pools backed by hardware RAID are unacceptable. A common use case is an export from a SAN, ZFS in a VM, or any situation where one wants to leverage ZFS volume management features without using the RAID features. – ewwhite May 07 '22 at 08:16
  • OK then, so what are your suggestions to diagnose this case @ewwhite? – freezed May 08 '22 at 08:13
  • Oh, a `zpool clear` and two scrubs. – ewwhite May 11 '22 at 12:18