
Yesterday we had a power failure in my datacenter; the UPS units died after about 30 minutes, resulting in one of the worst scenarios I have ever seen. I am running a FreeNAS server with a raidz1-0 pool. After powering back on, I noticed a critical alert:

The volume Raid (ZFS) state is DEGRADED: One or more devices has experienced an error resulting in data corruption. Applications may be affected.

So I checked the disk status, and it is more serious than I thought. Running "zpool status -v"

gave me the following output:

  pool: Raid
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub in progress since Sun Feb 11 19:47:09 2018
        14.0T scanned out of 18.1T at 155M/s, 7h48m to go
        8K repaired, 77.14% done
config:

        NAME                                            STATE     READ WRITE CKSUM
        Raid                                            DEGRADED     0     0 75.1K
          raidz1-0                                      DEGRADED     0     0  150K
            gptid/d5a65a3d-4eac-11e6-aebb-b083fed00972  DEGRADED     0     0     0  too many errors  (repairing)
            gptid/d642db6c-4eac-11e6-aebb-b083fed00972  DEGRADED     0     0     0  too many errors  (repairing)
            gptid/d6d69c95-4eac-11e6-aebb-b083fed00972  DEGRADED     0     0     0  too many errors  (repairing)
            gptid/d7860535-4eac-11e6-aebb-b083fed00972  DEGRADED     0     0     0  too many errors
            gptid/d82ec964-4eac-11e6-aebb-b083fed00972  DEGRADED     0     0     0  too many errors
            gptid/aec9036c-4f4b-11e6-a2f2-b083fed00972  DEGRADED     0     0     0  too many errors
            gptid/d97ceea1-4eac-11e6-aebb-b083fed00972  DEGRADED     0     0     9  too many errors  (repairing)
            gptid/da14eaee-4eac-11e6-aebb-b083fed00972  DEGRADED     0     0     0  too many errors  (repairing)
            gptid/dabd3055-4eac-11e6-aebb-b083fed00972  DEGRADED     0     0     0  too many errors  (repairing)
            gptid/db58a590-4eac-11e6-aebb-b083fed00972  DEGRADED     0     0     0  too many errors  (repairing)

My entire disk array is degraded, yet the activity LEDs are all showing "ok". A scrub is currently running, but maybe that is not going to help. I am in a panic, because the pool contains two iSCSI volumes holding 6 VM servers. I mounted those iSCSI disks on a Linux machine in order to copy my server files off, but I got I/O errors while running cp and rsync.
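For reference, this is roughly how I attached the iSCSI volumes on the Linux box and attempted the copy (the portal IP, target IQN, device name, and paths below are placeholders, not my real values):

```shell
# Discover targets on the FreeNAS portal and log in
# (portal IP and target IQN here are placeholders)
iscsiadm -m discovery -t sendtargets -p 192.168.0.10
iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:vmstore -p 192.168.0.10 --login

# Mount the exported disk read-only so nothing more gets written to the damaged pool
mount -o ro /dev/sdX1 /mnt/iscsi

# Copy off whatever is readable; rsync skips files it cannot read,
# reports them at the end, and exits with code 23 on partial transfer
rsync -av --partial /mnt/iscsi/ /backup/vms/ 2>rsync-errors.log
```

This is where the cp and rsync runs start throwing I/O errors on some of the files.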

Has anyone experienced something like this? Is there anything I can do?

My server setup is: Dell PowerEdge R720 storage server, 10x Dell 4TB 15k RPM HDs, 65GB RAM, Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz.

Any suggestion will be appreciated.
