
I have configured a RAID 5 with 6x2TB devices with mdadm (I shrank it from 9 devices to 6). Here is the output of mdadm --detail /dev/md0:

/dev/md0:
    Version : 1.2
    Creation Time : Mon Sep 23 17:54:25 2013
    Raid Level : raid5
    Array Size : 9762030080 (9309.80 GiB 9996.32 GB)
    Used Dev Size : 1952406016 (1861.96 GiB 1999.26 GB)
    Raid Devices : 6
    Total Devices : 6
    Persistence : Superblock is persistent

    Update Time : Tue Oct  6 09:48:34 2015
    State : clean, degraded, recovering
    Active Devices : 5
    Working Devices : 6
    Failed Devices : 0
    Spare Devices : 1

    Layout : left-symmetric
    Chunk Size : 512K

    Rebuild Status : 1% complete

       Name : media:0
       UUID : 8fe53fed:5206746d:3fcd5b2b:f176a8f9
     Events : 208638

Number   Major   Minor   RaidDevice State
   0       8       34        0      active sync   /dev/sdc2
   1       8       17        1      active sync   /dev/sdb1
   2       8       65        2      active sync   /dev/sde1
   6       8       80        3      spare rebuilding   /dev/sdf
   4       8       49        4      active sync   /dev/sdd1
   7       8       97        5      active sync   /dev/sdg1
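
While the rebuild runs I keep an eye on the progress; a simple way to do that is polling the standard md status interface (nothing specific to my setup):

    # refresh the kernel's md status view every 60 seconds
    watch -n 60 cat /proc/mdstat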

As you can see, the array is currently recovering, but during or shortly after the recovery process the array keeps failing:

/dev/md0:
    Version : 1.2
    Creation Time : Mon Sep 23 17:54:25 2013
    Raid Level : raid5
    Array Size : 9762030080 (9309.80 GiB 9996.32 GB)
    Used Dev Size : 1952406016 (1861.96 GiB 1999.26 GB)
    Raid Devices : 6
    Total Devices : 6
    Persistence : Superblock is persistent

    Update Time : Tue Oct  6 08:33:43 2015
    State : clean, FAILED
    Active Devices : 4
    Working Devices : 5
    Failed Devices : 1
    Spare Devices : 1

    Layout : left-symmetric
    Chunk Size : 512K

       Name : media:0
       UUID : 8fe53fed:5206746d:3fcd5b2b:f176a8f9
     Events : 208622

Number   Major   Minor   RaidDevice State
   0       8       34        0      active sync   /dev/sdc2
   1       8       17        1      active sync   /dev/sdb1
   2       8       65        2      active sync   /dev/sde1
   3       0        0        3      removed
   4       0        0        4      removed
   7       8       97        5      active sync   /dev/sdg1

   4       8       49        -      faulty spare   /dev/sdd1
   6       8       80        -      spare   /dev/sdf
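
To check how far the superblocks have drifted apart after such a failure, the per-device event counters can be compared; a minimal sketch using standard mdadm commands (the device names are the ones from my array):

    # print the event counter and array state recorded in each member's superblock
    for d in /dev/sdb1 /dev/sdc2 /dev/sdd1 /dev/sde1 /dev/sdf /dev/sdg1; do
        echo "== $d =="
        sudo mdadm --examine "$d" | grep -E 'Events|Array State'
    done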

After that I have to reassemble the raid with:

sudo mdadm --assemble --force -v /dev/md0 /dev/sdb1 /dev/sdc2 /dev/sdd1 /dev/sde1 /dev/sdf /dev/sdg1
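
To see what the kernel reports at the moment the array drops the disks, the log around that time can be checked (standard commands on Ubuntu 14.04; the grep pattern is just a guess at the relevant messages):

    # kernel ring buffer: md and low-level disk errors
    dmesg | grep -iE 'md0|raid|ata' | tail -n 50
    # persisted kernel log, in case the ring buffer has wrapped
    sudo grep -iE 'md0|raid' /var/log/syslog | tail -n 50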

In the server console I'm getting errors as shown in the attached picture (link), and then the raid fails and I have to reassemble it:

Raid error

I'm using Ubuntu 14.04 and mdadm v3.2.5.

Can someone tell me what the heck is going on here?

    You've had _two_ disks fail. Time to go to your backups. – Michael Hampton Oct 06 '15 at 09:14
  • Nope, the disks are fine; no errors from SMART checks. During the recovery I'm able to access all my data, but the raid keeps failing during/after the recovery and I have to repeat the assemble every time... – WhiteIntel Oct 06 '15 at 10:02
  • If your disk is fine, why does it keep failing? – Michael Hampton Oct 06 '15 at 10:08
  • I think the raid metadata has a problem, because if you look at the outputs you can see that first there are 6 active devices, and then there are 8 device slots in the raid (2 removed, 1 faulty spare, 1 spare) – WhiteIntel Oct 06 '15 at 10:13
  • You have corrupt parity data, so when you try to rebuild the missing disk it fails. This could be due to your shrinkage, or it could just be a write hole. But the result is the same: the array is broken. – JamesRyan Oct 06 '15 at 11:38

0 Answers