Should I worry about the integrity of my linux software RAID5 after a crash or kernel panic?

Question

I have a dual-core Intel i5 Ubuntu Server 10.04 LTS system running kernel 2.6.32-22-server #33-Ubuntu SMP, with three 1TB SATA hard drives set up in a RAID5 array using Linux md devices. I have read about the RAID5 write hole and am concerned: if my Linux system locks up or kernel panics, should I assume that the integrity of my data has been compromised and restore from backup? How can I know if the data on the RAID5 array is "safe"?
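
For readers unfamiliar with the write hole, the following is a toy illustration only, not how md is implemented. It assumes a three-disk stripe with two data chunks and one XOR parity chunk; the helper `xor_blocks` and the chunk contents are made up for the example.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together (this is how RAID5 parity is formed)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)


# A consistent stripe: two data chunks plus their parity.
d0, d1 = b"AAAA", b"BBBB"
parity = xor_blocks(d0, d1)

# Normal case: if the disk holding d1 is lost, d0 and parity reconstruct it.
assert xor_blocks(d0, parity) == d1

# Crash case: the new d0 reached disk, but the matching parity update did not.
d0 = b"CCCC"
stale_parity = parity

# Later the disk holding d1 fails and is rebuilt from d0 and the stale parity.
rebuilt_d1 = xor_blocks(d0, stale_parity)
print(rebuilt_d1 == d1)  # False: the rebuild produces wrong data for d1
```

The point of the sketch is that the damage lands on a chunk that was never being written, and only shows up when a rebuild has to rely on the stale parity.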

EDIT: Output of mdadm --detail:

root@chef:/var/lib/vmware# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90
  Creation Time : Thu May 27 04:03:01 2010
     Raid Level : raid5
     Array Size : 1953521536 (1863.02 GiB 2000.41 GB)
  Used Dev Size : 976760768 (931.51 GiB 1000.20 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Jun  7 19:12:07 2010
          State : active
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 34bc9cc3:02783ea4:65f2b931:77c8854b
         Events : 0.688611

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1

You can check the status of the RAID device using mdadm. What does `mdadm --detail /dev/md0` show (assuming /dev/md0 is the RAID device)? It should show State: Clean along with other information. – Richard Holloway, Jun 07 '10 at 22:54
@Richard: I added the output of `mdadm --detail`. But isn't the issue with RAID5 that if the system crashes before parity information can be calculated, the parity will be out of date and a later rebuild (i.e. if a drive fails) will introduce corruption? – Josh, Jun 07 '10 at 23:14
What I was suggesting is that you "know if the data is safe" if you can see State: Clean. You will need to wait for the RAID to rebuild first. – Richard Holloway, Jun 08 '10 at 14:04
@Richard: So at this point `mdadm` does say `State: clean`. Does that mean the parity information is correct? I am specifically concerned about the RAID5 write hole and am looking for an answer on that point... – Josh, Jun 08 '10 at 14:14
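
On the narrower question of whether parity actually matches the data: the md driver exposes a scrub interface through sysfs, which compares data and parity directly rather than relying on the State line. The following is a minimal sketch, assuming the array is md0 and that the standard `sync_action` and `mismatch_cnt` files are present under `/sys/block/md0/md`; the function names are hypothetical, and writing to `sync_action` needs root.

```python
from pathlib import Path


def request_check(md_dir="/sys/block/md0/md"):
    """Ask md to start a read-only scrub that compares data against parity."""
    (Path(md_dir) / "sync_action").write_text("check\n")


def md_scrub_status(md_dir="/sys/block/md0/md"):
    """Return (current sync action, mismatch count) as reported by md."""
    base = Path(md_dir)
    action = (base / "sync_action").read_text().strip()
    mismatches = int((base / "mismatch_cnt").read_text().strip())
    return action, mismatches
```

One possible use is to call `request_check()`, poll `md_scrub_status()` until the action returns to `idle`, and then look at the count. A nonzero count indicates that md found inconsistent stripes; it does not say which chunk in the stripe holds the wrong data.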

grufftech · Answer 1 · 2010-06-07T22:35:34.127

5

You should probably be more concerned about why your system crashed or kernel panicked.

RAID cards these days do an extremely good job of using cache to their advantage, and this significantly reduces the likelihood of a "hole." If it were something I was particularly paranoid about, I'd set up a tripwire-like system (see link below) for detecting corruption in my key files.

As for actually testing for corruption, see http://linas.org/linux/raid.html. Most of the tools listed on that page under "General System Corruption" should do the trick for 99% of corruption.
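
The tripwire-like idea above can be sketched in a few lines. This is not Tripwire itself or any tool from the linked page, just a minimal illustration that assumes SHA-256 checksums and a manifest kept somewhere other than the array; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (by relative path) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """List files that were added, removed, or altered since the manifest was built."""
    current = build_manifest(root)
    return sorted(
        name
        for name in manifest.keys() | current.keys()
        if manifest.get(name) != current.get(name)
    )
```

Build the manifest while the data is known to be good, store it off the array, and run `changed_files` after a crash. Legitimate edits show up too, so this works best on files that are not supposed to change.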

edited Jun 07 '10 at 22:35

answered Jun 07 '10 at 22:27

Thanks. I will be posting a second question about why the system locked up, but I believe it was a one-time thing. The system does not have a hardware RAID card, which is why I was concerned about the RAID5 write hole. – Josh Jun 07 '10 at 22:38
Ahhh, software RAID is another story. I don't have any negative experiences with software RAID myself; however, I don't trust it because it's at the OS level, which is susceptible to a whole other level of problems. I strongly suggest getting a hardware RAID card if the data is mission critical. – grufftech Jun 07 '10 at 22:43
Thanks. Hardware raid is not in the budget for this server at this point, so I'm resorting to rigorous backups. That's why I asked about how much I can trust software RAID5. The link you provided was very useful. Thanks! – Josh Jun 07 '10 at 22:47
