I've got an Ubuntu system with a bunch of hard disks in it acting as my home router, DHCP server, file server, etc. Twice in the past 24 hours it has suddenly remounted the root filesystem read-only. I suspect a hardware failure on one of the drives, and I've ordered a new drive just to be safe. The kernel log shows:
Jul 8 07:40:54 monolith kernel: [ 42.851001] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 8 07:40:54 monolith kernel: [ 42.851047] ata3.00: BMDMA stat 0x24
Jul 8 07:40:54 monolith kernel: [ 42.851089] ata3.00: cmd c8/00:08:67:6a:00/00:00:00:00:00/e0 tag 0 dma 4096 in
Jul 8 07:40:54 monolith kernel: [ 42.851134] ata3.00: status: { DRDY ERR }
Jul 8 07:40:54 monolith kernel: [ 42.851173] ata3.00: error: { UNC }
My main question: does this indicate incipient hard drive failure? I looked at the smartctl output, but I'm not really sure what I should be looking for.
Also, is there a way to figure out which /dev/sd* device ata3 corresponds to?
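To illustrate the kind of lookup I mean, here is a rough Python sketch. It assumes the kernel exposes the ataN port name as a component of the resolved /sys/block/sdX path, which I haven't verified on this machine's kernel. The function names are hypothetical, not from any existing tool.

```python
import glob
import os
import re


def ata_port_of(resolved_path):
    """Return the 'ataN' component of a resolved sysfs path, or None if absent."""
    match = re.search(r"/(ata\d+)/", resolved_path)
    return match.group(1) if match else None


def disks_by_ata_port(sys_block="/sys/block"):
    """Group /dev/sdX names by the ataN port found in their sysfs path.

    Assumption: each /sys/block/sdX entry is a symlink whose real path
    contains the port name (e.g. .../ata3/host2/...). Kernels that do not
    expose it this way will simply yield an empty mapping.
    """
    mapping = {}
    for entry in glob.glob(os.path.join(sys_block, "sd*")):
        port = ata_port_of(os.path.realpath(entry))
        if port is not None:
            mapping.setdefault(port, []).append("/dev/" + os.path.basename(entry))
    return mapping


if __name__ == "__main__":
    for port, devices in sorted(disks_by_ata_port().items()):
        print(port, " ".join(sorted(devices)))
```

If the assumption holds, whichever device is listed next to ata3 would be the one to check with smartctl.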
/proc/mdstat says:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sda3[0] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1]
4877654400 blocks level 5, 128k chunk, algorithm 2 [6/6] [UUUUUU]
unused devices: <none>
I think that looks good: all six members are shown as up ([6/6] [UUUUUU]).
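For what it's worth, here is how I'm reading that status line, as a rough Python sketch. The helper names are my own invention, and I'm assuming the [n/m] counter means devices expected versus devices active, with an underscore marking a missing member.

```python
import re

# The two relevant lines copied from the /proc/mdstat output above.
MDSTAT = """\
md0 : active raid5 sda3[0] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1]
4877654400 blocks level 5, 128k chunk, algorithm 2 [6/6] [UUUUUU]
"""


def array_health(mdstat_text):
    """Return (expected, active, flags) from the first '[n/m] [UU_]' status, or None."""
    match = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", mdstat_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)


def is_degraded(mdstat_text):
    """True if fewer devices are active than expected, or any slot shows '_'."""
    health = array_health(mdstat_text)
    if health is None:
        raise ValueError("no md status found")
    expected, active, flags = health
    return active < expected or "_" in flags


print(is_degraded(MDSTAT))  # prints False for the output above
```

A degraded array would instead show something like [6/5] [UUUU_U], which this check would flag.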
What would you do if you were in my shoes, staring down a possible RAID failure?