Filesystem set to read-only mode, which drive is faulty?

Question

I woke up today to find out my CentOS server's filesystem has been set to read-only mode. I'm running RAID 1 on this server.

# mkdir test
mkdir: cannot create directory `test': Read-only file system

I did some research and found that this is usually caused by a hardware issue, i.e. the hard drive is about to fail.

How do I find out for sure that it is a hardware issue and not a software issue?

If it is a hardware issue, how do I find out which of the two drives is faulty and needs to be replaced? smartctl shows "PASSED" for both drives, although one shows 678 reallocated sectors and the other shows 33. (Again, I'm using RAID 1.)

dmesg output

ata2.00: exception Emask 0x0 SAct 0x4000000 SErr 0x0 action 0x0
ata2.00: irq_stat 0x40000008
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/08:d0:58:11:38/00:00:01:00:00/40 tag 26 ncq 4096 in
         res 51/40:02:5e:11:38/00:00:01:00:00/40 Emask 0x409 (media error) <F>
ata2.00: status: { DRDY ERR }
ata2.00: error: { UNC }
ata2.00: configured for UDMA/133
sd 1:0:0:0: [sdb]
Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 1:0:0:0: [sdb]
Sense Key : Medium Error [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
        72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
        01 38 11 5e
sd 1:0:0:0: [sdb]
Add. Sense: Unrecovered read error - auto reallocate failed
sd 1:0:0:0: [sdb] CDB:
Read(16): 88 00 00 00 00 00 01 38 11 58 00 00 00 08 00 00
end_request: I/O error, dev sdb, sector 20451678
EXT3-fs error (device md2): ext3_get_inode_loc: unable to read inode block - inode=637820, block=2555947
ata2: EH complete
Aborting journal on device md2.
EXT3-fs (md2): error: ext3_journal_start_sb: Detected aborted journal
EXT3-fs (md2): error: remounting filesystem read-only
EXT3-fs (md2): error: remounting filesystem read-only
__journal_remove_journal_head: freeing b_committed_data
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_committed_data
ata2.00: exception Emask 0x0 SAct 0x1000006 SErr 0x0 action 0x0
ata2.00: irq_stat 0x40000008
ata2.00: failed command: READ FPDMA QUEUED
ata2.00: cmd 60/08:c0:58:11:38/00:00:01:00:00/40 tag 24 ncq 4096 in
         res 51/40:02:5e:11:38/00:00:01:00:00/40 Emask 0x409 (media error) <F>
ata2.00: status: { DRDY ERR }
ata2.00: error: { UNC }
ata2.00: configured for UDMA/133
sd 1:0:0:0: [sdb]
Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 1:0:0:0: [sdb]
Sense Key : Medium Error [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
        72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
        01 38 11 5e
sd 1:0:0:0: [sdb]
Add. Sense: Unrecovered read error - auto reallocate failed
sd 1:0:0:0: [sdb] CDB:
Read(16): 88 00 00 00 00 00 01 38 11 58 00 00 00 08 00 00
end_request: I/O error, dev sdb, sector 20451678
EXT3-fs error (device md2): ext3_get_inode_loc: unable to read inode block - inode=637807, block=2555947
ata2: EH complete

You use Software RAID? Try to check `dmesg` and check `cat /proc/mdstat` output. — Alexander Tolkachev, Jun 23 '17 at 15:45
I have edited in the output of dmesg command. Is it showing sdb as the faulty drive? (sda is the one with 678 reallocated sectors) — Elite_Dragon1337, Jun 23 '17 at 16:24

score 0 · Answer 1 · answered Jun 23 '17 at 21:49

Your disk sdb is dying. The "Unrecovered read error - auto reallocate failed" message means the drive could not reallocate the bad blocks, so it could fail at any time. In our experience, disks that show these errors in dmesg fail soon afterwards (within one month, two at most). Also, if Reallocated_Sector_Ct in SMART keeps growing on a disk, that disk is in a pre-fail state and you should plan for its replacement.
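To see at a glance which RAID member the kernel is complaining about, you can count the "I/O error, dev ..." lines per device. Below is a minimal sketch in Python, assuming the log lines look like the ones in the question. The function name `count_io_errors` and the embedded sample are illustrative only, and the regular expression is an assumption that may need adjusting for other kernel versions, which can print a different prefix before "I/O error".

```python
import re
from collections import Counter

# Matches lines such as "end_request: I/O error, dev sdb, sector 20451678".
# Only the part after the prefix is matched, because the prefix is assumed
# to vary between kernel versions.
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")


def count_io_errors(log_text):
    """Return a Counter mapping device name -> number of I/O error lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = IO_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from the dmesg output in the question.
sample = """\
end_request: I/O error, dev sdb, sector 20451678
EXT3-fs error (device md2): ext3_get_inode_loc: unable to read inode block - inode=637820, block=2555947
end_request: I/O error, dev sdb, sector 20451678
"""

errors = count_io_errors(sample)
for device, count in errors.most_common():
    print(f"{device}: {count} I/O error line(s)")
```

In the log from the question every such line names sdb, which matches the ata2.00 media errors. A count like this only shows which device the kernel reported read failures on. It does not replace checking /proc/mdstat and the SMART attributes of both drives.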
