I have a problem with one of my Dell PowerEdge R210 servers. The machine runs CentOS 6, and today the system logs started reporting that the hard drive is failing:
Jan 6 03:20:12 centos6 kernel: LSI Debug log info 31080000 for channel 0 id 0
Jan 6 03:20:12 centos6 kernel: sd 0:1:0:0: [sda] Unhandled sense code
Jan 6 03:20:12 centos6 kernel: sd 0:1:0:0: [sda] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
Jan 6 03:20:12 centos6 kernel: sd 0:1:0:0: [sda] Sense Key : Medium Error [current]
Jan 6 03:20:12 centos6 kernel: Info fld=0x21a9055
Jan 6 03:20:12 centos6 kernel: sd 0:1:0:0: [sda] Add. Sense: Unrecovered read error
Jan 6 03:20:12 centos6 kernel: sd 0:1:0:0: [sda] CDB: Read(10): 28 00 02 1a 90 20 00 00 38 00
Jan 6 03:22:17 centos6 kernel: mptbase: ioc0: LogInfo(0x31080000): Originator={PL}, Code={SATA NCQ Fail All Commands After Error}, SubCode(0x0000) cb_idx mptscsih_io_done
Jan 6 03:22:17 centos6 kernel: LSI Debug log info 31080000 for channel 0 id 0
Jan 6 03:22:17 centos6 kernel: mptbase: ioc0: LogInfo(0x31080000): Originator={PL}, Code={SATA NCQ Fail All Commands After Error}, SubCode(0x0000) cb_idx mptscsih_io_done
Jan 6 03:22:17 centos6 kernel: LSI Debug log info 31080000 for channel 0 id 0
Jan 6 03:22:17 centos6 kernel: mptbase: ioc0: LogInfo(0x31080000): Originator={PL}, Code={SATA NCQ Fail All Commands After Error}, SubCode(0x0000) cb_idx mptscsih_io_done
Jan 6 03:22:17 centos6 kernel: LSI Debug log info 31080000 for channel 0 id 0
Jan 6 03:22:17 centos6 kernel: mptbase: ioc0: LogInfo(0x31080000): Originator={PL}, Code={SATA NCQ Fail All Commands After Error}, SubCode(0x0000) cb_idx mptscsih_io_done
Jan 6 03:22:17 centos6 kernel: LSI Debug log info 31080000 for channel 0 id 0
Jan 6 03:22:17 centos6 kernel: mptbase: ioc0: LogInfo(0x31080000): Originator={PL}, Code={SATA NCQ Fail All Commands After Error}, SubCode(0x0000) cb_idx mptscsih_io_done
Jan 6 03:22:17 centos6 kernel: LSI Debug log info 31080000 for channel 0 id 0
Jan 6 03:22:17 centos6 kernel: sd 0:1:0:0: [sda] Unhandled sense code
Jan 6 03:22:17 centos6 kernel: sd 0:1:0:0: [sda] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
Jan 6 03:22:17 centos6 kernel: sd 0:1:0:0: [sda] Sense Key : Medium Error [current]
Jan 6 03:22:17 centos6 kernel: Info fld=0x21a7d89
Jan 6 03:22:17 centos6 kernel: sd 0:1:0:0: [sda] Add. Sense: Unrecovered read error
Jan 6 03:22:17 centos6 kernel: sd 0:1:0:0: [sda] CDB: Read(10): 28 00 02 1a 7d 80 00 00 18 00
Jan 6 03:22:19 centos6 kernel: sd 0:1:0:0: [sda] Unhandled sense code
Jan 6 03:22:19 centos6 kernel: sd 0:1:0:0: [sda] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
Jan 6 03:22:19 centos6 kernel: sd 0:1:0:0: [sda] Sense Key : Medium Error [current]
Jan 6 03:22:19 centos6 kernel: Info fld=0x21a7dc0
Jan 6 03:22:19 centos6 kernel: sd 0:1:0:0: [sda] Add. Sense: Unrecovered read error
Jan 6 03:22:19 centos6 kernel: sd 0:1:0:0: [sda] CDB: Read(10): 28 00 02 1a 7d c0 00 00 80 00
Jan 6 03:28:05 centos6 kernel: sd 0:1:0:0: [sda] Unhandled sense code
Jan 6 03:28:05 centos6 kernel: sd 0:1:0:0: [sda] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
Jan 6 03:28:05 centos6 kernel: sd 0:1:0:0: [sda] Sense Key : Medium Error [current]
Jan 6 03:28:05 centos6 kernel: Info fld=0x21a7d88
Jan 6 03:28:05 centos6 kernel: sd 0:1:0:0: [sda] Add. Sense: Unrecovered read error
Jan 6 03:28:05 centos6 kernel: sd 0:1:0:0: [sda] CDB: Read(10): 28 00 02 1a 7d 88 00 00 08 00
Jan 6 03:28:09 centos6 kernel: sd 0:1:0:0: [sda] Unhandled sense code
Jan 6 03:28:09 centos6 kernel: sd 0:1:0:0: [sda] Result: hostbyte=invalid driverbyte=DRIVER_SENSE
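For reference, the failing addresses can be read straight out of the log entries above. This is only a minimal sketch in Python, assuming the standard Read(10) layout (byte 0 is opcode 0x28, bytes 2-5 are the big-endian starting LBA, bytes 7-8 are the transfer length in blocks) and assuming that "Info fld" is the LBA of the block that could not be read. The helper name decode_read10 is made up for illustration. Note that these LBAs refer to sda as the controller presents it, so if a RAID volume is configured they may not map directly to one physical disk.

```python
def decode_read10(cdb_hex):
    """Decode a SCSI Read(10) CDB given as space-separated hex bytes.

    Returns (starting LBA, transfer length in blocks).
    """
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a Read(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    length = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: number of blocks
    return lba, length


# (CDB, Info fld) pairs copied from the kernel log
samples = [
    ("28 00 02 1a 90 20 00 00 38 00", 0x21A9055),
    ("28 00 02 1a 7d 80 00 00 18 00", 0x21A7D89),
    ("28 00 02 1a 7d c0 00 00 80 00", 0x21A7DC0),
    ("28 00 02 1a 7d 88 00 00 08 00", 0x21A7D88),
]

for cdb_hex, info in samples:
    lba, length = decode_read10(cdb_hex)
    # The reported bad block should fall inside the range that was requested.
    inside = lba <= info < lba + length
    print(f"read LBA {lba:#x} (+{length} blocks), bad block {info:#x}, in range: {inside}")
```

All four reads target the same small region of the device, around LBA 0x21a7d80 to 0x21a9058, which suggests a cluster of unreadable sectors rather than random errors.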
I assume this machine has a RAID controller, but I don't know what type of RAID is configured (if any).
Output from lspci:
01:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
So my question is: is there a way to diagnose this problem from the Linux command line, without restarting the machine? From the OS I can see only the logical drive, not the physical hard drives behind the RAID. Normally that is fine, but now I want to check whether a RAID is actually configured, which hard drives are its members, and which one is failing.
EDIT1: At the moment I only have SSH access to this machine, which is why I want to know whether this problem can be diagnosed over SSH.