I have a Scientific Linux 6.5 server with a RAID built from SSDs. I see several errors in the dmesg output, but the RAID controller utilities (LSI controller) report no alerts or errors.

The server is a Supermicro D20-4x-M4, connected directly over InfiniBand to a storage server that holds the SSD RAID. The controller is an LSI MegaRAID SAS 9286CV-8e, and the disks are SAMSUNG MZ7WD120.

The errors in the dmesg output are the following:

# dmesg  
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
__ratelimit: 19 callbacks suppressed
sd 1:0:0:0: [sdb] Unhandled error code
sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
__ratelimit: 19 callbacks suppressed
__ratelimit: 19 callbacks suppressed
Buffer I/O error on device sdb, logical block 0
sd 1:0:0:0: [sdb] Unhandled error code
sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Buffer I/O error on device sdb, logical block 0
sd 1:0:0:0: [sdb] Unhandled error code
sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Buffer I/O error on device sdb, logical block 0
sd 1:0:0:0: [sdb] Unhandled error code
sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Buffer I/O error on device sdb, logical block 0
sd 1:0:0:0: [sdb] Unhandled error code
sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Buffer I/O error on device sdb, logical block 0
sd 1:0:0:0: [sdb] Unhandled error code
sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Buffer I/O error on device sdb, logical block 0
sd 1:0:0:0: [sdb] Unhandled error code
sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Buffer I/O error on device sdb, logical block 0
sd 1:0:0:0: [sdb] Unhandled error code
sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Buffer I/O error on device sdb, logical block 0
sd 1:0:0:0: [sdb] Unhandled error code
sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Buffer I/O error on device sdb, logical block 0
sd 1:0:0:0: [sdb] Unhandled error code
sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 08 00 00 08 00
Buffer I/O error on device sdb, logical block 1
end_request: I/O error, dev sdb, sector 8
end_request: I/O error, dev sdb, sector 8
end_request: I/O error, dev sdb, sector 8
end_request: I/O error, dev sdb, sector 8
end_request: I/O error, dev sdb, sector 8
end_request: I/O error, dev sdb, sector 8
end_request: I/O error, dev sdb, sector 8
end_request: I/O error, dev sdb, sector 0
end_request: I/O error, dev sdb, sector 0
end_request: I/O error, dev sdb, sector 0
end_request: I/O error, dev sdb, sector 234441640
end_request: I/O error, dev sdb, sector 0
end_request: I/O error, dev sdb, sector 0
end_request: I/O error, dev sdb, sector 0
end_request: I/O error, dev sdb, sector 0
end_request: I/O error, dev sdb, sector 0
end_request: I/O error, dev sdb, sector 0
end_request: I/O error, dev sdb, sector 234441640
end_request: I/O error, dev sdb, sector 0
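To read the CDB lines above, here is a minimal Python sketch that decodes a Read(10) command into its starting block and length. It assumes the standard 10-byte Read(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8) and a 512-byte sector size, which is my assumption and not something the log states. The function name `decode_read10` is made up for illustration.

```python
def decode_read10(cdb_hex):
    """Return (lba, block_count) from a Read(10) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte Read(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return lba, blocks


SECTOR_BYTES = 512  # assumption: the log does not state the sector size

# The two distinct CDBs that appear in the dmesg output above.
for cdb in ("28 00 00 00 00 00 00 00 08 00",
            "28 00 00 00 00 08 00 00 08 00"):
    lba, blocks = decode_read10(cdb)
    print(f"LBA {lba}, {blocks} blocks ({blocks * SECTOR_BYTES} bytes)")
```

Under those assumptions, both commands are 8-block reads, starting at LBA 0 and LBA 8, which matches the "sector 0" and "sector 8" lines. Sector 234441640 at 512 bytes per sector is roughly 120 GB into the device, so the failing reads hit both the start and the far end of a 120 GB disk. That may suggest the whole device is unreachable rather than one bad region, but I cannot confirm it from the log alone.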

I also tried:

smartctl /dev/sdb -a -T permissive

with the following result:

smartctl 5.43 2012-06-30 r3573 [x86_64-linux-2.6.32-431.20.3.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

Vendor:               /1:0:0:0
Product:
User Capacity:        600,332,565,813,390,450 bytes [600 PB]
Logical block size:   774843950 bytes
>> Terminate command early due to bad response to IEC mode page

Error Counter logging not supported
Device does not support Self Test logging


# lsscsi
[0:0:0:0]    disk    ATA      SAMSUNG MZ7WD120 DXM8  /dev/sda
[1:0:0:0]    disk    ATA      SAMSUNG MZ7WD120 DXM8  /dev/sdb
[6:0:32:0]   enclosu LSI      SAS2X36          0e0b  -
[6:0:33:0]   enclosu LSI      SAS2X36          0e0b  -
[6:2:0:0]    disk    LSI      MR9286CV-8e      3.40  /dev/sdc

Thanks in advance.

jmlero
  • What type of controller are you using? What make/model of server is in use here? Which SSDs are these? Can you post the output of `lsscsi`? – ewwhite Aug 29 '14 at 12:11
  • It doesn't look like the Samsung SSDs are connected to your LSI controller. – ewwhite Aug 29 '14 at 19:19
  • @ewwhite you are right, the disks that are failing are not connected to the LSI controller. Any suggestions or ideas for solving this problem? – jmlero Sep 02 '14 at 08:36
  • If the disks are failing, please replace them. – ewwhite Sep 02 '14 at 11:54
  • @ewwhite it is not possible to shut down the machine at the moment. I was trying to find a way to fix this error without shutting the machine down. – jmlero Sep 02 '14 at 12:56
