Should I be concerned by frequent RAID warning message (unexpected sense - command aborted)?

Question

I have a Fujitsu host (PRIMERGY RX300 S6) running Windows Server 2008 R2 and Hyper-V (Version 6.1) with two virtual servers (one Windows Server 2008 R2 and one Windows Server 2012). Our 25 employees are continually connected to the virtual servers during the workday and read and write files to shared folders.

The storage controller on the host is RAID Ctrl SAS 6G 5/6 512MB (D2616) by LSI Corp.

Recently, I discovered that the RAID manager displays very frequent warning messages: nearly one every minute, and sometimes up to 15 or 20 a minute.

Each warning message looks like the following:

-------
Event: Warning
Date: Mar 18, 2015, 1:04:49 PM
Source: TOSHIBA MBF2600RC (1:0)
ID: 10909
Event: Adapter FTS RAID Ctrl SAS 6G 5/6 512MB (D2616) (0): Unexpected sense: 
     Disk (1:0), CDB:28 00 1B 02 B5 80 00 00 80 00, Sense:(command aborted)72 0B 4B 04 00 00 00 20 80 1E 00 28 52 08 01 00 50 03 00 57 00 F3 3F 40 50 06 05 B0 00 02 72 BF 00 01 0C 00 00 00 00 00 
------

Unfortunately, I have not been able to find out when this warning message started to occur.

The reason I am somewhat concerned about the warnings, apart from them just looking strange to me, is that Backup Exec has suddenly begun to take 3-4 hours longer than usual to complete and now takes around 22-23 hours. Comparing job properties, I can see that the Backup Exec job rate for this particular server is down from around 800 MB/min to 550 MB/min.

My hardware provider has informed me that the message is merely informational, and that we should probably have the server replaced. It is 3.5 years old and I guess we should have it replaced within a year, but I would still like to get to the bottom of this matter.

score 2 · Answer 1 · answered Apr 19 '15 at 20:50


The sense data decodes to Bh/4Bh/04h (sense key/ASC/ASCQ) = ABORTED_COMMAND/NAK_RECEIVED. I wrote my own tool to decode these and to give a basic assessment; it is at http://scsi.ev-en.org/

These errors indicate that you have a bad link somewhere. Most often it is a bad cable, but it can also be a bad port on either side (drive or slot).
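For anyone who wants to check a sense buffer by hand, here is a minimal Python sketch of the header decode. It assumes descriptor-format sense data (response code 72h or 73h), where byte 1 carries the sense key, byte 2 the ASC and byte 3 the ASCQ. The function name `decode_sense` and the two tiny lookup tables are made up for this illustration and only cover the codes seen in this thread; this is not how the tool linked above is implemented.

```python
# Sketch only: decodes the first four bytes of descriptor-format sense data.
# The tables are deliberately incomplete and cover just this thread's codes.
SENSE_KEYS = {0x0B: "ABORTED COMMAND"}
ASC_ASCQ = {(0x4B, 0x04): "NAK RECEIVED"}


def decode_sense(hex_string):
    """Return (sense_key, asc, ascq, key_name, asc_name) for a hex dump."""
    data = bytes.fromhex(hex_string)
    if len(data) < 4:
        raise ValueError("need at least 4 bytes of sense data")
    response_code = data[0] & 0x7F
    if response_code not in (0x72, 0x73):
        raise ValueError("not descriptor-format sense data")
    sense_key = data[1] & 0x0F  # low nibble of byte 1
    asc, ascq = data[2], data[3]
    key_name = SENSE_KEYS.get(sense_key, "unknown")
    asc_name = ASC_ASCQ.get((asc, ascq), "unknown")
    return sense_key, asc, ascq, key_name, asc_name


if __name__ == "__main__":
    # First eight bytes of the sense data from the warning in the question.
    key, asc, ascq, key_name, asc_name = decode_sense("72 0B 4B 04 00 00 00 20")
    print(f"{key:X}h/{asc:02X}h/{ascq:02X}h = {key_name} / {asc_name}")
```

Codes missing from the tables come back as "unknown", so the sketch shows where the three values live rather than replacing a full ASC/ASCQ list.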

Baruch Even


Thanks. You and your tool are awesome! I will try to replace the cable. – user3225217 Apr 20 '15 at 22:24

score 0 · Answer 2 · answered Mar 18 '15 at 14:11


Yes, you should be concerned. Not extremely concerned, but investigate it and - if necessary - replace some parts.

SCSI errors are usually caused by driver or firmware problems, or by hardware faults.

Refer to: http://en.wikipedia.org/wiki/Key_Code_Qualifier

At minimum though, a SCSI error means 'something went wrong'. This may only be a minor problem, but a frequently occurring minor problem is a rather bigger problem, and means that there's something deeper going wrong.
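As part of investigating, it can help to see which command was aborted. The CDB in the question's warning starts with 28h, which is the READ(10) opcode, so the controller was reading from the disk when the command was aborted. A minimal Python sketch of that decode follows; it assumes the standard 10-byte READ(10) layout (LBA in bytes 2-5, transfer length in bytes 7-8, both big-endian), and the function name `decode_read10` is made up for this illustration.

```python
# Sketch only: pulls the LBA and transfer length out of a READ(10) CDB.
READ_10 = 0x28


def decode_read10(hex_string):
    """Return (lba, transfer_length_in_blocks) for a READ(10) hex dump."""
    cdb = bytes.fromhex(hex_string)
    if len(cdb) != 10 or cdb[0] != READ_10:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # number of blocks requested
    return lba, blocks


if __name__ == "__main__":
    # CDB copied from the warning message in the question.
    lba, blocks = decode_read10("28 00 1B 02 B5 80 00 00 80 00")
    print(f"READ(10) at LBA 0x{lba:08X}, {blocks} blocks")
```

For the CDB in the question this gives a read of 128 blocks. Comparing the LBAs across many warnings is one way to tell whether the aborts cluster in one region of the disk or are spread out, and aborts spread across the disk would fit a link problem better than a localized media problem.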

Sobrique


I'm still having a look - these might well be manufacturer specific. – Sobrique Mar 18 '15 at 14:22
Thanks. I also can't find an interpretation of the exact code anywhere, so this is basically my way of investigating it :) My worry is exactly that the problem is slowing me down, which is pretty bad by itself. I should note that Device Manager says that the driver software for the storage controller is up to date, and our hardware supplier has informed me that the firmware has been updated recently. Also, every once in a while the warning message says "command aborted, no additional sense information" instead of just the usual "command aborted". – user3225217 Mar 18 '15 at 14:30
