
We have the following setup in a Supermicro server:

  1. LSI 9400 -> expander -> 10 x HDD
  2. LSI 9500 -> expander -> 2 x NVMe
|------------|                             |-----------|
| LSI 9400   |      |--------------| ----->|  HDD x 10 |
|------------| ---->|  Expander    |       |-----------|
                    |              |
|------------| ---->|              |       |-----------|
| LSI 9500   |      |--------------| ----->| NVMe Intel|
|------------|                         |   |-----------|
                                       |
                                       |   |-----------|
                                       |-->| NVMe Intel|
                                           |-----------|

We have no problem blinking any of the bays hosting HDDs, but blinking the NVMe bays does nothing.

I would like to achieve either of these two solutions:

  1. Optimal solution - blink the bays containing the NVMe drives attached to the tri-mode 9500 controller
  2. Alternate solution - find a link/value/piece of information that will allow me to associate an NVMe drive with a physical port on the LSI 9500 controller. I am thinking of something like "Look in the file /<some_path>/<some_file> and there you will find the ID of the port." More complex associations are also welcome. No problem if there are several values we have to correlate.

Operating system: Rocky Linux, fully under our control; we can do anything on it, no restrictions. Server configuration: it runs ESXi with both controllers in passthrough to the Rocky Linux VM.

So far I have tried the following investigations and experiments.

  1. Try blinking it with ledctl -> no error, no blinking
  2. Try blinking with sg_ses -> no error, no blinking. Here are some commands, with the output trimmed to omit the other disks.

Basically, what I want to know is: if a drive fails, which one do I remove? The answer can be a blinking LED or a command that says "top drive" or something like that.

[root@echo-development ~]# lsscsi -g
[1:0:0:0]    enclosu BROADCOM VirtualSES       03    -          /dev/sg2 
[1:2:0:0]    disk    NVMe     INTEL SSDPE2KX01 01B1  /dev/sdb   /dev/sg3 
[1:2:1:0]    disk    NVMe     INTEL SSDPE2KX02 0131  /dev/sdc   /dev/sg4 

[root@echo-development ~]# sg_ses -vvv --dsn=0 --set=ident /dev/sg2
open /dev/sg2 with flags=0x802
    request sense cmd: 03 00 00 00 fc 00 
      duration=0 ms
    request sense: pass-through requested 252 bytes (data-in) but got 18 bytes
Request Sense near startup detected something:
  Sense key: No Sense, additional: Additional sense: No additional sense information
  ... continue
    Receive diagnostic results command for Configuration (SES) dpage
    Receive diagnostic results cdb: 1c 01 01 ff fc 00 
      duration=0 ms
    Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 60 bytes
    Receive diagnostic results: response:
01 00 00 38 00 00 00 00  11 00 02 24 30 01 62 b2
07 eb 55 80 42 52 4f 41  44 43 4f 4d 56 69 72 74
75 61 6c 53 45 53 00 00  00 00 00 00 30 33 00 00
17 28 00 00 19 08 00 00  00 00 00 00
    Receive diagnostic results command for Enclosure Status (SES) dpage
    Receive diagnostic results cdb: 1c 01 02 ff fc 00 
      duration=0 ms
    Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 208 bytes
    Receive diagnostic results: response:
02 00 00 cc 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
    Receive diagnostic results command for Element Descriptor (SES) dpage
    Receive diagnostic results cdb: 1c 01 07 ff fc 00 
      duration=0 ms
    Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 432 bytes
    Receive diagnostic results: response, first 256 bytes:
07 00 01 ac 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 1c 43 30 2e 30  00 00 00 00 00 00 00 00
00 00 00 00 4e 4f 42 50  4d 47 4d 54 00 00 00 00
00 00 00 1c 43 30 2e 30  00 00 00 00 00 00 00 00
00 00 00 00 4e 4f 42 50  4d 47 4d 54 00 00 00 00
00 00 00 1c 43 30 2e 30  00 00 00 00 00 00 00 00
    Receive diagnostic results command for Additional Element Status (SES-2) dpage
    Receive diagnostic results cdb: 1c 01 0a ff fc 00 
      duration=0 ms
    Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 1448 bytes
    Receive diagnostic results: response, first 256 bytes:
0a 00 05 a4 00 00 00 00  16 22 00 00 01 00 00 04
10 00 00 08 50 00 62 b2  07 eb 55 80 3c d2 e4 a6
23 29 01 00 00 00 00 00  00 00 00 00 96 22 00 01
01 00 00 ff 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
96 22 00 02 01 00 00 ff  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 96 22 00 03  01 00 00 ff 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  16 22 00 04 01 00 00 06
10 00 00 08 50 00 62 b2  07 eb 55 84 3c d2 e4 99
70 1d 01 00 00 00 00 00  00 00 00 00 96 22 00 05
01 00 00 ff 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
96 22 00 06 01 00 00 ff  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
  s_byte=2, s_bit=1, n_bits=1
Applying mask to element status [etc=23] prior to modify then write
    Send diagnostic command page name: Enclosure Control (SES)
    Send diagnostic cdb: 1d 10 00 00 d0 00 
    Send diagnostic parameter list:
02 00 00 cc 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 80 00 02 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
    Send diagnostic timeout: 60 seconds
      duration=0 ms
[root@echo-development ~]# sg_ses -vvv --dsn=6 --set=ident /dev/sg2
open /dev/sg2 with flags=0x802
    request sense cmd: 03 00 00 00 fc 00 
      duration=0 ms
    request sense: pass-through requested 252 bytes (data-in) but got 18 bytes
Request Sense near startup detected something:
  Sense key: No Sense, additional: Additional sense: No additional sense information
  ... continue
    Receive diagnostic results command for Configuration (SES) dpage
    Receive diagnostic results cdb: 1c 01 01 ff fc 00 
      duration=0 ms
    Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 60 bytes
    Receive diagnostic results: response:
01 00 00 38 00 00 00 00  11 00 02 24 30 01 62 b2
07 eb 55 80 42 52 4f 41  44 43 4f 4d 56 69 72 74
75 61 6c 53 45 53 00 00  00 00 00 00 30 33 00 00
17 28 00 00 19 08 00 00  00 00 00 00
    Receive diagnostic results command for Enclosure Status (SES) dpage
    Receive diagnostic results cdb: 1c 01 02 ff fc 00 
      duration=0 ms
    Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 208 bytes
    Receive diagnostic results: response:
02 00 00 cc 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
    Receive diagnostic results command for Element Descriptor (SES) dpage
    Receive diagnostic results cdb: 1c 01 07 ff fc 00 
      duration=0 ms
    Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 432 bytes
    Receive diagnostic results: response, first 256 bytes:
07 00 01 ac 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 1c 43 30 2e 30  00 00 00 00 00 00 00 00
00 00 00 00 4e 4f 42 50  4d 47 4d 54 00 00 00 00
00 00 00 1c 43 30 2e 30  00 00 00 00 00 00 00 00
00 00 00 00 4e 4f 42 50  4d 47 4d 54 00 00 00 00
00 00 00 1c 43 30 2e 30  00 00 00 00 00 00 00 00
    Receive diagnostic results command for Additional Element Status (SES-2) dpage
    Receive diagnostic results cdb: 1c 01 0a ff fc 00 
      duration=0 ms
    Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 1448 bytes
    Receive diagnostic results: response, first 256 bytes:
0a 00 05 a4 00 00 00 00  16 22 00 00 01 00 00 04
10 00 00 08 50 00 62 b2  07 eb 55 80 3c d2 e4 a6
23 29 01 00 00 00 00 00  00 00 00 00 96 22 00 01
01 00 00 ff 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
96 22 00 02 01 00 00 ff  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 96 22 00 03  01 00 00 ff 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  16 22 00 04 01 00 00 06
10 00 00 08 50 00 62 b2  07 eb 55 84 3c d2 e4 99
70 1d 01 00 00 00 00 00  00 00 00 00 96 22 00 05
01 00 00 ff 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
96 22 00 06 01 00 00 ff  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
  s_byte=2, s_bit=1, n_bits=1
Applying mask to element status [etc=23] prior to modify then write
    Send diagnostic command page name: Enclosure Control (SES)
    Send diagnostic cdb: 1d 10 00 00 d0 00 
    Send diagnostic parameter list:
02 00 00 cc 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 80 00 02 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00  01 00 00 00 01 00 00 00
    Send diagnostic timeout: 60 seconds
      duration=0 ms

We used --dsn=0 and --dsn=6 because it seems that end devices are connected to these two ports (output trimmed to the relevant results):

[root@echo-development ~]# sg_ses -j /dev/sg2
  BROADCOM  VirtualSES  03
  Primary enclosure logical identifier (hex): 300162b207eb5580
[0,-1]  Element type: Array device slot
  Enclosure Status:
    Predicted failure=0, Disabled=0, Swap=0, status: Unsupported
    OK=0, Reserved device=0, Hot spare=0, Cons check=0
    In crit array=0, In failed array=0, Rebuild/remap=0, R/R abort=0
    App client bypass A=0, Do not remove=0, Enc bypass A=0, Enc bypass B=0
    Ready to insert=0, RMV=0, Ident=0, Report=0
    App client bypass B=0, Fault sensed=0, Fault reqstd=0, Device off=0
    Bypassed A=0, Bypassed B=0, Dev bypassed A=0, Dev bypassed B=0


[0,0]  Element type: Array device slot
  Enclosure Status:
    Predicted failure=0, Disabled=0, Swap=1, status: Unsupported
    OK=0, Reserved device=0, Hot spare=0, Cons check=0
    In crit array=0, In failed array=0, Rebuild/remap=0, R/R abort=0
    App client bypass A=0, Do not remove=0, Enc bypass A=0, Enc bypass B=0
    Ready to insert=0, RMV=0, Ident=0, Report=0
    App client bypass B=0, Fault sensed=0, Fault reqstd=0, Device off=0
    Bypassed A=0, Bypassed B=0, Dev bypassed A=0, Dev bypassed B=0
  Additional Element Status:
    Transport protocol: SAS
    number of phys: 1, not all phys: 0, device slot number: 4
    phy index: 0
      SAS device type: end device
      initiator port for:
      target port for: SSP
      attached SAS address: 0x500062b207eb5580
      SAS address: 0x3cd2e4dd23290100
      phy identifier: 0x0



[0,4]  Element type: Array device slot
  Enclosure Status:
    Predicted failure=0, Disabled=0, Swap=1, status: Unsupported
    OK=0, Reserved device=0, Hot spare=0, Cons check=0
    In crit array=0, In failed array=0, Rebuild/remap=0, R/R abort=0
    App client bypass A=0, Do not remove=0, Enc bypass A=0, Enc bypass B=0
    Ready to insert=0, RMV=0, Ident=0, Report=0
    App client bypass B=0, Fault sensed=0, Fault reqstd=0, Device off=0
    Bypassed A=0, Bypassed B=0, Dev bypassed A=0, Dev bypassed B=0
  Additional Element Status:
    Transport protocol: SAS
    number of phys: 1, not all phys: 0, device slot number: 6
    phy index: 0
      SAS device type: end device
      initiator port for:
      target port for: SSP
      attached SAS address: 0x500062b207eb5584
      SAS address: 0x3cd2e4a623290100
      phy identifier: 0x0
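
To compare the enclosure's view with what the kernel reports, it may help to boil the join output down to one line per populated slot. This is a minimal sketch, assuming the text layout shown in the output above; the function name `ses_slot_table` is made up for illustration, and other sg_ses versions may format these lines differently:

```shell
# Hypothetical helper: condense `sg_ses -j` text into
# "<element> <device slot number> <SAS address>" lines.
# Assumes the line layout shown in the output above.
ses_slot_table() {
    awk '
        # Element header, e.g. "[0,4]  Element type: Array device slot"
        /^\[[0-9]+,-?[0-9]+\]/ { elem = $1; slot = "?" }
        # "number of phys: 1, not all phys: 0, device slot number: 6"
        /device slot number:/ { slot = $NF }
        # Drive-side address only; "attached SAS address" lines are skipped
        /^[ \t]*SAS address:/ { print elem, slot, $NF }
    '
}

# Usage (reads the join output from stdin):
#   sg_ses -j /dev/sg2 | ses_slot_table
```

Elements without an Additional Element Status section, such as the overall `[0,-1]` entry, produce no line.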

  3. Find the SAS address from the output above in the list of drives. As I understand from the sg_ses documentation and from blog and forum posts, the SAS address reported there (0x3cd2e4a623290100 in our case) should match a drive (NVMe, SSD, HDD, whatever). But the SAS addresses on the NVMe drives are different, and the SAS address reported by the controller cannot be found on any device.
[root@echo-development ~]# cat "/sys/bus/pci/devices/0000:04:00.0/host1/target1:2:1/1:2:1:0/sas_address" 
0x00012923a6e4d25c
  4. Rely on HCTL -> does not work, because the HCTL changes when I remove and reinsert a drive in the bay. It also resets on reboot to 1:2:0:0 and 1:2:1:0.
  5. Associate /sys/bus/pci/devices/0000:04:00.0/host1/target1:2:1/1:2:1:0/sas_device_handle with a port on the controller. -> does not work; it increments every time a device is removed and reinserted.
  6. Try to find any other association between an NVMe drive and the controller port. -> I couldn't find any.
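
For the SAS address comparison above, one thing that might be worth ruling out is a byte-order difference: the value read from sysfs (0x00012923a6e4d25c) and the one reported by sg_ses (0x3cd2e4a623290100) look like near mirror images of each other. Below is a minimal sketch for checking that on other drives. The function names are made up for illustration, the default sysfs root is an assumption, and none of this is documented behaviour of the controller:

```shell
# Hypothetical helper: reverse the byte order of a hex address,
# e.g. 0xaabbcc -> 0xccbbaa. Needs fold, tac and tr (coreutils).
reverse_sas_bytes() {
    printf '0x%s\n' "$(printf '%s\n' "${1#0x}" | fold -w 2 | tac | tr -d '\n')"
}

# Hypothetical helper: print "<H:C:T:L> <sas_address>" for every SCSI
# device that exposes a sas_address attribute. The default root is an
# assumption; pass another directory as the first argument to override it.
# The body runs in a subshell so its variables do not leak.
list_sas_addresses() (
    root=${1:-/sys/class/scsi_device}
    for f in "$root"/*/device/sas_address; do
        [ -r "$f" ] || continue
        hctl=${f#"$root"/}
        printf '%s %s\n' "${hctl%%/*}" "$(cat "$f")"
    done
)

# Usage:
#   reverse_sas_bytes 0x3cd2e4a623290100
#   list_sas_addresses
```

With the values quoted above, `reverse_sas_bytes 0x3cd2e4a623290100` prints 0x00012923a6e4d23c, which differs from the sysfs value only in the last byte (3c versus 5c), so the two may well refer to the same drive.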

Please let me know if there is anything else I could try or if you need any further information.

Patkos Csaba

1 Answer


The storcli utility from Broadcom now works with controllers in IT/HBA mode and it can identify the relation between device and physical port.
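
A sketch of how that might look on the command line. It assumes storcli is installed at the path below and that the HBA is controller 0; the enclosure and slot IDs are placeholders to be read from the `show` output, the wrapper names are made up for illustration, and whether `locate` actually lights the bay LED on this backplane is untested here:

```shell
# Assumed install path; override with STORCLI=/path/to/storcli64.
STORCLI=${STORCLI:-/opt/MegaRAID/storcli/storcli64}

# Hypothetical wrapper: run storcli, or only print the command
# when DRY_RUN=1 (handy for checking what would be executed).
run_storcli() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$STORCLI $*"
    else
        "$STORCLI" "$@"
    fi
}

# List every drive on a controller with its enclosure and slot.
# usage: show_drives <controller>
show_drives() {
    run_storcli "/c$1/eall/sall" show
}

# Start or stop the locate LED for one enclosure/slot.
# usage: locate_drive <controller> <eid> <slot> start|stop
locate_drive() {
    case "$4" in
        start|stop) run_storcli "/c$1/e$2/s$3" "$4" locate ;;
        *) echo "action must be start or stop" >&2; return 1 ;;
    esac
}

# Usage:
#   show_drives 0
#   locate_drive 0 <eid> <slot> start
```

If the drives are listed without an enclosure ID, the `/c0/sall` form may be needed instead of `/c0/eall/sall`.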

Patkos Csaba