
I had a 4-disk RAID 10 configuration (as 2 logical arrays) on one of my servers and noticed that the server was down. When I rebooted it, one of the disks was missing.

I talked with the datacenter and they replaced the faulty disk because it was completely dead. I was hoping that the RAID card would accept the disk and rebuild the array automatically, but that didn't happen. I checked whether the automatic failover feature was on (it was) and have already tried to initialise the disk, but still no luck.

I am not sure if I am doing something wrong, so I need some advice. What is the process, and how can I check whether the problem is with the RAID array or with the disk? Right now I can't rebuild the array with either the Adaptec card utility or the arcconf tool.

before

root@rescue ~ # arcconf GETCONFIG 1 LD
Controllers found: 1
----------------------------------------------------------------------
Logical device information
----------------------------------------------------------------------
Logical device number 0
    Logical device name                      : ESXi
    RAID level                               : 10
    Status of logical device                 : Failed
    Size                                     : 1494016 MB
    Stripe-unit size                         : 256 KB
    Read-cache mode                          : Enabled
    MaxCache preferred read cache setting    : Disabled
    MaxCache read cache setting              : Disabled
    Write-cache mode                         : Disabled (write-through)
    Write-cache setting                      : Disabled (write-through)
    Partitioned                              : Unknown
    Protected by Hot-Spare                   : No
    Bootable                                 : Yes
    Failed stripes                           : No
    Power settings                           : Disabled
    --------------------------------------------------------
    Logical device segment information
    --------------------------------------------------------
    Group 0, Segment 0                       : Missing
    Group 0, Segment 1                       : Present (Controller:1,Connector:0,Device:0)             9VS4DAWW
    Group 1, Segment 0                       : Present (Controller:1,Connector:0,Device:2)             9VS4C646
    Group 1, Segment 1                       : Present (Controller:1,Connector:0,Device:3)             9VS4C6Z6

Logical device number 1
    Logical device name                      :
    RAID level                               : 10
    Status of logical device                 : Failed
    Size                                     : 1362942 MB
    Stripe-unit size                         : 256 KB
    Read-cache mode                          : Enabled
    MaxCache preferred read cache setting    : Disabled
    MaxCache read cache setting              : Disabled
    Write-cache mode                         : Disabled (write-through)
    Write-cache setting                      : Disabled (write-through)
    Partitioned                              : Unknown
    Protected by Hot-Spare                   : No
    Bootable                                 : No
    Failed stripes                           : No
    Power settings                           : Disabled
    --------------------------------------------------------
    Logical device segment information
    --------------------------------------------------------
    Group 0, Segment 0                       : Missing
    Group 0, Segment 1                       : Present (Controller:1,Connector:0,Device:0)             9VS4DAWW
    Group 1, Segment 0                       : Present (Controller:1,Connector:0,Device:2)             9VS4C646
    Group 1, Segment 1                       : Present (Controller:1,Connector:0,Device:3)             9VS4C6Z6


root@rescue ~ # arcconf GETCONFIG 1 PD
Controllers found: 1
----------------------------------------------------------------------
Physical Device information
----------------------------------------------------------------------
     Device #0
        Device is a Hard drive
        State                              : Online
        Supported                          : Yes
        Transfer Speed                     : SATA 3.0 Gb/s
        Reported Channel,Device(T:L)       : 0,0(0:0)
        Reported Location                  : Connector 0, Device 0
        Vendor                             :
        Model                              : ST31500341AS
        Firmware                           : CC1H
        Serial number                      : 9VS4DAWW
        Size                               : 1430799 MB
        Write Cache                        : Enabled (write-back)
        FRU                                : None
        S.M.A.R.T.                         : No
        S.M.A.R.T. warnings                : 0
        Power State                        : Full rpm
        Supported Power States             : Full rpm,Powered off
        SSD                                : No
        MaxCache Capable                   : No
        MaxCache Assigned                  : No
        NCQ status                         : Enabled
     Device #1
        Device is a Hard drive
        State                              : Online
        Supported                          : Yes
        Transfer Speed                     : SATA 3.0 Gb/s
        Reported Channel,Device(T:L)       : 0,2(2:0)
        Reported Location                  : Connector 0, Device 2
        Vendor                             :
        Model                              : ST31500341AS
        Firmware                           : CC1H
        Serial number                      : 9VS4C646
        Size                               : 1430799 MB
        Write Cache                        : Enabled (write-back)
        FRU                                : None
        S.M.A.R.T.                         : No
        S.M.A.R.T. warnings                : 0
        Power State                        : Full rpm
        Supported Power States             : Full rpm,Powered off
        SSD                                : No
        MaxCache Capable                   : No
        MaxCache Assigned                  : No
        NCQ status                         : Enabled
     Device #2
        Device is a Hard drive
        State                              : Online
        Supported                          : Yes
        Transfer Speed                     : SATA 3.0 Gb/s
        Reported Channel,Device(T:L)       : 0,3(3:0)
        Reported Location                  : Connector 0, Device 3
        Vendor                             :
        Model                              : ST31500341AS
        Firmware                           : CC1H
        Serial number                      : 9VS4C6Z6
        Size                               : 1430799 MB
        Write Cache                        : Enabled (write-back)
        FRU                                : None
        S.M.A.R.T.                         : No
        S.M.A.R.T. warnings                : 0
        Power State                        : Full rpm
        Supported Power States             : Full rpm,Powered off
        SSD                                : No
        MaxCache Capable                   : No
        MaxCache Assigned                  : No
        NCQ status                         : Enabled

after (fixed; I pasted the wrong output earlier)

root@rescue ~ # arcconf GETCONFIG 1 LD
Controllers found: 1
----------------------------------------------------------------------
Logical device information
----------------------------------------------------------------------
Logical device number 0
    Logical device name                      : ESXi
    RAID level                               : 10
    Status of logical device                 : Failed
    Size                                     : 1494016 MB
    Stripe-unit size                         : 256 KB
    Read-cache mode                          : Enabled
    MaxCache preferred read cache setting    : Disabled
    MaxCache read cache setting              : Disabled
    Write-cache mode                         : Disabled (write-through)
    Write-cache setting                      : Disabled (write-through)
    Partitioned                              : Unknown
    Protected by Hot-Spare                   : No
    Bootable                                 : Yes
    Failed stripes                           : No
    Power settings                           : Disabled
    --------------------------------------------------------
    Logical device segment information
    --------------------------------------------------------
    Group 0, Segment 0                       : Missing
    Group 0, Segment 1                       : Present (Controller:1,Connector:0,Device:0)             9VS4DAWW
    Group 1, Segment 0                       : Present (Controller:1,Connector:0,Device:2)             9VS4C646
    Group 1, Segment 1                       : Present (Controller:1,Connector:0,Device:3)             9VS4C6Z6

Logical device number 1
    Logical device name                      :
    RAID level                               : 10
    Status of logical device                 : Failed
    Size                                     : 1362942 MB
    Stripe-unit size                         : 256 KB
    Read-cache mode                          : Enabled
    MaxCache preferred read cache setting    : Disabled
    MaxCache read cache setting              : Disabled
    Write-cache mode                         : Disabled (write-through)
    Write-cache setting                      : Disabled (write-through)
    Partitioned                              : Unknown
    Protected by Hot-Spare                   : No
    Bootable                                 : No
    Failed stripes                           : No
    Power settings                           : Disabled
    --------------------------------------------------------
    Logical device segment information
    --------------------------------------------------------
    Group 0, Segment 0                       : Missing
    Group 0, Segment 1                       : Present (Controller:1,Connector:0,Device:0)             9VS4DAWW
    Group 1, Segment 0                       : Present (Controller:1,Connector:0,Device:2)             9VS4C646
    Group 1, Segment 1                       : Present (Controller:1,Connector:0,Device:3)             9VS4C6Z6



Command completed successfully.
root@rescue ~ # arcconf GETCONFIG 1 PD
Controllers found: 1
----------------------------------------------------------------------
Physical Device information
----------------------------------------------------------------------
        Device #0
            Device is a Hard drive
            State                              : Online
            Supported                          : Yes
            Transfer Speed                     : SATA 3.0 Gb/s
            Reported Channel,Device(T:L)       : 0,0(0:0)
            Reported Location                  : Connector 0, Device 0
            Vendor                             :
            Model                              : ST31500341AS
            Firmware                           : CC1H
            Serial number                      : 9VS4DAWW
            Size                               : 1430799 MB
            Write Cache                        : Enabled (write-back)
            FRU                                : None
            S.M.A.R.T.                         : Yes
            S.M.A.R.T. warnings                : 3
            Power State                        : Full rpm
            Supported Power States             : Full rpm,Powered off
            SSD                                : No
            MaxCache Capable                   : No
            MaxCache Assigned                  : No
            NCQ status                         : Enabled
        Device #1
            Device is a Hard drive
            State                              : Ready
            Supported                          : Yes
            Transfer Speed                     : SATA 3.0 Gb/s
            Reported Channel,Device(T:L)       : 0,1(1:0)
            Reported Location                  : Connector 0, Device 1
            Vendor                             :
            Model                              : SAMSUNG HD154UI
            Firmware                           : 1AG01118
            Serial number                      : S1Y6J90B202833
            Size                               : 1430799 MB
            Write Cache                        : Enabled (write-back)
            FRU                                : None
            S.M.A.R.T.                         : No
            S.M.A.R.T. warnings                : 0
            Power State                        : Full rpm
            Supported Power States             : Full rpm,Powered off,Reduced rpm
            SSD                                : No
            MaxCache Capable                   : No
            MaxCache Assigned                  : No
            NCQ status                         : Enabled
        Device #2
            Device is a Hard drive
            State                              : Online
            Supported                          : Yes
            Transfer Speed                     : SATA 3.0 Gb/s
            Reported Channel,Device(T:L)       : 0,2(2:0)
            Reported Location                  : Connector 0, Device 2
            Vendor                             :
            Model                              : ST31500341AS
            Firmware                           : CC1H
            Serial number                      : 9VS4C646
            Size                               : 1430799 MB
            Write Cache                        : Enabled (write-back)
            FRU                                : None
            S.M.A.R.T.                         : No
            S.M.A.R.T. warnings                : 0
            Power State                        : Full rpm
            Supported Power States             : Full rpm,Powered off
            SSD                                : No
            MaxCache Capable                   : No
            MaxCache Assigned                  : No
            NCQ status                         : Enabled
        Device #3
            Device is a Hard drive
            State                              : Online
            Supported                          : Yes
            Transfer Speed                     : SATA 3.0 Gb/s
            Reported Channel,Device(T:L)       : 0,3(3:0)
            Reported Location                  : Connector 0, Device 3
            Vendor                             :
            Model                              : ST31500341AS
            Firmware                           : CC1H
            Serial number                      : 9VS4C6Z6
            Size                               : 1430799 MB
            Write Cache                        : Enabled (write-back)
            FRU                                : None
            S.M.A.R.T.                         : No
            S.M.A.R.T. warnings                : 0
            Power State                        : Full rpm
            Supported Power States             : Full rpm,Powered off
            SSD                                : No
            MaxCache Capable                   : No
            MaxCache Assigned                  : No
            NCQ status                         : Enabled


Command completed successfully.
root@rescue ~ #
Tim
  • My experiences with Adaptec RAID cards have been horrible - dropping entire arrays, not rebuilding after a disk failure, just plain not working... so now I avoid them like they're radioactive. Not sure how helpful that is for you, but it's entirely plausible that you've done nothing wrong, and the problem is with the RAID card instead. – HopelessN00b Dec 01 '14 at 16:30

1 Answer


I'm not particularly familiar with Adaptec, but with most RAID controllers only a disk designated as a hot spare is automatically used to rebuild an array after one of the active drives fails.

Replacing a failed disk with a new one normally does not automatically trigger an array rebuild. That requires administrator input.

A quick glance in the manual indicates you'll need to do something like:

                  -------  Controller # 
                 |  -----  Channel #  : from reported location
                 | |  ---  Device #   : from reported location
                 | | |  -  set status : RBL for rebuild
                 | | | |
 HRCONF SETSTATE 1 0 1 RBL 
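
The Channel # and Device # in the command above come from the "Reported Channel,Device(T:L)" line of the disk that is not part of the array. As a minimal sketch, assuming the `arcconf GETCONFIG 1 PD` output format shown in the question, the listing can be scanned for disks whose state is anything other than Online. The function name `find_non_online_disks` and the shortened sample text are made up for illustration; this only reads saved output and does not call the controller.

```python
import re


def find_non_online_disks(pd_output: str):
    """Return one dict per physical disk whose State is not 'Online'.

    Assumes each device block lists 'State' before
    'Reported Channel,Device(T:L)', as in the output in the question.
    """
    results = []
    current = None
    for line in pd_output.splitlines():
        m = re.match(r"\s*Device #(\d+)", line)
        if m:
            current = {"number": int(m.group(1))}
            continue
        if current is None:
            continue
        # Matches 'State : ...' only; 'Power State' does not start with 'State'.
        m = re.match(r"\s*State\s*:\s*(\S+)", line)
        if m:
            current["state"] = m.group(1)
            continue
        m = re.match(r"\s*Reported Channel,Device\(T:L\)\s*:\s*(\d+),(\d+)", line)
        if m:
            current["channel"] = int(m.group(1))
            current["device"] = int(m.group(2))
            if current.get("state") != "Online":
                results.append(current)
            current = None
    return results


# Shortened excerpt of the PD listing from the question.
sample = """\
        Device #0
            State                              : Online
            Reported Channel,Device(T:L)       : 0,0(0:0)
        Device #1
            State                              : Ready
            Reported Channel,Device(T:L)       : 0,1(1:0)
"""

for disk in find_non_online_disks(sample):
    print(disk["number"], disk["state"], disk["channel"], disk["device"])
```

For the listing in the question this points at the replacement Samsung disk on channel 0, device 1, which matches the `0 1` used in the command above.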
HBruijn
  • Tried the same with ARCCONF, but I get "The candidate spare cannot protect logical device 0, candidate spare too small." for the first LOGICALDRIVE, which is kind of meaningless. I guess it's trying to prepare the disk for the whole of LD#0, which is bigger than a single disk in the array. – Tim Dec 01 '14 at 17:17
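
The sizes reported in the question can be used for a rough check of that guess. This is only a sketch, and it assumes the usual 4-disk RAID 10 layout of two mirrored pairs striped together, so that each disk holds half of every logical device. The constant and function names are made up for illustration; the numbers are copied from the arcconf output above.

```python
# Sizes in MB, copied from the arcconf output in the question.
LOGICAL_DEVICE_SIZES_MB = [1494016, 1362942]  # LD 0 and LD 1
REPLACEMENT_DISK_MB = 1430799                 # Samsung HD154UI as reported

# Assumption: 4 disks arranged as 2 mirrored pairs, striped (RAID 10),
# so each disk stores 1/2 of every logical device.
MIRROR_PAIRS = 2


def per_disk_requirement_mb(ld_sizes, pairs):
    """Space one member disk needs to hold its share of all logical devices."""
    return sum(size // pairs for size in ld_sizes)


needed = per_disk_requirement_mb(LOGICAL_DEVICE_SIZES_MB, MIRROR_PAIRS)
print("needed per disk (MB):", needed)
print("headroom on new disk (MB):", REPLACEMENT_DISK_MB - needed)
print("new disk smaller than whole LD 0:", REPLACEMENT_DISK_MB < LOGICAL_DEVICE_SIZES_MB[0])
```

Under that assumption, a member disk needs 1428479 MB and the replacement reports 1430799 MB, so by these figures it should fit. It is, however, smaller than the full 1494016 MB of LD 0, which is consistent with the guess that the tool is comparing the disk against the whole logical device. The reported sizes are rounded to whole MB, so a smaller difference in actual capacity between the Seagate and Samsung disks would not show up here.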