Question:
Is there a way to force this to rebuild? I'm also toying with the idea of turning the system off and attempting to rebuild it in the 3ware controller BIOS. If I turn the system off in its present state, will it come back up, or will the arrays be broken and not bootable? At present the system is up and working.
Details:
I came in to find one RAID-1 sub-array bad (degraded) and the other three initializing. I replaced the bad disk and attempted to rebuild using these commands:
./tw_cli /c3/p1 remove
./tw_cli /c3 rescan
./tw_cli maint rebuild c3 u0 p1
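(As an aside: if I'm reading the tw_cli guide right, the per-unit form below should be equivalent to the maint rebuild command above; p1 is the port with the replaced disk.)
./tw_cli /c3/u0 start rebuild disk=p1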
The RAID array says it's rebuilding, but the percentage has not moved since I issued the rebuild command.
~ # ./tw_cli /c3/u0 show
Unit     UnitType  Status         %RCmpl  %V/I/M  Port  Stripe  Size(GB)
------------------------------------------------------------------------
u0       RAID-10   REBUILDING     29%     -       -     256K    1862.61
u0-0     RAID-1    REBUILDING     0%      -       -     -       -
u0-0-0   DISK      OK             -       -       p0    -       465.651
u0-0-1   DISK      DEGRADED       -       -       p1    -       465.651
u0-1     RAID-1    INITIALIZING   62%     -       -     -       -
u0-1-0   DISK      OK             -       -       p2    -       465.651
u0-1-1   DISK      OK             -       -       p3    -       465.651
u0-2     RAID-1    INITIALIZING   40%     -       -     -       -
u0-2-0   DISK      OK             -       -       p4    -       465.651
u0-2-1   DISK      OK             -       -       p5    -       465.651
u0-3     RAID-1    INITIALIZING   16%     -       -     -       -
u0-3-0   DISK      OK             -       -       p6    -       465.651
u0-3-1   DISK      OK             -       -       p7    -       465.651
u0/v0    Volume    -              -       -       -     -       1862.61
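In case it helps with diagnosis: the controller's AEN/event log should show whether the rebuild actually hit an error and paused; as far as I know it can be dumped with:
~ # ./tw_cli /c3 show alarms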
I've attempted to rebuild the array with the rebuild schedule both enabled and disabled:
~ # ./tw_cli /c3 show rebuild
Rebuild Schedule for Controller /c3
========================================================
Slot  Day  Hour     Duration  Status
--------------------------------------------------------
  1   Sun  12:00am  24 hr(s)  enabled
  2   Mon  12:00am  24 hr(s)  enabled
  3   Tue  12:00am  24 hr(s)  enabled
  4   Wed  12:00am  24 hr(s)  enabled
  5   Thu  12:00am  24 hr(s)  enabled
  6   Fri  12:00am  24 hr(s)  enabled
  7   Sat  12:00am  24 hr(s)  enabled
And I have attempted it with the verify schedule both enabled and disabled.
~ # ./tw_cli /c3 show verify
Verify Schedule for Controller /c3
========================================================
Slot  Day  Hour     Duration  Status
--------------------------------------------------------
  1   Sun  12:00am  24 hr(s)  enabled
  2   Mon  12:00am  24 hr(s)  enabled
  3   Tue  12:00am  24 hr(s)  enabled
  4   Wed  12:00am  24 hr(s)  enabled
  5   Thu  12:00am  24 hr(s)  enabled
  6   Fri  12:00am  24 hr(s)  enabled
  7   Sat  12:00am  24 hr(s)  enabled
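For reference, this is how I believe the two schedules are toggled (going from my reading of the tw_cli documentation, so the exact syntax may differ by firmware version):
~ # ./tw_cli /c3 set rebuild=disable
~ # ./tw_cli /c3 set verify=disable
(and =enable to turn them back on)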
Also note that attempting to set ignoreECC to on errors out:
~ # ./tw_cli /c3/u0 show ignoreECC
/c3/u0 Ignore ECC policy = off
~ # ./tw_cli /c3/u0 set ignoreECC=on
Setting Ignore ECC Policy on /c3/u0 to [on] ... Failed.
(0x09:0x0005): (0x09:0x0005): Input/output error
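(Side note: if the goal is just to let the rebuild skip unreadable sectors, my understanding is that ignoreECC can also be passed directly on the rebuild command instead of being set as a unit policy:)
~ # ./tw_cli /c3/u0 start rebuild disk=p1 ignoreECC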
Edit 3/15/18:
I figured I'd write up what happened in case anyone else finds themselves in a similar situation. I have to say the stuck initialization is the part that really threw me for a loop. I know some RAID cards resync or verify the arrays once a week (or whenever you schedule them to). I believe what happened is this: the controller went to resync and verify the arrays, and one or more of the drives failed during the resync, causing the 'initializing' to stall.
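If you want to check whether the member drives behind a 3ware controller really are failing, smartmontools can query them through the controller; something along these lines (the /dev/twa0 node and the 3ware disk number are whatever matches your system):
smartctl -a -d 3ware,1 /dev/twa0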
I emailed support for this RAID card (dcsg.support@broadcom.com). They looked over the logs and diags and didn't find anything out of the ordinary.
Their suggestion ultimately was: 'Update the firmware. Reboot after the upgrade. It might help getting it out of the paused state.'
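(For reference, I believe the firmware update they meant is done through tw_cli's update command, pointed at the firmware image downloaded for this card; the path below is just a placeholder:)
~ # ./tw_cli /c3 update fw=/path/to/firmware_image.img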
I asked them if it was safe to update the firmware in the 'initializing' state and whether they were sure it would be safe to reboot while it's in this state. They never replied to that email.
Seeing as I trust no one, I backed up all of the data and rebooted the machine. It came back up with two more bad disks (they were on the initializing RAID-1 sub-arrays). Luckily they were all on different RAID-1 mirrors, so I could replace the bad disks. After the reboot the arrays rebuilt and initialized, and everything is now working correctly.
So if you ever see this card stuck at 'initializing', I would back up the data, attempt a reboot, and pray that the bad disks are on different mirrors.
Good luck to all that may read this in the future!