On an old HP SE316M1R2 (the same machine as the much more common ProLiant DL160 G6) I have a 4x450 GB RAID 5 array handled by the stock P410 Smart Array controller, firmware version 8.40.30.00. The drives are genuine 10k SAS 450 GB 2.5-inch disks.

One of the drives (Bay 1) is marked as "predictive failure" so I purchased a spare one.

My first try was to shut down, remove the old drive, put the new one in its place, and power on. The system did not find the boot drive. I put the failing drive back in the first bay and the server booted again.

My second try was with the failing drive in place (bay 1) and the new one in bay 7. The drive in bay 7 was recognized as a healthy single-drive RAID 0 array. I deleted it from the logical drives. In the controller's firmware utility I could not find any way to mark the bay 1 drive as failed, set the bay 7 drive as its replacement, or set the bay 7 drive as a hot spare, not even after rebooting once the old logical configuration had been deleted from the bay 7 drive.

As a last resort I got into the controller's CLI, but even after typing help I could not work out how to solve this. No command seems capable of removing, adding or swapping drives in an existing array.

I can't believe I should delete the volume and recreate it just to get rid of the failing drive. I must be missing something, so:

What is the best practice to swap drives in this scenario?

Note: VMware ESXi 5.5 is installed in this machine.

Edit: I have probably figured out what happened. On the first try, the drive newly inserted in bay 1 already carried its own logical drive configuration. That is why the array was not recognized and the system did not boot. On the second try I wiped that configuration. Now that the drive is blank, is it possible that if I repeat the first try the controller will read the array configuration from the other drives, boot, and start a background array rebuild? I will update again when I am able to schedule another maintenance window for this machine.

Marco
  • Do you have hpssacli or hpacucli installed on the hypervisor? You can clarify the issue by adding the output of `ctrl all show config` from either of these tools. A hot spare can be assigned with `ctrl slot=X array all add spares=I:Y:Z`, where X is your P410's slot and I:Y:Z is the drive address in port:box:slot notation, as shown in the output of the show config command. However, if you get a boot failure with one drive removed, I suspect your RAID configuration is somewhat more complicated than described. – Peter Zhabin Oct 30 '19 at 20:10
  • I don't think those utilities are installed because it runs a VMware image and not an HP one. BTW I don't have remote access to that system. Thanks for pointing me in the right direction. I will check this out and I will confirm that your comment can be turned into an answer. – Marco Oct 31 '19 at 11:12
  • BTW, as an alternative you can boot the server from the SmartStart DVD and do the same with the Array Configuration Utility from there. That gives you a nice GUI and there is no need to alter your ESXi configuration. – Peter Zhabin Oct 31 '19 at 11:31
  • Great. Unfortunately I will have to try the downloads for the DL160 G6 (if still available), because SE316M1R2 machines are, [according to HPE forums](https://trackstore.elated-themes.com/), "... special servers, for specific use in specific companies. The drivers and firmware is not available online for these products." I already tried BIOS updates from the DL160 G6 on this server in the past and they failed. Wish me luck! :) – Marco Oct 31 '19 at 11:43
  • I updated the question with relevant thoughts right from my pillow. – Marco Nov 01 '19 at 08:21
  • Yeah, you're right: the P410 silently imports a foreign configuration, so if your drive was marked as a single-drive RAID 0 it became volume 0, hence no boot. If you pull the failing drive again, the array should rebuild. However, I'd still recommend installing the 5.5 offline bundle and reviewing the configuration, just to be on the safe side. – Peter Zhabin Nov 01 '19 at 08:46
  • I will follow your advice and install the utilities, indeed. But figuring out why something goes well or fails is the most fun part of the job. :) – Marco Nov 01 '19 at 08:48
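
The two commands mentioned in the comments above can be wrapped in a small script. This is only a sketch under assumptions that do not come from the thread: the binary path `/opt/hp/hpssacli/bin/hpssacli` (where the ESXi VIB usually puts the tool), controller slot 0, and the drive address `1I:1:7` are placeholders, and the helper names `run`, `show_config` and `add_spare` are made up for illustration. It defaults to a dry run that only prints what it would execute.

```shell
#!/bin/sh
# Sketch only. Defaults to a dry run that prints each command instead of
# executing it, so nothing touches the controller until DRY_RUN=0 is set.
# Assumed values: the binary path and controller slot 0.
HPSSACLI="${HPSSACLI:-/opt/hp/hpssacli/bin/hpssacli}"
SLOT="${SLOT:-0}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# List arrays, logical drives and physical drives on every controller.
show_config() {
    run "$HPSSACLI" ctrl all show config
}

# Assign one physical drive as a spare for all arrays on the controller
# in $SLOT. $1 is the port:box:bay address printed by show_config.
add_spare() {
    run "$HPSSACLI" ctrl "slot=$SLOT" array all add "spares=$1"
}
```

After sourcing the script, `show_config` followed by `add_spare 1I:1:7` prints the two command lines. Setting `DRY_RUN=0` executes them, which should only be done once the slot and drive address have been checked against the real `show config` output.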

1 Answer

On Smart Array controllers the simplest way is to hot-swap the drive while the server is online; this starts the rebuild immediately. You can monitor the rebuild with the hpssacli/hpacucli tool.

You can install these tools on ESXi; they are available as a VIB on the HPE site.

Be aware that it is common for a RAID 5 array to hit a second failed drive during the rebuild and lose its data. I hope you have backups.
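
As a rough illustration of the monitoring step, the sketch below polls the logical and physical drive status a fixed number of times. The binary path, the controller slot (0), and the helper names `run`, `ld_status`, `pd_status` and `watch_rebuild` are assumptions for the example, not something confirmed for this server. It prints the commands by default and only executes them with `DRY_RUN=0`.

```shell
#!/bin/sh
# Sketch only. Dry run by default: commands are printed, not executed.
# Assumed values: the binary path and controller slot 0.
HPSSACLI="${HPSSACLI:-/opt/hp/hpssacli/bin/hpssacli}"
SLOT="${SLOT:-0}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Status of every logical drive on the controller.
ld_status() {
    run "$HPSSACLI" ctrl "slot=$SLOT" ld all show status
}

# Status of every physical drive on the controller.
pd_status() {
    run "$HPSSACLI" ctrl "slot=$SLOT" pd all show status
}

# Poll both views $1 times (default 3), sleeping $2 seconds (default 60)
# between rounds.
watch_rebuild() {
    rounds="${1:-3}"
    pause="${2:-60}"
    i=0
    while [ "$i" -lt "$rounds" ]; do
        ld_status
        pd_status
        i=$((i + 1))
        if [ "$i" -lt "$rounds" ]; then
            sleep "$pause"
        fi
    done
}
```

For example, `DRY_RUN=0 watch_rebuild 10 300` would query the controller ten times at five-minute intervals. The fixed round count keeps the loop from running unattended forever on the hypervisor.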

Weisskopf
  • I have a couple of questions regarding your answer, which to be honest I find quite brief. 1. How can I be sure hot swap is supported by both the drives and the controller? As I said, this is an SE316M1R2, and documentation for those Special Editions has never been public on HPE sites. 2. What would happen if a non-blank drive were used in the swap, as I did on my first try? Thanks a lot for your contribution. – Marco Nov 24 '19 at 07:11
  • 1. There is no need to look for specific SE documentation if you have a regular P410 Smart Array adapter; it should support hot swap. As for drives and backplanes, I have never had any problems with hot swap on full hardware Smart Array controllers (not B120/B140), even with cheap consumer SATA disks. 2. If you hot-swap any disk in an array, the rebuild should start immediately and all the data on the swapped-in disk will be erased. A full rebuild will start even if you take out a healthy drive and put it back shortly afterwards. – Weisskopf Nov 25 '19 at 13:40