
An old Windows 2003 server runs on a RAID 5 array of four 250 GB disks attached to a 3Ware 9500S-4LP controller. This morning I got two alarms from the controller:

- 0x04:0x0025 Cache flush failed; some data lost: unit=0
- 0x04:0x000A Drive error detected: unit=0, port=1

I'd like to replace the drive, but I don't know the correct procedure. Should I simply turn off the machine, replace the faulty drive, and reboot? Will the controller rebuild the array automatically?

Riccardo

1 Answer


Method 1)

  1. Tell the controller to remove the drive.
  2. Remove the drive from the server.
  3. Put in the new drive.
  4. Tell the controller to rescan.
  5. Flag the new drive as a hot spare.

Once it is flagged as a hot spare, the card should automatically use it to replace the old drive and sync it, rebuilding the array.

Take a look at the User Guide or the CLI guide.
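As a rough sketch of Method 1 from the command line, the small Python helper below only builds and prints the `tw_cli` command lines; it does not run them. The controller number 0 is an assumption, while unit 0 and port 1 come from the alarm text in the question. The `replacement_commands` helper is made up for illustration, and the command syntax follows the 3ware CLI guide as I remember it, so check it against the guide for your firmware before typing anything.

```python
def replacement_commands(controller=0, unit=0, port=1, use_spare=True):
    """Build (but do not run) the tw_cli lines for swapping one failed drive.

    Assumptions: controller 0; unit/port taken from the alarm messages.
    """
    c = f"/c{controller}"
    cmds = [
        f"tw_cli {c} show",            # check which port is reported as failed
        f"tw_cli {c}/p{port} remove",  # step 1: tell the controller to remove the drive
        # steps 2 and 3 are physical: pull the old drive, insert the new one
        f"tw_cli {c} rescan",          # step 4: make the controller detect the new drive
    ]
    if use_spare:
        # step 5: flag the new drive as a hot spare so the card picks it up
        cmds.append(f"tw_cli {c} add type=spare disk={port}")
    else:
        # alternative: start the rebuild onto the new drive explicitly
        cmds.append(f"tw_cli {c}/u{unit} start rebuild disk={port}")
    cmds.append(f"tw_cli {c}/u{unit} show")  # watch the rebuild status
    return cmds


if __name__ == "__main__":
    for line in replacement_commands():
        print(line)
```

Printing the commands first makes it easy to compare them with the CLI guide and to double-check the port number before touching a degraded array.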

Method 2)

Pull the old drive and put in the new one. The RAID card should detect the change and rebuild the array onto the new drive.

David
  • Can this technique be used to enlarge the array? It may be worth using larger disks at this point, say new 500GB disks, so I could replace the disks one by one, waiting for the rebuild to complete after each drive. Once all 4 disks have been replaced, can the controller be instructed to enlarge the array (currently 650GB out of 4x250GB)? – Riccardo Mar 05 '13 at 10:22
  • BTW, the user manual says that RAID 5 "Requires a minimum of three drives". Does this mean the array should keep working seamlessly with 3 drives if I simply remove the defective one? Of course this would mean lower reliability – Riccardo Mar 05 '13 at 13:00
  • Yes, the 3BM interface states that the unit is degraded and that a drive is "NOT IN USE". It states that the unit will work but is no longer fault tolerant. It is not clear how to remove the faulty drive: from the 3BM interface it is possible to delete the unit, but not to remove a drive – Riccardo Mar 05 '13 at 14:48
  • I think I will unplug the broken drive and replace with a new one. Will try to install a larger drive – Riccardo Mar 05 '13 at 15:25
  • I have added the new drive. During the rebuild, another drive is throwing ECC errors! OMG! Is it collapsing? – Riccardo Mar 06 '13 at 13:46
  • While rebuilding the array onto the new drive, the system rebooted and now a second drive is broken. Impossible to boot (2 disks OK, 1 rebuilding, 1 damaged)... What can I say? 2 disks collapsed all of a sudden within 2 days after 7 years of service – Riccardo Mar 06 '13 at 16:41
  • You can add larger drives with no problem. The RAID card will only use as much of each drive as it needs. RAID 5 can survive 1 failed drive. When you replace the failed drive, the card will use the data and parity from the remaining drives to sync the replacement drive. – David Mar 14 '13 at 02:46
  • 7 years is a long time for a drive in a production environment. I've heard of multiple cases (and had one myself) where once one drive fails the others closely follow. This is usually due to the extra load put on the old drives by the raid rebuild process. – David Mar 14 '13 at 02:51
  • If you have a free drive bay, you should think about putting in a hot spare. That way, if a drive fails, the RAID card will automatically use it to replace the dead drive. This lets you swap the dead drive with little impact on the system, and your RAID gets rebuilt faster. – David Mar 14 '13 at 02:53
  • I know the horse has left the barn, but it seems like you're going to be starting from scratch, so I suggest going with RAID 5 + hot spare or RAID 6. RAID 10 is another option, but I would only use it where warranted. – David Mar 14 '13 at 02:56
  • After cooling the system, I could boot the server up again. 3dm2 fires alarms randomly, depending on the disk/area being accessed. I have tried to add a new disk, but the rebuild halts when it reaches 95%. I will definitely start from scratch, this time with RAID 1; RAID 5 needs more than I thought for my requirements – Riccardo Mar 14 '13 at 06:53
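
To illustrate two points from the comments above (mixing in a larger drive does not add capacity, and the replacement drive is synced from the surviving ones), here is a minimal Python sketch. It assumes a simplified single stripe with one parity block; a real controller rotates parity across the drives and works on much larger blocks. The function names are made up for this example.

```python
from functools import reduce


def raid5_usable_gb(disk_sizes_gb):
    """Nominal RAID 5 capacity: every member counts as the smallest disk,
    and one disk's worth of space goes to parity."""
    if len(disk_sizes_gb) < 3:
        raise ValueError("RAID 5 requires a minimum of three drives")
    return (len(disk_sizes_gb) - 1) * min(disk_sizes_gb)


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Capacity: one 500 GB disk among 250 GB disks does not enlarge the array.
print(raid5_usable_gb([250, 250, 250, 250]))  # 750
print(raid5_usable_gb([500, 250, 250, 250]))  # 750

# Rebuild: three data blocks plus parity, as on a 4-disk array.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The drive holding data[1] dies; XOR of the survivors recreates its block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```

The rebuild has to read every surviving drive in full, which is why a second marginal drive tends to show its errors exactly during that window.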