0

I was able to start a broken RAID 1 array, with one available and one failed disk, by issuing the following command.

mdadm --assemble --force /dev/md9 /dev/sda1 /dev/sda2

I was able to copy the VMware image from the disk and repair it with the VMware command

vmkfstools -x repair /path/to/image.vmdk

in order to mount it on an ESXi host. The image was converted from GSX to ESXi format after the repair.

I was able to mount the disk (the /dev/sdb1 partition) in a fresh Ubuntu installation, but when I try to rescue /var/www and issue ls -al, I get the following output.

broken file system

The command fsck -y /dev/sdb1 did not report any failures.

The command fdisk -l /dev/sdb reports the following.

[image: output of fdisk -l /dev/sdb]

What can I do to get the data from /var/www?

UPDATE 1:

Running e2fsck -f -y /dev/sdb1 started to repair a lot of failures. However, I doubt that this will get my data back.

UPDATE 2:

After running e2fsck -f -y /dev/sdb1 there is absolutely no data in /var/www, and lots of files with generated numeric file names are now in the lost+found folder.

Are there any options out of this horrible accident?

Tony Stark
  • 3
    Restore from backup. – Michael Hampton Jun 11 '13 at 17:52
  • 1
    ...Welp. Guess you'll be backing up now. – Nathan C Jun 11 '13 at 19:06
  • You forced an assemble, but did the drives try to re-sync? If not, then individually inspect the contents of each drive. If a resync didn't happen, then perhaps one has a good copy of some data. – Zoredache Jun 11 '13 at 19:07
  • 1
    @TonyStark You performed a dangerous action (forced reassemble) without ensuring you had viable backups - I guess we *do* have to tell you. (And you have to explain to your client why they should have good, working, ***tested*** backups...) – voretaq7 Jun 11 '13 at 19:12
  • Next time, get Jarvis to sort a backup out for them. – tombull89 Jun 11 '13 at 19:31

2 Answers

3

First off, you messed up by forcing the RAID to assemble. It's likely that one of the disks had a much older version of the data than the other. By forcing it, you told md that both disks contain the same data and to assume they are clean. So md is free to pull any sector off either drive.

The first thing you should have done was take a complete copy of the drives using a tool like dd. Then all of your recovery efforts should have been targeted to that file and not the drives.
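As a rough sketch of what "take a complete copy" could look like with dd: the helper name image_drive and the example paths are hypothetical, and the block size is an arbitrary choice. conv=noerror tells dd to continue after a read error, and sync pads short reads with zeros so that later data stays at the right offset in the image.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer): copy a block device
# or file to an image file, continuing past unreadable blocks.
image_drive() {
    src=$1   # source, e.g. the surviving RAID member (assumed path)
    dst=$2   # destination image on a DIFFERENT, healthy disk
    # conv=noerror: do not stop on read errors
    # conv=sync:    pad each short or failed read with zeros to a full block
    # A small block size limits how much gets zero-filled around a bad sector.
    dd if="$src" of="$dst" bs=65536 conv=noerror,sync
}

# Example invocation (paths are assumptions, adjust to your system):
#   image_drive /dev/sdb /mnt/rescue/sdb.img
```

Because of sync, the image can be slightly larger than the source if the source size is not a multiple of the block size. Once the image exists, point fsck and any other recovery tool at the image (or a copy of it) and leave the original drive untouched.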

It's possible you are too late for that.

Now you have two options.

The first is to send the drives off to a commercial data recovery company such as Kroll OnTrack. This can be expensive. I've had bills from them for anywhere from $250 to $5000. But if your data is worth it, then it's worth it.

If at this point you don't care about any further data loss, then your second option is to attempt recovery yourself using dd. Power down the drives and disconnect the one that was reported as failed earlier. Then boot the server from a rescue CD and use dd to copy the drive to another drive. Be aware that any work you might do on the original drives at this point will only make it harder for a commercial data recovery company to help you out when you decide you're in over your head later.

longneck
  • One of the disks from the RAID 1 was reported as not available, so the RAID 1 started with just one disk (which mdadm reported too). In my opinion this was the only way to access the data on the disk. I will try dd as I can't afford to spend $5000. – Tony Stark Jun 11 '13 at 14:38
1

After running e2fsck -f -y /dev/sdb1 there is absolutely no data in /var/www and lots of files with generated numeric file names are now in lost+found folder.

These would be the "lost" files (allocated inodes whose directory entries should exist, but don't).
fsck "found" them and put them here for you. You should now review them and determine which ones are important.

Yes, this can -- and probably will -- be a huge task, but if you're lucky you'll find the missing files from /var/www in there.
grep is probably going to be your new best friend.
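A minimal sketch of that grep pass, assuming the missing files were mostly PHP or HTML. The function name find_web_files and the search strings are illustrative guesses, so swap in whatever text you know appeared in your own files.

```shell
#!/bin/sh
# Hypothetical helper: list recovered files that look like web content.
find_web_files() {
    dir=$1   # e.g. the lost+found directory of the mounted file system
    # -r: descend into recovered directories as well
    # -l: print only the names of matching files
    # -F: treat the patterns as fixed strings, not regular expressions
    grep -rlF -e '<?php' -e '<html' -e '<!DOCTYPE' "$dir"
}

# Example invocation (mount point is an assumption):
#   find_web_files /mnt/recovered/lost+found
```

fsck names each recovered entry after its inode number, so the original file names are gone for top-level entries. Recovered directories, however, usually still hold their contents under the original names, which is why searching recursively is worthwhile.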

voretaq7
  • [Additional useful information about `lost+found`](http://unix.stackexchange.com/questions/18154/what-is-the-purpose-of-the-lostfound-folder-in-linux-and-unix) – voretaq7 Jun 11 '13 at 19:27