Problem:

I have a CentOS 7.5 server with an array of 4x 3TB WD Reds in hardware RAID 5. The RAID volume has a single EXT4 partition on it. After a recent reboot of the server, the server can no longer mount the partition.

I am looking for advice on getting the drive mounted and the data copied off of it. This server was scheduled to be decommissioned in a few months anyway, so root-cause analysis isn't a concern, but data recovery is. I'd also accept answers that simply suggest more research I can do. I don't mind saying I'm at a loss as to how to even approach this, and there are only so many ways I can google "partition not assigned partition number".

Story:

Originally, when the server rebooted, the system dropped into the Dracut emergency recovery shell after a "dracut timeout: failed to initialize the filesystem" error. Debugging eventually led me to discover that it was failing to mount the RAID volume. After I removed the volume's mount entry from /etc/fstab, the system booted into the normal shell without a problem, but of course without the RAID volume mounted.

After boot I ran `sudo mount /dev/sda1 /data` and got the error `mount: special device /dev/sda1 does not exist`. I followed it up with `sudo partprobe` and the same mount command again, this time getting `mount: /dev/sda1 is already mounted or /data busy`. (The full workflow is below.)

I've confirmed that the RAID controller (Dell PERC H310) can still see the volume and all four drives. The controller's consistency check reported the volume free of errors, and none of the drives are giving SMART errors (according to the controller), so I'm relatively confident I can rule out hardware failure.

Debug workflow:

  1. sudo mount /dev/sda1 /data
mount: special device /dev/sda1 does not exist
  2. lsblk
NAME                                    MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                                       8:0    0   8.2T  0 disk  
└─36c81f660da98ae001fa50c8c162434f8     253:2    0   8.2T  0 mpath 
  └─36c81f660da98ae001fa50c8c162434f8p1 253:3    0   8.2T  0 part  
sdb                                       8:16   0 111.8G  0 disk  
├─sdb1                                    8:17   0     1G  0 part  /boot
└─sdb2                                    8:18   0 110.8G  0 part  
  ├─centos-root                         253:0    0    50G  0 lvm   /
  ├─centos-swap                         253:1    0  11.2G  0 lvm   [SWAP]
  └─centos-home
  3. sudo partprobe && sudo mount /dev/sda1 /data
mount: /dev/sda1 is already mounted or /data busy
  4. lsblk (run again, after partprobe)
NAME                                    MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                                       8:0    0   8.2T  0 disk  
├─sda1                                    8:1    0   8.2T  0 part  
└─36c81f660da98ae001fa50c8c162434f8     253:2    0   8.2T  0 mpath 
  └─36c81f660da98ae001fa50c8c162434f8p1 253:3    0   8.2T  0 part  
sdb                                       8:16   0 111.8G  0 disk  
├─sdb1                                    8:17   0     1G  0 part  /boot
└─sdb2                                    8:18   0 110.8G  0 part  
  ├─centos-root                         253:0    0    50G  0 lvm   /
  ├─centos-swap                         253:1    0  11.2G  0 lvm   [SWAP]
  └─centos-home
  5. ls -la /data
total 0
0 drwxr-xr-x.  2 root root   6 2018-12-16 22:20 ./
0 dr-xr-xr-x. 19 root root 280 2019-07-10 00:20 ../
  6. lsblk --fs (after running sudo partprobe):
NAME                                    FSTYPE       LABEL UUID                                   MOUNTPOINT
sda                                     mpath_member                                              
├─sda1                                  none               6bad545d-5dee-4699-bb9b-93b526fb5b40   
└─36c81f660da98ae001fa50c8c162434f8                                                               
  └─36c81f660da98ae001fa50c8c162434f8p1 ext4               6bad545d-5dee-4699-bb9b-93b526fb5b40   
sdb                                                                                               
├─sdb1                                  xfs                2302e5fd-d894-49c9-9394-81f148ebe487   /boot
└─sdb2                                  LVM2_member        KbRczx-pSnU-71M1-bZBf-2k80-rFSX-FLdOwx 
  ├─centos-root                         xfs                71becc19-5d85-4801-890c-26da15c7c486   /
  ├─centos-swap                         swap               327d8c5a-d274-4d26-b1eb-d1fae0c2c9fa   [SWAP]
  └─centos-home                         xfs                e988468b-2ce3-4e14-b4de-4a1346b987b7   /home

I assume my problem lies in the fact that the ext4 filesystem is not actually on sda1; I just don't know how to fix that.
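
To make that assumption concrete: the only node that `lsblk --fs` reports as ext4 is the `p1` partition under the multipath map, and device-mapper normally exposes such nodes as `/dev/mapper/<map name>`. The sketch below is untested on this server and rests on that naming assumption. `ro_mount_cmd` is a hypothetical helper of my own, and the script only prints a read-only mount command for review rather than running it.

```shell
#!/bin/sh
# Map partition name copied from the lsblk output above.
# Assumption: device-mapper exposes it under /dev/mapper/.
MAP_PART="/dev/mapper/36c81f660da98ae001fa50c8c162434f8p1"

# Hypothetical helper: build a read-only mount command for a device
# and a mount point, so it can be reviewed before being run as root.
ro_mount_cmd() {
    printf 'mount -o ro %s %s\n' "$1" "$2"
}

if [ -b "$MAP_PART" ]; then
    # The node exists: show the command instead of mounting automatically.
    echo "found $MAP_PART; as root, try: $(ro_mount_cmd "$MAP_PART" /data)"
else
    echo "no block device at $MAP_PART" >&2
fi
```

Mounting read-only is a deliberate choice here: the goal is only to copy the data off, so nothing needs to be written to the volume.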

enpaul
  • Where did the multipath stuff come from? This isn't a normal setup for a locally attached RAID 5. – Michael Hampton Jul 13 '19 at 03:49
  • @MichaelHampton Sorry, can you clarify what you mean by multipath? – enpaul Jul 13 '19 at 03:51
  • It's right there in your `lsblk` output. The system for some reason thinks that `/dev/sda` is one path to a multipath device. It's obviously not a good idea to try to make a multipath device from local disks, so how did this happen? – Michael Hampton Jul 13 '19 at 04:00
  • I honestly have no idea, it certainly wasn't intentional. Are there logs or more resources I can check to work on finding out? – enpaul Jul 13 '19 at 04:03
  • This may be applicable: https://unix.stackexchange.com/questions/97089/local-disks-detected-as-multipath-device – enpaul Jul 13 '19 at 04:05

1 Answer

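The top answer on the question I linked in the comments solved this problem: https://unix.stackexchange.com/questions/97089/local-disks-detected-as-multipath-device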
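
For anyone who lands here first, this is a rough sketch of the general shape such a fix can take. It is not a copy of that answer. It assumes the cause is multipathd claiming the local PERC volume, and that blacklisting the volume's WWID (the `36c81f66...` string from the `lsblk` output) in `/etc/multipath.conf` is acceptable on your system. `blacklist_stanza` is a hypothetical helper; the script only prints the stanza and changes nothing.

```shell
#!/bin/sh
# WWID of the local RAID volume, copied from the lsblk output in the question.
WWID="36c81f660da98ae001fa50c8c162434f8"

# Hypothetical helper: print a multipath.conf blacklist stanza for one WWID.
blacklist_stanza() {
    printf 'blacklist {\n    wwid %s\n}\n' "$1"
}

# Print the stanza for review; nothing is written to /etc/multipath.conf.
blacklist_stanza "$WWID"

# Possible follow-up steps, to be run manually as root once the stanza
# has been added to /etc/multipath.conf (assumptions, check them first):
#   multipath -f "$WWID"                  # flush the unwanted multipath map
#   systemctl restart multipathd          # make the daemon re-read its config
#   partprobe /dev/sda                    # re-read the partition table
#   mount -o ro /dev/sda1 /data           # mount read-only to copy data off
```

If the map comes back after a reboot, the initramfs may be carrying its own copy of the multipath configuration and may need to be rebuilt; I have not verified that here.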

Thanks to @Michael Hampton for pointing out the multipath problem!

enpaul