
I know that I made some stupid moves to get into this situation, please don't remind me, and don't ask why :-/

I have this Synology DS1515+, with 2x6TB drives in SHR, which means MD RAID1, with LVM on top.

After starting a RAID1 to RAID5 conversion, aborting it, and fiddling around with the disks, my ext4 file system is unmountable.

Could it be that the system has simply been "confused" by the many reboots and disk removals, and now treats the entire disk space as a RAID5 volume, even though the conversion from RAID1 to RAID5 was only about 10% complete? If so, do you think I have a chance of fixing the file system if I add a third disk and let the RAID array rebuild? Or will it just rebuild to a logical volume with exactly the same data as now, i.e. a damaged file system?

I'm also a little curious about how the actual conversion process works, since MD and/or LVM must keep track of which parts of the block devices should be treated as RAID5 and which as RAID1, until the entire space is converted to RAID5. Does anyone know more about this?

Thanks in advance for any help :-)

Here is what I did (my rescue attempts so far and the log entries are listed below):

  1. Hot-plugged a new 6 TB disk into the NAS.

  2. Told Synology's UI to add the disk to my existing volume and grow it to 12TB (making it into a 3x6TB RAID5)

  3. Shut down the NAS (shutdown -P now) a couple of hours into the growing process, and removed the new drive. The NAS booted fine but reported that my volume was degraded. It still reported a 6 TB file system and everything was still accessible.

  4. Hot-plugged disk 3 again, wiped it and made another single disk volume on it.

  5. Shut down the NAS, removed disk 2 (this was a mistake!) and powered it on. It started beeping and told me that my volume was crashed.

  6. Shut the NAS down again and re-inserted the missing disk 2. But the Synology still reported the volume as crashed and offered no repair options.

So, all my data is now unavailable!

I started investigating the issue. It seems like MD is assembling the array as it should:

 State : clean, degraded
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           Name : DiskStation:2  (local to host DiskStation)
           UUID : 58290cba:75757ee2:86fe074c:ada2e6d2
         Events : 31429

    Number   Major   Minor   RaidDevice State
       0       8        5        0      active sync   /dev/sda5
       1       8       21        1      active sync   /dev/sdb5
       2       0        0        2      removed

And the metadata on the two original disks also looks fine:

Device Role : Active device 0
Array State : AA. ('A' == active, '.' == missing)

Device Role : Active device 1
Array State : AA. ('A' == active, '.' == missing)

LVM also recognizes the RAID volume and exposes its device:

--- Logical volume ---
LV Path                /dev/vg1000/lv
LV Name                lv
VG Name                vg1000

But the file system on /dev/vg1000/lv seems to be damaged when I try to mount it read-only:

mount: wrong fs type, bad option, bad superblock on /dev/vg1000/lv,
       missing codepage or helper program, or other error
       (for several filesystems (e.g. nfs, cifs) you might
       need a /sbin/mount.<type> helper program)
In some cases useful info is found in syslog - try dmesg | tail or so.

So, here I am with a broken file system, which I believe is not possible to repair (see a list of my attempts below).

Here are the steps that I have tried so far:

Cloned /dev/vg1000/lv to a partition on an empty hard drive and ran e2fsck. I had this process running for a week before I interrupted it. It found millions of faulty inodes, multiply-claimed blocks etc., and with that amount of FS errors, I believe it will not bring back any useful data even if it completes some day.

Moved the two hard drives with data into a USB dock, connected it to an Ubuntu virtual machine, and made overlay devices to catch all writes (using dmsetup).

First, I tried to re-create the RAID array. I started with the command that creates the array with the same parameters mdadm -E reported, and then tried switching the device order around to see if the results differed (i.e. (sda, missing, sdb); (sda, sdb, missing); (missing, sdb, sda)). 4 out of 6 combinations made LVM detect the volume group, but the file system was still broken.

Used R-Studio to assemble the array and search for file systems

This actually gave a few results: it was able to scan and find an EXT4 file system on the RAID volume that I assembled, and I could browse the files, but only a subset (maybe 10) of my actual files were presented in the file viewer. I tried switching the device order around, and while 4 of the combinations made R-Studio detect an ext4 file system (just like above), only the original order (sda, sdb, missing) made R-Studio able to discover any files from the root of the drive.

Tried mounting with -o sb=XXXXX, pointing at an alternative superblock

This gave me the same errors as not specifying the superblock position.

Tried debugfs

This gave me I/O errors when I typed "ls".

Here are the log messages from the operations described above that caused the problems.

Shutting down the system, which was running as a degraded RAID5 with a still-working file system:

2017-02-25T18:13:27+01:00 DiskStation umount: kill the process "synoreport" [pid = 15855] using /volume1/@appstore/StorageAnalyzer/usr/syno/synoreport/synoreport
2017-02-25T18:13:28+01:00 DiskStation umount: can't umount /volume1: Device or resource busy
2017-02-25T18:13:28+01:00 DiskStation umount: can't umount /volume1: Device or resource busy
2017-02-25T18:13:28+01:00 DiskStation umount: SYSTEM:   Last message 'can't umount /volume' repeated 1 times, suppressed by syslog-ng on DiskStation
2017-02-25T18:13:28+01:00 DiskStation syno_poweroff_task: lvm_poweroff.c:49 Failed to /bin/umount -f -k /volume1
2017-02-25T18:13:29+01:00 DiskStation syno_poweroff_task: lvm_poweroff.c:58 Failed to /sbin/vgchange -an
2017-02-25T18:13:29+01:00 DiskStation syno_poweroff_task: raid_stop.c:28 Failed to mdadm stop '/dev/md2'
2017-02-25T18:13:29+01:00 DiskStation syno_poweroff_task: syno_poweroff_task.c:331 Failed to stop RAID [/dev/md2]

Note the "Failed to stop RAID" - could that be a cause of the problems?

First boot after removing disk2 (sdb)

2017-02-25T18:15:27+01:00 DiskStation kernel: [   10.467975] set group disks wakeup number to 5, spinup time deno 1
2017-02-25T18:15:27+01:00 DiskStation kernel: [   10.500561] synobios: unload
2017-02-25T18:15:27+01:00 DiskStation kernel: [   10.572388] md: invalid raid superblock magic on sda5
2017-02-25T18:15:27+01:00 DiskStation kernel: [   10.578043] md: sda5 does not have a valid v0.90 superblock, not importing!
2017-02-25T18:15:27+01:00 DiskStation kernel: [   10.627381] md: invalid raid superblock magic on sdc5
2017-02-25T18:15:27+01:00 DiskStation kernel: [   10.633034] md: sdc5 does not have a valid v0.90 superblock, not importing!
2017-02-25T18:15:27+01:00 DiskStation kernel: [   10.663832] md: sda2 has different UUID to sda1
2017-02-25T18:15:27+01:00 DiskStation kernel: [   10.672513] md: sdc2 has different UUID to sda1
2017-02-25T18:15:27+01:00 DiskStation kernel: [   10.784571] Got empty serial number. Generate serial number from product.


2017-02-25T18:15:41+01:00 DiskStation spacetool.shared: raid_allow_rmw_check.c:48 fopen failed: /usr/syno/etc/.rmw.md3
2017-02-25T18:15:41+01:00 DiskStation kernel: [   31.339243] md/raid:md2: not enough operational devices (2/3 failed)
2017-02-25T18:15:41+01:00 DiskStation kernel: [   31.346371] md/raid:md2: raid level 5 active with 1 out of 3 devices, algorithm 2
2017-02-25T18:15:41+01:00 DiskStation kernel: [   31.355295] md: md2: set sda5 to auto_remap [1]
2017-02-25T18:15:41+01:00 DiskStation kernel: [   31.355299] md: reshape of RAID array md2
2017-02-25T18:15:41+01:00 DiskStation spacetool.shared: spacetool.c:1223 Try to force assemble RAID [/dev/md2]. [0x2000 file_get_key_value.c:81]
2017-02-25T18:15:41+01:00 DiskStation kernel: [   31.414839] md: md2: reshape done.
2017-02-25T18:15:41+01:00 DiskStation kernel: [   31.433218] md: md2: set sda5 to auto_remap [0]
2017-02-25T18:15:41+01:00 DiskStation kernel: [   31.494964] md: md2: set sda5 to auto_remap [0]
2017-02-25T18:15:41+01:00 DiskStation kernel: [   31.549962] md/raid:md2: not enough operational devices (2/3 failed)
2017-02-25T18:15:41+01:00 DiskStation kernel: [   31.557093] md/raid:md2: raid level 5 active with 1 out of 3 devices, algorithm 2
2017-02-25T18:15:41+01:00 DiskStation kernel: [   31.566069] md: md2: set sda5 to auto_remap [1]
2017-02-25T18:15:41+01:00 DiskStation kernel: [   31.566073] md: reshape of RAID array md2
2017-02-25T18:15:41+01:00 DiskStation spacetool.shared: raid_allow_rmw_check.c:48 fopen failed: /usr/syno/etc/.rmw.md2
2017-02-25T18:15:41+01:00 DiskStation kernel: [   31.633774] md: md2: reshape done.
2017-02-25T18:15:41+01:00 DiskStation kernel: [   31.645025] md: md2: change number of threads from 0 to 1
2017-02-25T18:15:41+01:00 DiskStation kernel: [   31.645033] md: md2: set sda5 to auto_remap [0]
2017-02-25T18:15:41+01:00 DiskStation spacetool.shared: spacetool.c:3023 [Info] Old vg path: [/dev/vg1000], New vg path: [/dev/vg1000], UUID: [Fund9t-vUVR-3yln-QYVk-8gtv-z8Wo-zz1bnF]
2017-02-25T18:15:41+01:00 DiskStation spacetool.shared: spacetool.c:3023 [Info] Old vg path: [/dev/vg1001], New vg path: [/dev/vg1001], UUID: [FHbUVK-5Rxk-k6y9-4PId-cSMf-ztmU-DfXYoL]

2017-02-25T18:22:50+01:00 DiskStation umount: can't umount /volume2: Invalid argument
2017-02-25T18:22:50+01:00 DiskStation syno_poweroff_task: lvm_poweroff.c:49 Failed to /bin/umount -f -k /volume2
2017-02-25T18:22:50+01:00 DiskStation kernel: [  460.374192] md: md2: set sda5 to auto_remap [0]
2017-02-25T18:22:50+01:00 DiskStation kernel: [  460.404747] md: md3: set sdc5 to auto_remap [0]
2017-02-25T18:28:01+01:00 DiskStation umount: can't umount /initrd: Invalid argument

Booting again, with disk 2 (sdb) present:

2017-02-25T18:28:17+01:00 DiskStation spacetool.shared: raid_allow_rmw_check.c:48 fopen failed: /usr/syno/etc/.rmw.md3
2017-02-25T18:28:17+01:00 DiskStation kernel: [   32.442352] md: kicking non-fresh sdb5 from array!
2017-02-25T18:28:17+01:00 DiskStation kernel: [   32.478415] md/raid:md2: not enough operational devices (2/3 failed)
2017-02-25T18:28:17+01:00 DiskStation kernel: [   32.485547] md/raid:md2: raid level 5 active with 1 out of 3 devices, algorithm 2
2017-02-25T18:28:17+01:00 DiskStation spacetool.shared: spacetool.c:1223 Try to force assemble RAID [/dev/md2]. [0x2000 file_get_key_value.c:81]
2017-02-25T18:28:17+01:00 DiskStation kernel: [   32.515567] md: md2: set sda5 to auto_remap [0]
2017-02-25T18:28:18+01:00 DiskStation kernel: [   32.602256] md/raid:md2: raid level 5 active with 2 out of 3 devices, algorithm 2
2017-02-25T18:28:18+01:00 DiskStation spacetool.shared: raid_allow_rmw_check.c:48 fopen failed: /usr/syno/etc/.rmw.md2
2017-02-25T18:28:18+01:00 DiskStation kernel: [   32.654279] md: md2: change number of threads from 0 to 1
2017-02-25T18:28:18+01:00 DiskStation spacetool.shared: spacetool.c:3023 [Info] Old vg path: [/dev/vg1000], New vg path: [/dev/vg1000], UUID: [Fund9t-vUVR-3yln-QYVk-8gtv-z8Wo-zz1bnF]
2017-02-25T18:28:18+01:00 DiskStation spacetool.shared: spacetool.c:3023 [Info] Old vg path: [/dev/vg1001], New vg path: [/dev/vg1001], UUID: [FHbUVK-5Rxk-k6y9-4PId-cSMf-ztmU-DfXYoL]
2017-02-25T18:28:18+01:00 DiskStation spacetool.shared: spacetool.c:3030 [Info] Activate all VG

2017-02-25T18:28:18+01:00 DiskStation synovspace: virtual_space_conf_check.c:78 [INFO] "PASS" checking configuration of virtual space [FCACHE], app: [1]
2017-02-25T18:28:18+01:00 DiskStation synovspace: virtual_space_conf_check.c:74 [INFO] No implementation, skip checking configuration of virtual space [HA]
2017-02-25T18:28:18+01:00 DiskStation synovspace: virtual_space_conf_check.c:74 [INFO] No implementation, skip checking configuration of virtual space [SNAPSHOT_ORG]
2017-02-25T18:28:18+01:00 DiskStation synovspace: vspace_wrapper_load_all.c:76 [INFO] No virtual layer above space: [/volume2] / [/dev/vg1001/lv]
2017-02-25T18:28:18+01:00 DiskStation synovspace: vspace_wrapper_load_all.c:76 [INFO] No virtual layer above space: [/volume1] / [/dev/vg1000/lv]
2017-02-25T18:28:19+01:00 DiskStation kernel: [   33.792601] BTRFS: has skinny extents
2017-02-25T18:28:19+01:00 DiskStation kernel: [   34.009184] JBD2: no valid journal superblock found
2017-02-25T18:28:19+01:00 DiskStation kernel: [   34.014673] EXT4-fs (dm-0): error loading journal
mount: wrong fs type, bad option, bad superblock on /dev/vg1000/lv,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.
quotacheck: Mountpoint (or device) /volume1 not found or has no quota enabled.
quotacheck: Cannot find filesystem to check or filesystem not mounted with quota option.
quotaon: Mountpoint (or device) /volume1 not found or has no quota enabled.
2017-02-25T18:28:19+01:00 DiskStation synocheckhotspare: synocheckhotspare.c:149 [INFO] No hotspare config, skip hotspare config check. [0x2000 virtual_space_layer_get.c:98]
2017-02-25T18:28:19+01:00 DiskStation synopkgctl: pkgtool.cpp:3035 package AudioStation is not installed or not operable

Note how it first reports that only 1 of 3 devices is present, then force-assembles the array so that 2 of 3 are active, and then tries to mount the file system but gets the EXT4 mount errors.

Rebooting after this did not help:

2017-02-25T18:36:45+01:00 DiskStation spacetool.shared: raid_allow_rmw_check.c:48 fopen failed: /usr/syno/etc/.rmw.md3
2017-02-25T18:36:45+01:00 DiskStation kernel: [   29.579136] md/raid:md2: raid level 5 active with 2 out of 3 devices, algorithm 2
2017-02-25T18:36:45+01:00 DiskStation spacetool.shared: raid_allow_rmw_check.c:48 fopen failed: /usr/syno/etc/.rmw.md2
2017-02-25T18:36:45+01:00 DiskStation kernel: [   29.629837] md: md2: change number of threads from 0 to 1
2017-02-25T18:36:46+01:00 DiskStation spacetool.shared: spacetool.c:3023 [Info] Old vg path: [/dev/vg1000], New vg path: [/dev/vg1000], UUID: [Fund9t-vUVR-3yln-QYVk-8gtv-z8Wo-zz1bnF]
2017-02-25T18:36:46+01:00 DiskStation spacetool.shared: spacetool.c:3023 [Info] Old vg path: [/dev/vg1001], New vg path: [/dev/vg1001], UUID: [FHbUVK-5Rxk-k6y9-4PId-cSMf-ztmU-DfXYoL]
2017-02-25T18:36:46+01:00 DiskStation spacetool.shared: spacetool.c:3030 [Info] Activate all VG
2017-02-25T18:36:46+01:00 DiskStation spacetool.shared: spacetool.c:3041 Activate LVM [/dev/vg1000]
2017-02-25T18:36:46+01:00 DiskStation spacetool.shared: spacetool.c:3041 Activate LVM [/dev/vg1001]
2017-02-25T18:36:46+01:00 DiskStation spacetool.shared: spacetool.c:3084 space: [/dev/vg1000]
2017-02-25T18:36:46+01:00 DiskStation spacetool.shared: spacetool.c:3084 space: [/dev/vg1001]
2017-02-25T18:36:46+01:00 DiskStation spacetool.shared: spacetool.c:3110 space: [/dev/vg1000], ndisk: [2]
2017-02-25T18:36:46+01:00 DiskStation spacetool.shared: spacetool.c:3110 space: [/dev/vg1001], ndisk: [1]
2017-02-25T18:36:46+01:00 DiskStation spacetool.shared: hotspare_repair_config_set.c:36 Failed to hup synostoraged
2017-02-25T18:36:46+01:00 DiskStation synovspace: virtual_space_conf_check.c:78 [INFO] "PASS" checking configuration of virtual space [FCACHE], app: [1]
2017-02-25T18:36:46+01:00 DiskStation synovspace: virtual_space_conf_check.c:74 [INFO] No implementation, skip checking configuration of virtual space [HA]
2017-02-25T18:36:46+01:00 DiskStation synovspace: virtual_space_conf_check.c:74 [INFO] No implementation, skip checking configuration of virtual space [SNAPSHOT_ORG]
2017-02-25T18:36:46+01:00 DiskStation synovspace: vspace_wrapper_load_all.c:76 [INFO] No virtual layer above space: [/volume2] / [/dev/vg1001/lv]
2017-02-25T18:36:46+01:00 DiskStation synovspace: vspace_wrapper_load_all.c:76 [INFO] No virtual layer above space: [/volume1] / [/dev/vg1000/lv]
2017-02-25T18:36:47+01:00 DiskStation kernel: [   30.799110] BTRFS: has skinny extents
2017-02-25T18:36:47+01:00 DiskStation kernel: [   30.956115] JBD2: no valid journal superblock found
2017-02-25T18:36:47+01:00 DiskStation kernel: [   30.961585] EXT4-fs (dm-0): error loading journal
mount: wrong fs type, bad option, bad superblock on /dev/vg1000/lv,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.
quotacheck: Mountpoint (or device) /volume1 not found or has no quota enabled.
quotacheck: Cannot find filesystem to check or filesystem not mounted with quota option.
  • It's time to go to your backups. – Michael Hampton Mar 11 '17 at 01:05
  • It's technically possible to fix this, but not without deep inspection of the MD metadata. A professional would charge a LOT of money for that, so definitely backups. – Spooler Mar 11 '17 at 02:39
  • Seems like my fear is reasonable. While I have cloud backups of the most important files, the rest will take some time and manual work to recreate. And it also seems to take a lot of time to restore several TB from the cloud, so I was hoping I could fix this by starting the RAID array with some special arguments. So, if you think it's game over, I COULD try rebuilding the array to 3 disks, but will that change anything about the logical data that my OS can see on the /dev/vg1000/lv device? Or will it just make a "healthy" RAID with the same corrupted data as I have now? – Esben von Buchwald Mar 11 '17 at 10:35
  • @SmallLoanOf1M Thanks for mentioning the deep inspection of MD metadata. I had a theory that the first part of the space was converted to RAID5 and the remaining space was still RAID1, and that this was why the file system seemed corrupted. I think I just proved that right! – Esben von Buchwald Mar 11 '17 at 23:57
  • I investigated the binary format of the MD superblock, and found a section with reshaping status, telling me that about 1.8 TB had already been reshaped into RAID5 when 1 device was added. But the feature map has a flag indicating that a reshape is in progress, and this flag was not set. I have just tried copying the superblock and editing it with a hex editor to set the proper flag. Now mdadm -E shows this: Reshape pos'n : 1928186496 (1838.86 GiB 1974.46 GB) Delta Devices : 1 (2->3) – Esben von Buchwald Mar 12 '17 at 00:02
  • I'll be setting the bits on a snapshot and seeing if mdadm resumes the reshaping process from where it left off - if that seems to go right, I think I'll take the risk and let it reshape the rest of the volume. Meanwhile, I've made regions in R-Studio matching the reshape progress, and made a virtual volume consisting of 3.6 TB of RAID5 data and 1.8 TB from the RAID1... It's currently scanning for all files. Fingers crossed! – Esben von Buchwald Mar 12 '17 at 00:04
  • I honestly wasn't counting on you having that kind of tenacity regarding this, or I would have given you a lot more information on it. You've taken exactly the correct steps. This array will probably be fine now. – Spooler Mar 12 '17 at 00:13
  • If it does, you should definitely answer your own question so it can be upvoted and accepted. – Spooler Mar 12 '17 at 00:14
  • No success so far :( I've mounted the physical drives in my VM, made a snapshot overlay, and set the "reshape in progress" bit of the superblocks for sdd5 and sde5, so mdadm -E reports that a reshape is in progress. But now it won't let me assemble the RAID array, so I can't try to grow it or anything. I get this message when trying to assemble: # mdadm --assemble --scan mdadm: failed to add /dev/dm-8 to /dev/md/DiskStation:2: Invalid argument mdadm: failed to add /dev/dm-5 to /dev/md/DiskStation:2: Invalid argument mdadm: failed to RUN_ARRAY /dev/md/DiskStation:2: Invalid argument – Esben von Buchwald Mar 12 '17 at 13:37
  • --force or --run does not help either. If I flip the "reshape in progress" bit back to 0, it assembles just fine, but with corrupted data. I also tried scanning the raw sde5 partition for known file types with R-Studio, and there seems to be a significant change in the amount of valid files found before and after the first 1838.86 GiB of the disk. On the first (RAID5-formatted) part, I found a flac file which - when played - skipped about every odd second; that makes sense to me if the chunk size is 64 KB and every odd block was missing (because I only scanned one disk). – Esben von Buchwald Mar 12 '17 at 13:43
  • But on the part of the disk after 1838.86 GiB, I was able to scan for several large images which looked fine - so that part seems to still be in its original format. However, I still haven't managed to find the entire file system (only a fraction of the file tree, just like before) when creating a virtual volume in R-Studio, made from a virtual RAID5 over a region of 1838.86 GiB plus a RAID1 from offset 2x1838.86 GiB to the end of the disk. Has anyone tried this? – Esben von Buchwald Mar 12 '17 at 13:49
  • I managed to create the right regions and virtual volumes in R-Studio, and it found all the files on the volume! Right now it's restoring all the stuff that I don't have backed up recently, and after that, I'll clone the virtual volume to a partition on another disk and try to fsck it, so I can hopefully access the data again :) I'll post an answer with details when this is over :) – Esben von Buchwald Mar 13 '17 at 14:39

1 Answer

How I saved my data, after completely breaking a growing RAID5!

I had a 3-disk RAID5 array with device number 3 missing, and data that seemed to be corrupted.

/dev/sdd5: 6 TB (5.45 TiB), device 1 of the array

/dev/sde5: 6 TB (5.45 TiB), device 2 of the array

The array was in the process of converting from RAID1 to RAID5 when the operation was interrupted and device 3 was removed. The array kept running until device 2 was also removed. When device 2 was put back, the file system couldn't be mounted. The /dev/md2 device was cloned, and fsck was run on the cloned partition, finding millions of errors.

MD was obviously not handling the RAID data properly after the interrupted conversion and the disk removals. I went to investigate what had happened:

First, I took a look at /var/log/space_operation_error.log, and it told me exactly what had happened. The RAID changed its status to broken as soon as disk 2 was removed, since a 3-disk RAID5 cannot run on 1 disk. But that also made the RAID forget about its ongoing reshape from RAID1 to RAID5.

Therefore, I thought the data corruption might be caused by MD treating all of the data as RAID5-encoded, while part of it was still in its original state.

Examining the RAID metadata on the devices did not help; everything looked fine:

# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md124 : active raid5 sda5[0] sdb5[1]
      11711575296 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [UU_]

# mdadm -E /dev/sda5
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 58290cba:75757ee2:86fe074c:ada2e6d2
           Name : DiskStation:2
  Creation Time : Thu Nov 27 11:35:34 2014
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 11711575680 (5584.51 GiB 5996.33 GB)
     Array Size : 23423150592 (11169.03 GiB 11992.65 GB)
  Used Dev Size : 11711575296 (5584.51 GiB 5996.33 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 1a222812:ac39920b:4cec73c4:81aa9b63

    Update Time : Fri Mar 17 23:14:25 2017
       Checksum : cb34324c - correct
         Events : 31468

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 0
   Array State : AA. ('A' == active, '.' == missing)

But I thought it MUST have some kind of counter to keep track of its progress while reshaping. I studied the format of the MD superblock, described here: https://raid.wiki.kernel.org/index.php/RAID_superblock_formats

I took a copy of the first 10 MiB of one of the RAID partitions (mdadm -E didn't work on smaller copies):

# dd if=/dev/sda5 of=/volume1/homes/sda5_10M.img bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0622844 s, 168 MB/s

I opened it in a hex editor and changed the byte at offset 4104 from 0x00 to 0x04, to indicate that reshaping was in progress.

(Screenshot: changing the state byte in the hex editor)
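Instead of a hex editor, the same one-byte patch can be applied with dd. A sketch: offset 4104 is the feature_map field, because the v1.2 superblock starts 4096 bytes into the partition (super offset 8 sectors) and feature_map sits 8 bytes into the struct; bit 0x4 (MD_FEATURE_RESHAPE_ACTIVE in the kernel) means a reshape is in progress.

```shell
# set feature_map (at byte 4096 + 8 = 4104) to 0x04 = "reshape active";
# mdadm -E reported "Feature Map : 0x0", so the whole byte was zero before
printf '\x04' | dd of=sda5_10M.img bs=1 seek=4104 count=1 conv=notrunc
```

conv=notrunc is essential here: without it, dd would truncate the image right after the patched byte.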

I also noted the value of the 8 bytes starting at offset 4200. It read 3856372992.

(Screenshot: the reshape position value in the hex editor)
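Those 8 bytes are the superblock's reshape_position field (offset 104 within the struct, hence 4096 + 104 = 4200), stored as a little-endian 64-bit sector count. They can also be read without a hex editor, for example:

```shell
# dump reshape_position from the superblock copy; od -t u8 decodes the
# little-endian 64-bit value on a little-endian host (x86, as on this NAS)
dd if=sda5_10M.img bs=1 skip=4200 count=8 2>/dev/null | od -A n -t u8
```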

After saving the change, I examined the copy:

# mdadm -E /volume1/homes/sda5_10M.img
/volume1/homes/sda5_10M.img:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x4
     Array UUID : 58290cba:75757ee2:86fe074c:ada2e6d2
           Name : DiskStation:2
  Creation Time : Thu Nov 27 11:35:34 2014
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 11711575680 (5584.51 GiB 5996.33 GB)
     Array Size : 23423150592 (11169.03 GiB 11992.65 GB)
  Used Dev Size : 11711575296 (5584.51 GiB 5996.33 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 1a222812:ac39920b:4cec73c4:81aa9b63

  Reshape pos'n : 1928186496 (1838.86 GiB 1974.46 GB)
  Delta Devices : 1 (2->3)

    Update Time : Fri Mar 17 23:14:25 2017
       Checksum : cb34324c - expected cb343250
         Events : 31468

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 0
   Array State : AA. ('A' == active, '.' == missing)

As you can see, it now reported the exact position of the reshape progress - which also tells me that the number I read earlier was a count of 512-byte sectors.

Now knowing that the first 1838.86 GiB had been overwritten during the reshape, I assumed that the rest of each partition was untouched.

Therefore, I decided to assemble a block device from the new RAID5 part and the untouched part, cut at the reported reshape position (read the notes below about choosing the position). Since the data offset is 2048 sectors, I need to add 1024 KiB to the position to get the offset into the raw partition:

# losetup -f --show /dev/md124 --sizelimit=1928186496K
/dev/loop0

# losetup -f --show /dev/sda5 --offset=1928187520K
/dev/loop1
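The numbers can be sanity-checked with shell arithmetic: the raw superblock value counts 512-byte sectors, mdadm reports KiB, and the 2048-sector data offset adds 1024 KiB:

```shell
pos_sectors=3856372992                         # raw reshape_position value
pos_kib=$(( pos_sectors * 512 / 1024 ))        # 1928186496, matches mdadm's KiB
offset_kib=$(( pos_kib + 2048 * 512 / 1024 ))  # + 1024 KiB data offset
echo "$pos_kib $offset_kib"                    # prints: 1928186496 1928187520
```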

To assemble the parts, I created a JBOD device without metadata:

# mdadm --build --raid-devices=2 --level=linear /dev/md9 /dev/loop0 /dev/loop1
mdadm: array /dev/md9 built and started.

Then I checked the content of the new /dev/md9 device:

# file -s /dev/md9
/dev/md9: LVM2 PV (Linux Logical Volume Manager), UUID: xmhBdx-uED6-hN53-HOeU-ONy1-29Yc-VfIDQt, size: 5996326551552

Since the RAID contained an LVM volume, I needed to skip the first 576 KiB to get to the ext4 file system:

# losetup -f --show /dev/md9 --offset=576K
/dev/loop2

# file -s /dev/loop2
/dev/loop2: Linux rev 1.0 ext4 filesystem data, UUID=8e240e88-4d2b-4de8-bcaa-0836f9b70bb5, volume name "1.42.6-5004" (errors) (extents) (64bit) (large files) (huge files)

Now I mounted the file system to a shared folder on my NAS:

# mount -o ro,noload /dev/loop2 /volume1/homes/fixraid/

And my files were accessible!

Before settling on the size/offsets used above, I tried several values. My first idea was that, since 1838.86 GiB of each device had been reshaped, the RAID5 part would contain ~3.6 TiB of valid data, so I used a position that was double the reshape position. It mounted fine, but some of my files seemed to contain invalid data, some files gave I/O errors when read, and some folders were missing.

Since I had a lot of RAW photos in the NEF (Nikon) format, I decided to test some of these, using the file tool.

Expected result:

# file DSC_7421.NEF
DSC_7421.NEF: TIFF image data, little-endian, direntries=28, height=120, bps=352, compression=none, PhotometricIntepretation=RGB, manufacturer=NIKON CORPORATION, model=NIKON D750, orientation=upper-left, width=160

Result when data was corrupted:

# file DSC_9974.NEF
DSC_9974.NEF: data

I also got a couple of I/O errors when I ran ls in certain folders.

I decided to check the integrity of some of my large photo collections - first by listing the files and counting the number of lines in the output (any read errors would be printed to the screen), and then by checking whether any of the NEF files were unrecognized, indicating corrupted data. I filtered the output from file and counted the filtered lines.

# ls *.NEF -1 | wc -l
3641
# file *.NEF | grep "NEF: data" | wc -l
0

I did this for a lot of my photo folders, to ensure that all files were readable and that their content was recognized.
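The per-folder check can be scripted; a sketch, assuming the collections live under /volume1/photo (a hypothetical path, adjust to taste):

```shell
# count NEF files per folder, and how many of them `file` fails to recognize
for d in /volume1/photo/*/; do
    total=$(ls -1 "$d"*.NEF 2>/dev/null | wc -l)
    bad=$(file "$d"*.NEF 2>/dev/null | grep -c "NEF: data")
    echo "$d: $total files, $bad unrecognized"
done
```

Any folder reporting a non-zero "unrecognized" count would deserve a closer look with a different offset guess.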

Using the 3856372992K size and 3856374016K offset (i.e. double the reshape position), I got a lot of invalid data and missing files/folders, so I tried a couple of other values.

I found that the offset and size mentioned above passed my small tests.

As seen above, the file system reports some errors. Since I didn't want to write any data to my devices before everything was recovered, I made a snapshot write overlay, so that all writes made by fsck.ext4 would go to this file instead.

Make a 50GiB sparse file

# truncate /volume1/overlay.img -s50G

Make a virtual device

# losetup -f --show /volume1/overlay.img
/dev/loop3

Get size of the device with the data:

# blockdev --getsz /dev/loop2
11711574528

Create the overlay device (before this, I unmounted the file system at /dev/loop2):

# dmsetup create overlay --table "0 11711574528 snapshot /dev/loop2 /dev/loop3 P 8"

And the device was available at /dev/mapper/overlay

Finally I could check and fix the errors:

# fsck.ext4 -y -C 0 /dev/mapper/overlay

Note that the fixes are only written to the overlay file; they will need to be committed to the physical disks if they are to be permanent.
