
I have two hard drive partitions, which I have combined into a RAID1 using mdadm, and created an ext4 filesystem on the resulting device.

When I `mdadm --zero-superblock` the two partitions and re-create the RAID, the original ext4 metadata is magically preserved.

Why is that?

And how can I tell mdadm to give me a truly new, uninitialised MD?


Details

How I create the RAID1 and file system:

ls /dev/sdc2  # partition 1
ls /dev/sdd2  # partition 2
mdadm --create --run --verbose /dev/md1 --level=1 --raid-devices=2 /dev/sdc2 /dev/sdd2
mkfs.ext4 -L mylabel /dev/md1

Wipe RAID1:

mdadm --stop /dev/md1
mdadm --zero-superblock /dev/sdc2
mdadm --zero-superblock /dev/sdd2

Recreate RAID1:

mdadm --create --run --verbose /dev/md1 --level=1 --raid-devices=2 /dev/sdc2 /dev/sdd2

Display device information (note that `wipefs` without the `-a` flag doesn't wipe anything; it only shows information):

# wipefs /dev/md1
offset               type
----------------------------------------------------------------
0x438                ext4   [filesystem]
                     LABEL: mylabel
                     UUID:  3d230d31-fb82-46ef-a4e0-e9473e05825c

LABEL: mylabel shows that the ext4 label "survived" the mdadm superblock wipe and RAID recreation.

How can that be?

I thought that after a superblock wipe and recreation, mdadm is supposed to present me with a "clean" view of the device (i.e. all zeros), unless a flag is given that turns that off (such as --assume-clean, which I haven't given).

nh2

2 Answers


Zeroing the mdadm superblock only removes the metadata that describes the RAID array; it doesn't remove the information that is actually stored on the rest of the disk. This is actually a good thing: it means you might be able to recover a volume when the array itself won't assemble for some unfortunate reason.

And to be clear: because you're creating the array from partitions rather than whole disks, mdadm never touches the partition table, so zeroing the RAID superblock won't affect the disk label or the partition structure either.
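A toy sketch of why the two signatures live in disjoint regions. This is a simulation using a regular file instead of a real partition, not real mdadm: the md metadata 1.2 superblock sits 4 KiB into each member device, while the ext4 magic sits at byte 0x438 of the filesystem (1024-byte superblock offset plus 0x38 for `s_magic`, matching the `wipefs` output above), which in turn starts at the array's data offset inside the member. The 128 KiB data offset here is made up for illustration:

```shell
# Simulation: a 1 MiB regular file stands in for one RAID member partition.
img=$(mktemp)
truncate -s 1M "$img"

data_offset=$((128 * 1024))   # illustrative; real mdadm 1.2 data offsets vary

# md superblock magic (0xa92b4efc, little-endian) at member offset 4096
printf '\xfc\x4e\x2b\xa9' | dd of="$img" bs=1 seek=4096 conv=notrunc status=none
# ext4 magic (0xef53, little-endian) at data_offset + 0x438
printf '\x53\xef' | dd of="$img" bs=1 seek=$((data_offset + 0x438)) conv=notrunc status=none

# The equivalent of --zero-superblock: zero only the metadata region.
dd if=/dev/zero of="$img" bs=1 seek=4096 count=4096 conv=notrunc status=none

# md magic is gone; the ext4 magic is untouched:
md=$(dd if="$img" bs=1 skip=4096 count=4 status=none | od -An -tx1 | tr -d ' \n')
fs=$(dd if="$img" bs=1 skip=$((data_offset + 0x438)) count=2 status=none | od -An -tx1 | tr -d ' \n')
echo "md superblock bytes: $md   ext4 magic bytes: $fs"
```

The two writes never overlap, which is exactly why `mdadm --zero-superblock` can't disturb the filesystem signature.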

RibaldEddie
  • I suggest you add something to your answer that responds to what the second question was effectively: "How do I start fresh?" I suspect that 'dd if=/dev/zero of=/dev/sdc2' and 'dd if=/dev/zero of=/dev/sdd2' would do the trick, for example (you can probably get something much faster that just clears enough to fool filesystem auto-detection.) – Slartibartfast May 09 '18 at 03:03
  • @Slartibartfast yeah missed that part. – RibaldEddie May 09 '18 at 03:04
  • A few questions: 1) If zeroing the superblock does not have any effect on what reads return from the RAID, then data from _which_ of the two devices will reads return? And given that mdadm immediately starts filling the drive with zeros after creation in the initial scrub, does it mean that within the first couple of minutes, I can read arbitrary data from (one of) the disks, and a few minutes later I will read zeros? 2) I'm not sure about what you say in the "to be clear" part. The ext4 `LABEL: mylabel` should surely reside _within_ the boundaries of the raid device, right? continuing ... – nh2 May 11 '18 at 02:02
  • So I'd expect `LABEL: mylabel` to be probably somewhere at the beginning of `/dev/md1`. That's why I'm not sure why you mention "the drive label" or "the partition structure". And of course 3) how can I tell mdadm to return "fresh" data (zeros)? `dd`ing the whole 10TB drive takes awfully long, and mdadm must certainly know which areas are untouched since the creation (as it knows what to scrub in the initial scrub). – nh2 May 11 '18 at 02:05
  • @nh2 I think there’s some misunderstanding. Zeroing the superblock isn’t having an effect on reads of the volume or partition table because you’re not altering that data when you create or destroy the raid device. – RibaldEddie May 11 '18 at 07:01
  • Here’s another way to think about it: Linux kernel takes a very layered approach to disk management. That’s why there are multiple and separate kernel areas for raid, logical volume management, block device. When you try to read a partition table on a device, if it’s there, the kernel can make sense of that regardless of the presence of any mdraid metadata because the subsystem that reads the partition table doesn’t know or care about mdraid metadata. Does that make more sense to you? – RibaldEddie May 11 '18 at 15:47
  • @RibaldEddie There's still something I don't understand, which is how mdraid decides where to read the data from. Assume that after I stop the array and wipe the superblocks, I write garbage data to one of the two devices (say `sdc2`). Then after I recreate the array, what will happen? Will reads from `md1` return garbage, or good data, or nondeterministic results? Naturally, what `wipefs` thinks about the ext4 filesystem on `md1` must also depend on that answer. – nh2 Nov 22 '18 at 17:32
  • The answer is not very clear. The author wanted to wipe the RAID signatures (including the RAID label), but after doing that, the label is still there. Why is that happening? – Ashark Nov 04 '20 at 11:41

I have yet to see a definitive answer to "how do I remove RAID metadata completely?"

Here's mine (a snippet from my RAID creation script, run when a previous RAID is detected):

DISK=/dev/sdc3    # partition holding the RAID member (example)
DRIVE=/dev/sdc    # the whole drive, without partition number (example)
DRIVE_SECTORS=$(fdisk -l "$DRIVE" | grep Disk | grep sectors | cut -f7 -d ' ')
RAID_OFFSET=$(wipefs "$DISK" | grep linux_raid_member | sed 's/  */ /g' | cut -d ' ' -f2)
wipefs -o "$RAID_OFFSET" "$DISK"
# Zero the last 4 MiB of DRIVE (8192 sectors of 512 bytes)
dd bs=512 if=/dev/zero of="$DRIVE" count=8192 seek=$((DRIVE_SECTORS - 8192))

Hope it saves someone else the time it cost me to sort this out.
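One hedged addition for the "start fresh" part of the original question: if the array still assembles, you can wipe the filesystem signature on the array device itself before tearing it down, so the old ext4 label can't reappear after a re-create. A sketch using the question's device names (needs root and destroys the filesystem):

```shell
# Sketch, assuming /dev/md1 is still assembled from /dev/sdc2 and /dev/sdd2.
wipefs -a /dev/md1              # erase the ext4 signature (offset 0x438) on the array
mdadm --stop /dev/md1
mdadm --zero-superblock /dev/sdc2
mdadm --zero-superblock /dev/sdd2
# A subsequently re-created array should no longer show the old LABEL in wipefs.
```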

Pierre.Vriens