
I'm setting up a backup server running Debian 10, installed on its NVMe disk. It also has 4x 6TB SATA hard drives, which I'm trying to set up as a RAID0 array.

I'm following the instructions at https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-debian-9 which have always worked for me on Debian 9.

The symptom is that when I reboot my server my RAID0 array is gone.

The error in /var/log/syslog reads:

dev-md3.device: Job dev-md3.device/start timed out.
Timed out waiting for device /dev/md3.
Dependency failed for /mnt/md3.
mnt-md3.mount: Job mnt-md3.mount/start failed with result 'dependency'.
dev-md3.device: Job dev-md3.device/start failed with result 'timeout'.
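
For reference, after such a failed boot, the state of whatever the kernel did manage to assemble can be inspected with commands along these lines (a sketch only, not output I have captured here):

$ cat /proc/mdstat                # which md devices, if any, were assembled
$ sudo mdadm --detail --scan      # names and UUIDs of the assembled arrays
$ journalctl -b | grep -i md3     # boot-time messages about the missing device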

My RAID0 setup procedure is like this:

$ sudo mdadm --version

mdadm - v4.1 - 2018-10-01

$ sudo lsblk -o NAME,SIZE,FSTYPE,TYPE,MOUNTPOINT

NAME         SIZE FSTYPE  TYPE MOUNTPOINT
sda          5.5T         disk
sdb          5.5T         disk
sdc          5.5T         disk
sdd          5.5T         disk
nvme0n1      477G         disk
├─nvme0n1p1  511M vfat    part /boot/efi
├─nvme0n1p2  476G ext4    part /
├─nvme0n1p3  511M swap    part [SWAP]
└─nvme0n1p4    1M iso9660 part

$ sudo mdadm --create --verbose /dev/md3 --level=0 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd

mdadm: chunk size defaults to 512K
mdadm: partition table exists on /dev/sda
mdadm: partition table exists on /dev/sda but will be lost or
       meaningless after creating array
mdadm: partition table exists on /dev/sdb
mdadm: partition table exists on /dev/sdb but will be lost or
       meaningless after creating array
mdadm: partition table exists on /dev/sdc
mdadm: partition table exists on /dev/sdc but will be lost or
       meaningless after creating array
mdadm: partition table exists on /dev/sdd
mdadm: partition table exists on /dev/sdd but will be lost or
       meaningless after creating array
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md3 started.

$ sudo cat /proc/mdstat

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md3 : active raid0 sdd[3] sdc[2] sdb[1] sda[0]
      23441561600 blocks super 1.2 512k chunks

unused devices: <none>
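
(As an aside, and not part of my original run: the array's metadata name and UUID could be double-checked at this point with something like the following.)

$ sudo mdadm --detail /dev/md3    # shows the level, member disks, the array name and its UUID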

$ sudo mkfs.ext4 -F /dev/md3

mke2fs 1.44.5 (15-Dec-2018)
/dev/md3 contains a ext4 file system
        last mounted on Sat Dec 21 10:42:04 2019
Creating filesystem with 5860390400 4k blocks and 366274560 inodes
Filesystem UUID: f8f61563-66ab-4cc6-9876-7f6160c43853
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848, 512000000, 550731776, 644972544, 1934917632,
        2560000000, 3855122432, 5804752896

Allocating group tables: done
Writing inode tables: done
Creating journal (262144 blocks): done
Writing superblocks and filesystem accounting information: done

$ sudo mkdir -p /mnt/md3

$ sudo mount /dev/md3 /mnt/md3

$ sudo df -h -x devtmpfs -x tmpfs

Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p2  469G  2.5G  443G   1% /
/dev/nvme0n1p1  510M  5.1M  505M   1% /boot/efi
/dev/md3         22T   24K   21T   1% /mnt/md3

At this point my RAID0 array is live and I can use it. Now I try to make this setup permanent:

# mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf

ARRAY /dev/md3 metadata=1.2 name=xxxxx:3 UUID=71ca2d63:66f64678:02822188:2c2881ba

# echo '/dev/md3 /mnt/md3 ext4 defaults,nofail,discard 0 0' | sudo tee -a /etc/fstab

/dev/md3 /mnt/md3 ext4 defaults,nofail,discard 0 0
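
As an aside, an fstab entry keyed on the filesystem UUID rather than on the /dev/md3 device name would look roughly like the line below (the UUID is the one mkfs.ext4 printed above); my actual setup uses the device-name form shown above:

UUID=f8f61563-66ab-4cc6-9876-7f6160c43853 /mnt/md3 ext4 defaults,nofail,discard 0 0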

Actual content of /etc/mdadm/mdadm.conf (without comments):

HOMEHOST <system>
MAILADDR root
ARRAY /dev/md3 metadata=1.2 name=xxxxx:3 UUID=71ca2d63:66f64678:02822188:2c2881ba
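
For reference, on Debian the boot-time assembly normally uses the copy of mdadm.conf embedded in the initramfs, so one way to check (and refresh) that copy would be something like the sketch below; this is not part of my procedure above:

$ lsinitramfs /boot/initrd.img-$(uname -r) | grep mdadm   # is mdadm (and its conf) inside the current initramfs?
$ sudo update-initramfs -u                                # rebuild it so it picks up /etc/mdadm/mdadm.conf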

Now if I reboot, the RAID0 array isn't there, and I get the syslog errors shown at the top of this post.

Update

I tried the same with RAID10 (--level=10); same result.
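
That is, the same create command with only the level changed:

$ sudo mdadm --create --verbose /dev/md3 --level=10 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd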

BxlSofty
  • Please note that configuring the array as a RAID0 device is an *extremely* bad idea, unless it contains scratch/temp data only – shodanshok Dec 21 '19 at 13:34
  • @shodanshok What if he has a correct, incremental backup and HA is not a very big concern, but performance is? – peterh Dec 21 '19 at 13:37
  • @shodanshok we use redundancy through an array of servers having the same backup data, belonging to different providers. This configuration minimizes quite amazingly the single point of failures. Can you help us out fixing this issue? – BxlSofty Dec 21 '19 at 13:50
  • "a pretty big backup server"? I work with backup servers that holds backups of many production servers with more disk than that. – Henrik supports the community Dec 21 '19 at 14:29
  • Anyone volunteering to find a solution for me instead of criticizing the wording or technological choices ;) ? – BxlSofty Dec 21 '19 at 15:47
  • @OldGuy modern `mdadm` arrays are auto-scanned and auto-assembled by the kernel. Try *removing* the `/etc/mdadm/mdadm.conf` file and reboot. Does it change anything? – shodanshok Dec 21 '19 at 23:34
  • @shodanshok thanks for the suggestion but same thing. – BxlSofty Dec 22 '19 at 12:36
  • I may have found something, and I welcome any explanation. When I specify a device name for the array (/dev/md3) at creation time, it doesn't seem to be retained. After reboot there is actually a RAID volume, but it is named /dev/md127. Hence, no mount point after reboot, since fstab expects a device /dev/md3 – BxlSofty Dec 23 '19 at 11:00
  • The problem with the above is that mounting that /dev/md127 device doesn't _totally work_: I can use the drive, but the mount doesn't appear in the `df -h` list or in `lsblk` – BxlSofty Dec 23 '19 at 11:26

2 Answers


In my case a new initramfs file was needed. I was moving a Debian 9 installation to software RAID1 for redundancy, and the installation had never used mdadm before. So after installing the package and creating a test RAID1 array, and before restarting, I had to re-run mkinitramfs and correct the initrd= option in /etc/lilo.conf so it pointed to the correct image file.

Using the previous, unchanged initramfs image resulted in a significant delay at boot and a timeout while starting md0. But once the system was up, there was a system-detected /dev/md127 array built from the partitions I had specified.

So, before restarting the system with the properly created array, perhaps you should run mkinitramfs and then update your boot loader?
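
A sketch of the equivalent steps on a stock Debian 10 install that boots with GRUB rather than LILO (adjust for whichever boot loader you actually use):

$ sudo update-initramfs -u -k all   # rebuild the initramfs so it contains mdadm and the current mdadm.conf
$ sudo update-grub                  # refresh the GRUB configuration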


Everything you did looks totally correct. Those "partition table exists on /dev/sd[abcd] but will be lost or meaningless after creating array" messages are concerning, though.

This is a guess -- but I think what may be happening is that those drives had a GUID partition table (GPT) on them. When you create the RAID 0, the MD metadata goes at the front of each drive, blowing away the partition table there. But there's a backup copy of the partition table stored at the end of the drive. My guess is that the system is recovering from that backup copy on each boot. With RAID 0, there's no initial synchronization; it just accepts the data on the drives as it is.

Try clearing the partition tables off the drives with wipefs -a /dev/sd[abcd] first. Then create the array with your procedure.
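
A sketch of that sequence; it is destructive and assumes the drives hold nothing you still need:

$ sudo wipefs /dev/sda /dev/sdb /dev/sdc /dev/sdd      # list existing signatures first (nothing is erased without -a)
$ sudo wipefs -a /dev/sda /dev/sdb /dev/sdc /dev/sdd   # wipe the partition-table and filesystem signatures
$ sudo mdadm --create --verbose /dev/md3 --level=0 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd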

Mike Andrews
  • thanks for the suggestion. Did not make a difference, even though the warnings for existing partitions did indeed go away – BxlSofty Dec 23 '19 at 10:57