TL;DR: I have three 2 TB drives and one 500 GB drive in a machine running Ubuntu 20.04, with a default ZFS install on the small drive. My plan is to split each of the larger drives into two partitions, one of 2 GB (the size of the default bpool partition) and one of the remaining 2,046 GB, then make my bpool a four-way 2 GB mirror and my rpool a much larger RAIDz1 (just under 5 TB) or RAIDz2 (about 3 TB). Is this possible and reasonable, or are there considerations I’m unaware of?

I’ve just installed Ubuntu 20.04 on a machine to play around with its ZFS capabilities, and am trying to figure out the best way to set it up. The machine has four drives in it (3x 2 TB and 1x 500 GB). I used the default ZFS install on the smallest drive, so right now the three large drives are unused, and the small drive has two partitions: one used as a single-device vdev in the bpool, and the other for the rpool. Once I’ve figured out how I want everything configured, I plan on using this machine as a small, private file, git, and web server. Load on the server will be low, and any critical files will still be backed up off-site, so I’m generally going to favor storage capacity over performance and redundancy. That said, I’m currently using old drives, so I definitely also want to consider fault tolerance. I’m somewhat familiar with the operation of ZFS and have used it in the past, but am not particularly experienced with it.

As I understand Ubuntu’s ZFS setup and GRUB integration, I need to be wary of messing with the bpool, but at the very least I want it to have some redundancy across multiple physical drives. To accomplish this, I obviously need to add more devices to the bpool’s vdev, drawing from the other disks. Because the bpool needs to remain independent, and I don’t want to give up an entire 2 TB drive for a pool that won’t require much space, I suspect my best bet is to partition one (or more) of the remaining drives and use the new partition(s) to turn the bpool’s vdev into a mirror or RAIDz. Given the nature of the bpool, and the fact that the four disks I’m using are literally just the largest disks I had lying around, of various ages and histories, I’m thinking that the way to go is to simply mirror the bpool across partitions on all four drives to minimize the risk of the server going down.

For the rest, because I’ll necessarily be dealing with heterogeneous devices if I don’t want to waste space, I’m thinking that all remaining partitions from all drives should be used to change my rpool configuration into a RAIDz vdev. This is where my knowledge starts to hit its limits, and I have some questions:

  1. Is there any reason not to use the rpool for everything else, and instead add another zpool? My understanding is that this would have little effect besides complicating my topology and making it more difficult to allocate my drive space efficiently, but is there some factor I’m not considering?
  2. Similarly, is there any reason that I’d want to consider more than one vdev within the rpool? I don’t think so, but again want to make sure I’m not missing something.

The default configuration that the Ubuntu installer gave me is a 2 GB partition for the bpool, and the remaining 498 GB for the rpool. If I follow the plan above, my final configuration would be a quadruple-redundant 2 GB bpool, and an rpool with a single vdev consisting of one 498 GB partition and three 2,046 GB partitions. With a RAIDz1 configuration, that should, I think, leave me with just under 5 TB of usable space (or a little over 3 TB with RAIDz2).

Are there any complications that I’m missing, or technical limitations that I should be considering?

  • OK, after doing some more reading, it seems as though I still need homogeneous devices to avoid wasting space with my vdevs. Not sure why I thought otherwise. So I guess this needs some more thought put into it. – Josh Ourisman Jun 28 '20 at 03:44

1 Answer

After doing more research on the situation, I've realized that I had a fundamental misunderstanding about how RAIDz worked. For some reason, I thought it would work with heterogeneous devices without wasting space, but that is not actually the case (I must have just been thinking about non-redundant storage pooling, which is not something I want to consider for this application).
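To make the difference concrete, here is a minimal Python sketch of the arithmetic. It assumes the usual rule of thumb that a RAIDz vdev treats every member as if it were the size of its smallest device, and it ignores metadata, padding, and reserved space, so the results are rough upper bounds rather than what `zpool list` would report. The function name `raidz_usable_gb` is made up for illustration, and the partition sizes are the nominal figures from the question, not measured values.

```python
def raidz_usable_gb(device_sizes_gb, parity):
    """Approximate usable capacity (GB) of a single RAIDz vdev.

    Every member counts as the size of the smallest device, and
    `parity` devices' worth of space is given up for redundancy.
    Filesystem overhead is deliberately ignored.
    """
    if parity not in (1, 2, 3):
        raise ValueError("RAIDz parity must be 1, 2, or 3")
    if len(device_sizes_gb) <= parity:
        raise ValueError("need more devices than the parity level")
    return min(device_sizes_gb) * (len(device_sizes_gb) - parity)


# Nominal partition sizes from the question (assumed, not measured).
mixed = [498, 2046, 2046, 2046]      # one small partition plus three large ones
uniform = [2046, 2046, 2046, 2046]   # four equal partitions, for comparison

for label, sizes in (("mixed", mixed), ("uniform", uniform)):
    for parity in (1, 2):
        usable = raidz_usable_gb(sizes, parity)
        print(f"{label:8s} RAIDz{parity}: ~{usable} GB usable of {sum(sizes)} GB raw")
```

Under these assumptions, the mixed layout tops out at roughly 1.5 TB with RAIDz1, nowhere near the 5 TB I had expected, because the 498 GB partition caps the contribution of every other device.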

My new plan is to instead just buy another 2 TB drive (not like they're expensive anymore), and replace the 500 GB drive with that, so I can have a properly homogeneous set of devices to work with. That should allow for a properly efficient use of space without resorting to an exotic topology. If I run into any other complications, I'll create a new question for them, and consider this one answered.