
This is a question regarding ZFS on Linux (CentOS 7). I have a very simple setup with two 8 TB disks, one disk mirroring the other.

zpool create -f -o ashift=12 $zpoolName mirror $disksById

Should one of the disks need to be replaced, the replacement disk must be of equal or greater size than the smaller of the two disks in the configuration, according to the zpool manual page. And from what I have understood, the exact size usually differs a bit between drives of different make and model (and model revision), even if they are all labelled 8 TB. However, I would like to be able to replace a failed disk with any other 8 TB disk, not necessarily of the same make and model.

How do I achieve this?

I would have expected zpool create to have an option for not using the entire disk for the pool, leaving some slack, but I cannot find one. The only suggestion that I have seen is partitioning the disk before creating the pool, creating one "pool" partition and one "slack" partition, but I've read that this will affect disk performance as the disk cache cannot be used properly by ZFS, so I suppose that I would like to avoid this.

joaerl
  • Why wouldn't you use the same disks? – ewwhite Mar 29 '17 at 13:01
  • @ewwhite The most common cases are 1) spreading risk across manufacturers and production runs (manufacturing defects often affect multiple disks of a series, especially disks with serial numbers in close range); 2) a given disk may no longer be available, or not available right now, or its price may have risen, or it may have shown very poor failure rates in operation. – user121391 Mar 30 '17 at 07:31
  • In practice, you shouldn't need to go out of your way to spread risk across disk manufacturers. And if you're buying from a good vendor (Dell, HP), they already build that diversity into their disk product lines. ZFS accounts for this as well, as noted in Jim's answer below. And really, you won't see that much variance between advertised and actual size today. – ewwhite Mar 30 '17 at 07:52

2 Answers


The only suggestion that I have seen is partitioning the disk before creating the pool, creating one "pool" partition and one "slack" partition

This is the correct answer.

but I've read that this will affect disk performance as the disk cache cannot be used properly by ZFS.

This is a misunderstanding. Using a partition rather than a whole disk only affects performance if the partition is misaligned, and misaligning one takes some real determination on the user's part with any vaguely modern partition editor. Linux and BSD fdisk, sfdisk, and gparted all understand alignment boundaries and work within them unless outright forced not to.
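If you want to verify this for yourself, parted can check alignment directly; a quick example, using /dev/sdd purely as a placeholder device:

parted /dev/sdd align-check optimal 1    # prints "1 aligned" if partition 1 sits on the disk's optimal I/O boundary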

Further, if you look closely at a disk that's been fed whole to ZFS, you'll notice that ZFS has actually partitioned it itself. Example:

root@banshee:~# zpool status data
  pool: data
 state: ONLINE
  scan: scrub repaired 0 in 27h54m with 0 errors on Mon Mar 13 05:18:20 2017
config:

    NAME                                           STATE     READ WRITE CKSUM
    data                                           ONLINE       0     0     0
      mirror-0                                     ONLINE       0     0     0
        wwn-0x50014ee206fd9549                     ONLINE       0     0     0
        wwn-0x50014ee2afb368a9                     ONLINE       0     0     0
      mirror-1                                     ONLINE       0     0     0
        wwn-0x50014ee25d2510d4                     ONLINE       0     0     0
        wwn-0x5001517bb29d5333                     ONLINE       0     0     0

errors: No known data errors

root@banshee:~# ls -l /dev/disk/by-id | grep 510d4
lrwxrwxrwx 1 root root  9 Mar 22 15:57 wwn-0x50014ee25d2510d4 -> ../../sdd
lrwxrwxrwx 1 root root 10 Mar 22 15:57 wwn-0x50014ee25d2510d4-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 Mar 22 15:57 wwn-0x50014ee25d2510d4-part9 -> ../../sdd9

As you can see, ZFS has already partitioned the raw disks in the pool. The pool uses partition 1; partition 9 is left slack.

root@banshee:~# sfdisk -d /dev/sdd
label: gpt
label-id: B2DED677-DB67-974C-80A6-070B72EB8CFB
device: /dev/sdd
unit: sectors
first-lba: 34
last-lba: 3907029134

/dev/sdd1 : start=        2048, size=  3907010560, type=6A898CC3-1DD2-11B2-99A6-080020736631, uuid=A570D0A4-EA32-F64F-80D8-7479D918924B, name="zfs"
/dev/sdd9 : start=  3907012608, size=       16384, type=6A945A3B-1DD2-11B2-99A6-080020736631, uuid=85D0957B-65AF-6B4A-9F1B-F902FE539170

sdd9 is 16384 sectors long. The sector counts in sfdisk output are 512-byte logical sectors (even on a disk with 4K physical sectors), so that comes out to 8 MiB, and any disk that's no more than 8 MiB-ish smaller than the existing disk should be fine as a replacement for this one, should it fail.
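Before swapping in a replacement, you can compare raw device sizes to be sure the new disk fits, then hand it to zpool replace; a small sketch, where /dev/sde and <new-disk-id> stand in for a hypothetical replacement disk:

blockdev --getsize64 /dev/sdd    # existing mirror member, size in bytes
blockdev --getsize64 /dev/sde    # candidate replacement

zpool replace data wwn-0x50014ee25d2510d4 /dev/disk/by-id/<new-disk-id>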

Jim Salter

The only suggestion that I have seen is partitioning the disk before creating the pool

This is indeed the only solution to handle it. You don't have to create a second small partition, though; that space can simply stay unpartitioned. As the variance between disks is normally quite small, you lose only a few megabytes, which is no problem on 8 TB disks.
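A minimal sketch of that approach, assuming /dev/sda and /dev/sdb are the two disks and tank is the pool name (all placeholders, not from the question), leaving roughly 100 MiB of slack at the end of each disk:

sgdisk -n 1:0:-100M -t 1:BF01 /dev/sda    # one partition, stopping ~100 MiB short of the end
sgdisk -n 1:0:-100M -t 1:BF01 /dev/sdb
zpool create -o ashift=12 tank mirror /dev/sda1 /dev/sdb1

sgdisk aligns partition starts to 1 MiB boundaries by default, and BF01 is the same partition type GUID (6A898CC3-1DD2-11B2-99A6-080020736631) that ZFS itself uses, as seen in the sfdisk dump in the other answer.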

An alternative would be to just buy the exact same model of disk (often not advised, because serial manufacturing faults tend to affect a single model); or to start with the smallest disk and only ever replace with bigger disks, but this will cost you much more in lost space, and may not be possible after some time.
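If you do go the bigger-disks route, note that the pool only grows into the extra space once expansion is enabled; shown here for a hypothetical pool named tank, with <disk> as a placeholder:

zpool set autoexpand=on tank    # grow automatically once every device in a vdev is bigger
zpool online -e tank <disk>     # or expand a single device on demand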

user121391