
I have a 20TB RAID5 array (LSI 9265-8i / 8 x 3TB 7200rpm drives) configured with a 1MB stripe size.

What's the optimal way to partition and format this array in Linux to maximize performance for a single ~20TB share?

I'm a Linux newbie, so concrete examples are appreciated.

Bart De Vos
Jason
  • Optimal? I like to have one single partition. And you? – mailq Dec 27 '11 at 19:09
  • Yes, one single partition. – Jason Dec 27 '11 at 19:11
  • You are aware, aren't you, that when a disk fails and you replace it, the rebuild will take AT LEAST 10 hours, probably more, and that your array is at risk of total loss for that whole time? R5 is bad, especially with lots of cheapo large slow disks. – Chopper3 Dec 27 '11 at 19:49
  • This was closed as "too localized" but just what is it about this question that makes it localized? I'm voting to reopen this question because it doesn't have even the slightest hint of localization. – John Gardeniers Dec 27 '11 at 22:18
  • Agreed -- this was very useful for me. Why is it closed? – ensnare Dec 27 '11 at 23:04
  • See: http://meta.serverfault.com/questions/2461/silly-season-should-not-mean-crazy-voting – Bart De Vos Dec 29 '11 at 09:21
  • "optimal" is subjective. You might format a partition differently for lots of small source file snippets vs. a library of multi-gigabyte video clips. – Rob Moir Dec 29 '11 at 10:58

2 Answers


First of all, I would suggest using RAID6 instead of RAID5. With a volume this large, an unrecoverable read error (URE) during a rebuild is likely enough to worry about, and it would mean a failed rebuild and lost data.
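To put a rough number on that risk, here is a back-of-the-envelope sketch. It assumes the commonly quoted consumer-drive spec of one unrecoverable error per 10^14 bits read (an assumption, not something stated in the question; check the drive's datasheet), treats errors as independent, and assumes the rebuild has to read all 7 surviving 3TB drives in full.

```shell
#!/bin/sh
# Rough estimate of hitting at least one URE during a RAID5 rebuild.
# ASSUMPTION: 1 error per 1e14 bits read (a typical datasheet figure,
# not taken from the question). Real-world rates can differ a lot.
surviving_drives=7      # 8-drive RAID5 with one failed drive
drive_bytes=3e12        # 3 TB per drive
ure_rate=1e-14          # errors per bit read (assumed)

# Poisson approximation: P(at least one error) = 1 - exp(-expected_errors)
p_ure=$(awk -v n="$surviving_drives" -v b="$drive_bytes" -v r="$ure_rate" \
    'BEGIN { bits = n * b * 8; printf "%.2f", 1 - exp(-bits * r) }')
echo "Chance of at least one URE during rebuild: $p_ure"
```

Under those assumptions the expected number of errors over a full rebuild is above one, so the estimate comes out high. Drives with a better rated error rate, or a controller that can skip a bad block instead of aborting the rebuild, would change the picture.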

Next, a single volume of this size needs a GPT partition table, because an MBR table cannot address more than 2TiB with 512-byte sectors. If performance is the most important factor, I wouldn't use LVM or anything similar, although that means you can't easily extend the volume later on.
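As a concrete sketch of the partitioning step, the snippet below only prints the parted commands rather than running them. `/dev/sdX` is a placeholder for the RAID volume, and `plan_gpt` is a hypothetical helper name introduced here for illustration. Note that `mklabel` wipes any existing partition table.

```shell
#!/bin/sh
# Print (do not run) the parted commands for one GPT partition
# spanning the whole device.
# ASSUMPTION: /dev/sdX is a placeholder; substitute the real RAID volume.
plan_gpt() {
    dev=$1
    # -s: script mode, no prompts; -a optimal: align to the device's reported I/O geometry
    echo "parted -s -a optimal $dev mklabel gpt"
    # 0% to 100% lets parted pick an aligned start and use the whole volume
    echo "parted -s -a optimal $dev mkpart primary xfs 0% 100%"
}

plan_gpt /dev/sdX
```

Once the device name has been double-checked, the printed lines can be run as root. On a GPT label, the word `primary` simply becomes the partition's name.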

After that, just use mkfs.xfs to create the FS.

Sven
  • I know, but the data is not crucial enough to allocate another 3TB to RAID6 redundancy. Are there any flags I should use in mkfs.xfs? Or just the defaults? – Jason Dec 27 '11 at 19:12
  • I always use the defaults and never had reasons to complain. For specific needs, this might be different, but you don't speak about specifics. – Sven Dec 27 '11 at 19:18
  • This guide may be helpful for tuning XFS: http://www.practicalsysadmin.com/wiki/index.php/XFS_optimisation – Sergei Dec 27 '11 at 19:19
  • Depending on the amount of data he has, I would suggest RAID 1+0, at least. – Rilindo Dec 27 '11 at 20:01

If what you're wondering about is stripe alignment, read the man page for "mkfs.xfs" and search for "sunit" and "swidth" (also called su and sw). From the man page:

sunit=value
    This is used to specify the stripe unit for a RAID device or a logical volume. The value has to be specified in 512-byte block units. Use the su suboption to specify the stripe unit size in bytes. This suboption ensures that data allocations will be stripe unit aligned when the current end of file is being extended and the file size is larger than 512KiB. Also inode allocations and the internal log will be stripe unit aligned.

swidth=value
    This is used to specify the stripe width for a RAID device or a striped logical volume. The value has to be specified in 512-byte block units. Use the sw suboption to specify the stripe width size in bytes. This suboption is required if -d sunit has been specified and it has to be a multiple of the -d sunit suboption.

Quick recap:

sunit: stripe unit, in 512-byte blocks

swidth: stripe width = sunit * $num_data_disks

An 8-disk RAID5 spreads its parity across all drives, but every stripe still holds one drive's worth of parity, so $num_data_disks = 8 - 1 = 7

Stripe size = 1M = 1024kB

So, to format: mkfs.xfs -d su=1024k,sw=7 /dev/sd{X}

This can also be found on the XFS.org FAQ.
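The arithmetic can be sketched in a few lines of shell. This assumes 7 data disks (8 drives minus one drive's worth of RAID5 parity) and that the controller's 1MB setting is the per-disk stripe unit. `/dev/sdX1` is a placeholder, and the script only prints the commands.

```shell
#!/bin/sh
# Derive XFS stripe geometry from the RAID layout.
# ASSUMPTIONS: 8-drive RAID5 (so 7 data disks) and a 1024 KiB
# per-disk stripe unit; /dev/sdX1 is a placeholder device.
total_disks=8
parity_disks=1                            # RAID5: one disk's worth of parity per stripe
stripe_unit_kib=1024

data_disks=$((total_disks - parity_disks))
sunit=$((stripe_unit_kib * 1024 / 512))   # stripe unit in 512-byte blocks
swidth=$((sunit * data_disks))            # full stripe in 512-byte blocks

# Two spellings of the same geometry: su/sw (size and multiplier)
# or sunit/swidth (both in 512-byte blocks).
echo "mkfs.xfs -d su=${stripe_unit_kib}k,sw=${data_disks} /dev/sdX1"
echo "mkfs.xfs -d sunit=${sunit},swidth=${swidth} /dev/sdX1"
```

After the filesystem is created and mounted, `xfs_info` on the mount point shows the sunit and swidth values in use. It reports them in filesystem blocks rather than 512-byte units, so the numbers will look different from the ones passed to mkfs.xfs.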

Kendall
  • RAID5 doesn't have a dedicated parity drive, the parity is spread across all drives. RAID4 does have a dedicated parity drive. – HampusLi Dec 27 '11 at 19:37
  • ^ You're absolutely correct. – Kendall Dec 27 '11 at 19:42
  • Depends on the implementation; it could just as well be a dedicated parity drive. And for the sake of the RAID stride, you do the calculation as if the parity drive were a dedicated one. – pfo Dec 29 '11 at 10:04
  • @pfo: Do you have a link for that? I originally thought it was as you describe, but now I'm not sure, and since I don't use RAID5, it's not something I know offhand. – Kendall Dec 29 '11 at 16:32