
I have a controller server and two 24-bay SAS fileserver arrays. Each fileserver array is set up as a ZFS pool with four RAIDZ2 vdevs of six drives each.
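
For reference, each pool was built roughly along these lines (a sketch; the device names are placeholders for the actual SAS disks):

zpool create tank \
    raidz2 sdb sdc sdd sde sdf sdg \
    raidz2 sdh sdi sdj sdk sdl sdm \
    raidz2 sdn sdo sdp sdq sdr sds \
    raidz2 sdt sdu sdv sdw sdx sdy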

Ultimately, I want to use the controller server as a single mount point for the two fileservers. The main idea is that my end users only have to access the one controller server and one mount point in order to access and back up their data across the combined storage capacity of the two servers. I was thinking of using glusterfs for this, but the information I've found so far focuses on building redundant storage pools. In those setups, gluster acts as a distributed mirror (effectively RAID1 across servers), which doesn't fit my need, since it would add a layer of redundancy I don't want.

How would you suggest I go about creating a non-redundant distributed filesystem from the two ZFS pools hosted on separate computers? Is there a way to accomplish this with gluster, or is there a different network filesystem better suited to the task? I've considered samba as well, but I'm not sure it will be secure enough. I genuinely liked the prospect of only having ssh publicly exposed on the controller, with strict 2FA authentication.

  • Maybe I'm just overthinking this, because when I look at the documentation for gluster, it seems that the default volume construction does not do any sort of replication: e.g., gluster volume create test-volume server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4 – kjskjk kjdksjk Mar 04 '17 at 00:22
  • https://cloud.githubusercontent.com/assets/10970993/7412364/ac0a300c-ef5f-11e4-8599-e7d06de1165c.png – kjskjk kjdksjk Mar 04 '17 at 00:22

1 Answer


You'll need to decide whether you want a distributed or a striped volume.

Distributed volumes are simple: each file written is hashed to a single node, spreading files across the bricks as evenly as possible. Nothing is done to the files themselves, and you'll see intact files on each ZFS "brick". Striped volumes instead cut your files up and distribute them between nodes as chunks. That is a desirable configuration if your content is almost exclusively large files (such as videos, disk images, and backups).

You're just about on the nose thus far, as the command to create such a distributed volume is as simple as you've stated:

gluster volume create test-volume server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4
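
For your two-server case, after peering the boxes, the equivalent would be (the brick paths are placeholders; point them at datasets inside your ZFS pools):

gluster peer probe server2
gluster volume create test-volume server1:/tank/brick server2:/tank/brick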

While a striped volume is created thusly:

gluster volume create test-volume stripe 4 server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4

HOWEVER, four stripes can cause a lot of overhead when fetching a file. You'd only do that when working with truly massive files. For a good compromise that will still work for large disk images and the like, I would suggest a distributed striped volume:

gluster volume create test-volume stripe 2 server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4

This volume will stripe each file across a pair of nodes, and distribute files between the two striped pairs.
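
Whichever layout you pick, you'd then start the volume and mount it once on the controller, which gives your users their single entry point (the mount point and hostname here are just examples):

gluster volume start test-volume
mkdir -p /mnt/storage
mount -t glusterfs server1:/test-volume /mnt/storage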

Spooler
  • Thank you for the suggestion. I think I'll test both no striping and stripe 2. Just one follow-up on the striping option: say I stripe 2 and one of the bricks goes down in a two-brick system, then the entire glusterfs volume goes down, correct? Whereas with no striping, if one goes down the other could still be mounted and accessible. Probably very unlikely with an underlying 4-vdev RAIDZ2 array on both bricks, but still just curious for posterity's sake. – kjskjk kjdksjk Mar 04 '17 at 01:24
  • Yes, ripping a brick out of a striped volume will cause the volume to freeze and become inaccessible until the brick is replaced intact. Distributed will do almost the same thing, but you'll be losing half of your files instead of all of them. Not awesome either. In this case, you're providing data integrity via the RAID abstractions on your bricks. This is fine at small scale, or when restoring the data set from backups isn't a big deal at large scale. – Spooler Mar 04 '17 at 01:28