I'm stuck trying to set up a distributed file system.
I have:
- 2x 8TB drives
- 2x 4TB drives
- 2x 2TB drives
- 2x 1TB drives
I've been looking into GlusterFS, and the most obvious direction for me seems to be a replicated volume. But from what I understand, replica 2 means you specify two disks of equal size at a time, so that one becomes a replica of the other.
I've also seen mention of the disperse volume type in GlusterFS, but drives of equal size are recommended for it. From my understanding, disperse becomes more space-efficient than plain replication as more disks are added.
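To make the comparison concrete, here is a rough sketch of the capacity arithmetic as I understand it, in Python. It assumes that a replica 2 pair stores only as much as its smaller drive, that every brick in a disperse set is effectively capped at the size of the smallest brick, and that a disperse set needs more than twice as many bricks as its redundancy count. The function names are my own, and none of this has been checked against a real cluster.

```python
# Rough usable-capacity arithmetic; all sizes are in TB.
drives = [8, 8, 4, 4, 2, 2, 1, 1]


def replica2_usable(sizes):
    """Pair drives in size order; each pair holds as much as its smaller member."""
    ordered = sorted(sizes, reverse=True)
    return sum(min(a, b) for a, b in zip(ordered[0::2], ordered[1::2]))


def disperse_usable(sizes, redundancy):
    """Assumption: every brick is capped at the size of the smallest brick."""
    count = len(sizes)
    if count <= 2 * redundancy:
        raise ValueError("need more than 2 * redundancy bricks")
    return (count - redundancy) * min(sizes)


print(replica2_usable(drives))            # 15
print(disperse_usable(drives, 2))         # 6
print(disperse_usable([8, 8, 8, 8], 1))   # 24
```

If those assumptions hold, replica 2 gives me 15TB usable out of 30TB raw, putting all eight mismatched drives into a single disperse set would waste most of the space, and four 8TB drives with one redundancy brick would give 24TB.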
So my questions are:
Am I over complicating this?
Should I just stick to replica 2?
I could buy 2 more 8TB drives to replace all the smaller ones, which would make a parity setup much simpler, but that is a large financial hit at the moment.
If getting 2 extra 8TB drives is the most logical route to a RAID 5/6-like setup, is there any more information on the dispersed volume type in Gluster? Searching Google doesn't turn up as many helpful guides as it does for the more established volume types.
Or am I being a n00b, and should I just buy the 2 8TB drives and use hardware RAID or ZFS?
My only other comment is that if an 8TB drive fails, it's much more costly to replace than a smaller drive, so I've gotten myself a little stuck with the initial 8TB purchases. Please consider the financial repercussions in your answer. I'm aware that matching drive capacities would be the correct way to go about it, but I'd rather not spend a lot of money and then be left with drives that aren't doing anything.
These drives hold a very large collection of media (Plex media storage, home videos, etc.). All my data is also backed up to CrashPlan using encfs' reverse encryption for a bit of added security, so this question is not about backing up data, just the "best case" redundancy for my situation.
Thank you