
I have 4 identical servers, each with 1+3 SSDs (the 3 for data, no RAID) and 2x10G + 2x1G networking. I want an active-passive NFS setup with a single export. I will probably go with 2 two-node clusters, i.e. 2 active NFS servers/exports, mounted by many clients (as home directories) via autofs, so which server holds the data is not important (always having one cluster running while the other is down for an upgrade, migration or failure is a plus); both will be backed up. I was initially thinking of going with Ceph, but:

  • exporting via NFS while keeping user/group/project quotas is, I think, impossible(?) (the kind of quota setup I mean is sketched right after this list)
  • mounting CephFS directly from the clients complicates things (isolation, access control, etc.)
  • plus replica 2 (DRBD) vs replica 3 (Ceph) = more usable capacity.
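
(To show what I mean by u/g/p quotas, and what a plain XFS-on-DRBD export would keep working; the device, mount point and limits below are just placeholders:)

    # mount XFS with user, group and project quota accounting enabled
    mount -o uquota,gquota,pquota /dev/vg_home/lv_home /export/home

    # set per-user soft/hard block limits for one account
    xfs_quota -x -c 'limit bsoft=40g bhard=50g alice' /export/home

    # report current usage against the limits
    xfs_quota -x -c 'report -h' /export/home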

I still don't understand the relationship between the layers, which layout is better, and the [dis]advantages of each:

  1. 1x resource with 3x volumes (one per SSD) = 3x DRBD devices > 1x LV > XFS
  2. 1x resource with 1x volume on 1 LV spanning the 3 SSDs = 1x DRBD device > [LV? >] XFS [link]

I believe it's analogous to RAID 0+1 vs 1+0 (except that nothing is striped here), so option (1), replicating individual disks (multiple meta-disks), is better than replicating a single volume (a single meta-disk whose failure affects everything), at least when a disk fails ... I think? It all comes down to how a disk failure is handled, right? Any other suggestions or points of view?
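
For reference, this is roughly how I picture option (1); the node names, addresses and disk paths are made up and untested:

    # /etc/drbd.d/home.res -- option (1): one resource, three volumes, one per data SSD
    resource home {
      volume 0 { device /dev/drbd0; disk /dev/sdb; meta-disk internal; }
      volume 1 { device /dev/drbd1; disk /dev/sdc; meta-disk internal; }
      volume 2 { device /dev/drbd2; disk /dev/sdd; meta-disk internal; }

      on nfs1 { address 10.0.0.1:7789; }
      on nfs2 { address 10.0.0.2:7789; }
    }

    # on the current primary: aggregate the three DRBD devices into one LV, then XFS
    pvcreate /dev/drbd0 /dev/drbd1 /dev/drbd2
    vgcreate vg_home /dev/drbd0 /dev/drbd1 /dev/drbd2
    lvcreate -l 100%FREE -n lv_home vg_home
    mkfs.xfs /dev/vg_home/lv_home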

Bog

1 Answer


With 4 servers, all SSDs, and no RAID, your configuration simply begs for a Ceph erasure-coded setup!

  1. Create an erasure-coded (E/C) pool (assuming you already have a Ceph cluster):

https://subscription.packtpub.com/book/cloud-&-networking/9781784393502/8/ch08lvl1sec89/creating-an-erasure-coded-pool
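
Something along these lines; the profile name and k/m values below are only an example, and with 4 hosts you cannot go wider than k+m=4 with a per-host failure domain:

    # EC profile: 2 data + 2 coding chunks, one chunk per host
    ceph osd erasure-code-profile set ec-22 k=2 m=2 crush-failure-domain=host

    # create the pool with that profile; overwrites must be enabled for CephFS/RBD on EC
    ceph osd pool create cephfs_data_ec erasure ec-22
    ceph osd pool set cephfs_data_ec allow_ec_overwrites true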

  2. Put NFS services on top of the E/C pool created in (1):

https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/5/html/dashboard_guide/management-of-nfs-ganesha-exports-on-the-ceph-dashboard
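
The linked guide does this from the dashboard; the CLI equivalent looks roughly like this (names are placeholders, and the exact ceph nfs syntax differs a bit between releases, so check ceph nfs export create cephfs -h on your version):

    # CephFS filesystem, NFS-Ganesha service on two hosts, export at pseudo path /home
    ceph fs volume create homes
    ceph nfs cluster create homes "host1,host2"
    ceph nfs export create cephfs homes homes /home

    # attach the EC pool from step (1) as an additional data pool
    # (directories can then be pointed at it via the ceph.dir.layout.pool xattr)
    ceph fs add_data_pool homes cephfs_data_ec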

Do not use DRBD! It's storage-inefficient due to enormous replication overhead, and "split brain" is DRBD's second name…
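
And if you go the DRBD route anyway, this is the kind of manual split-brain surgery you are signing up for (the resource name is a placeholder; which node's data to discard is your call):

    # on the node whose changes you are willing to throw away (the "victim")
    drbdadm disconnect home
    drbdadm secondary home
    drbdadm connect --discard-my-data home

    # on the surviving node (only needed if it went StandAlone)
    drbdadm connect home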

RiGiD5
    On four nodes, a Ceph cluster plus NFS on top makes the most sense. Totally agree on DRBD: I had frequent issues in my cluster with drives getting into an inconsistent state on one of the nodes, then trying to resync manually, and then finding out several times that there was actually a split brain. If going for two separate two-node clusters, it's better to use the free StarWind version for storage. – Strepsils Jan 15 '23 at 09:18