
I want to build an HA web service and I was planning to use GlusterFS on three nodes (with replica 3).

My plan was to install the web server directly on the Gluster nodes.

Is this a viable solution or is there a strong reason to use dedicated Gluster nodes?

Thank you.

P.

P. P.

3 Answers


That is not a problem at all, but keep the following in mind.

Speed of NFS/FUSE

In another thread, it is stated:

From my experience, the performance differences are huge. After switching my web app from FUSE to NFS load times decreased from 1.5 - > 4 seconds to under 1 second. Also I tried extracting some archives today and it seems to take 4-5 times longer on FUSE

A benchmark example is here:

Benchmark1 Benchmark2

Improving FUSE speed:

On the Gluster mailing list there are two hints about improving speed with negative-timeout:

mount -t glusterfs -o negative-timeout=1,use-readdirp=no,log-level=WARNING,log-file=/var/log/glusterxxx.log 192.168.140.41:/www /var/www
mount -t glusterfs -o use-readdirp=no,log-level=WARNING,log-file=/var/log/glusterxxx.log 192.168.140.41:/www /var/www

So it means only 1 second negative timeout...

In this particular test:

./smallfile_cli.py --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 50000 --file-size 64 --record-size 64

The result is about 4 seconds with the negative timeout of 1 second defined and many many minutes without the negative timeout (I quit after 15 minutes of waiting)

AND

PS: I already found out that for this particular test all the difference is made by negative-timeout=600; when removing it, it's much, much slower again.
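To try the two variants from the mailing-list post side by side, the mount command can be assembled from its parts. This is only a minimal sketch: the function name `gluster_mount_cmd` is hypothetical, it prints the command instead of running it, and the log-file option from the quoted commands is left out for brevity. The server, volume and mount point are the ones quoted above.

```shell
# Hypothetical helper: print (do not run) a glusterfs FUSE mount command.
# $1 = server:/volume, $2 = mount point, $3 = negative-timeout in seconds
# (leave $3 empty to omit the option, as in the slow variant above).
gluster_mount_cmd() {
    opts="use-readdirp=no,log-level=WARNING"
    if [ -n "${3:-}" ]; then
        opts="negative-timeout=${3},${opts}"
    fi
    printf 'mount -t glusterfs -o %s %s %s\n' "$opts" "$1" "$2"
}

# With and without the negative lookup cache, for the volume used above.
gluster_mount_cmd 192.168.140.41:/www /var/www 1
gluster_mount_cmd 192.168.140.41:/www /var/www
```

Running the printed commands one after the other (unmounting in between) and repeating the same smallfile test against each should show whether the option matters for your own workload.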

Bash Stack
  • an NFS mount will require a floating (or virtual IP) to provide failover, which will complicate things (you need a cluster resource manager) – shodanshok Jun 19 '20 at 16:58
  • Thank you. Are there any performance comparisons between NFS and Gluster native client mounts? – P. P. Jun 19 '20 at 17:14
  • @P.P. answer updated. There is an outdated benchmark at https://docplayer.net/docs-images/44/15088228/images/page_20.jpg ; small-file performance WAS indeed better back then, but basically the locking (PHP session) folder requires quick lock->write->unlock operations at least. When you are "just reading", be sure to have enough RAM and use more than one exported volume ;) – Bash Stack Jun 19 '20 at 17:36
  • @shodanshok inserted CTDB into answer. It's not too hard; having 2-4 times better performance for years should be worth 10 minutes of setup ;) – Bash Stack Jun 20 '20 at 10:11
  • While I agree that a cluster resource manager is not impossibly hard to configure and manage (especially a very simple and focused one such as CTDB), the OP should be sure to understand its possible caveats. So while I agree with (and upvoted) your answer, he should benchmark his workload to understand how it performs on both FUSE and NFS mounts before taking that path. – shodanshok Jun 20 '20 at 11:02
  • @shodanshok thx .. (also, backupvolfile-server is not implied by default, so one would have to tweak either way) – Bash Stack Jun 20 '20 at 17:15
  • It seems that fine-tuning the Gluster mount options (e.g. setting negative-timeout) can really give huge benefits compared to the default settings, as reported here: https://lists.gluster.org/pipermail/gluster-users/2017-July/031788.html But I have not tested this myself yet. – P. P. Jun 21 '20 at 18:34
  • @P.P. answer updated ;) – Bash Stack Jun 23 '20 at 10:27

Yes, you can run GlusterFS nodes directly on your web server instances, but keep in mind that it can use a lot of CPU, taking away CPU resources from your web application. You should test your app to see if it will have sufficient CPU and other resources to run converged with GlusterFS; if not, you should upgrade the hardware or use dedicated GlusterFS nodes.
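One rough way to check the CPU share is to sum what `ps` reports for the Gluster processes while the web application is under load. This is a sketch under assumptions: the helper names `sum_pcpu` and `gluster_cpu` are made up for illustration, `ps -C` is the Linux procps syntax, and the process names (`glusterd`, `glusterfsd`, `glusterfs`) are the usual daemon names, so adjust them if your installation differs.

```shell
# Sum a column of CPU percentages read from standard input.
sum_pcpu() {
    awk '{ total += $1 } END { printf "%.1f\n", total }'
}

# Hypothetical helper: total CPU share of the Gluster processes on this node.
# Note: ps reports CPU time divided by process lifetime, not an instantaneous
# value, so sample it repeatedly while the web application is under load.
gluster_cpu() {
    ps -C glusterfsd,glusterfs,glusterd -o pcpu= | sum_pcpu
}
```

Calling `gluster_cpu` every few seconds during a load test gives a first impression of how much headroom is left for the web server; a tool such as `top` or `pidstat` will give more precise interval figures.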

Michael Hampton

Using a shared file system is one of the options for maintaining a highly available web service or site, but it is not really the best one.

The main benefit of doing so comes when you have many uploads and writes on your service. The best practice is to split off the part of your service that handles uploads so that it writes to a specific folder, then use Gluster to replicate just THAT folder. Static files could be served using a CDN or a cache server; small amounts of user data could be stored in databases like SQL and Redis, in distributed cache servers like memcached, or even in distributed object storage services like min.io.
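As an illustration of replicating only the uploads folder, the volume could be created roughly like this. This is a sketch, not a tested setup: the helper name `uploads_volume_cmds`, the node names `web1` to `web3` and the brick path `/data/bricks/uploads` are all hypothetical, and the function only prints the `gluster` commands for review instead of executing them.

```shell
# Hypothetical helper: print the commands that would create and start a
# replica-3 volume holding only the uploads folder. Nothing is executed.
# $1 = volume name, $2 = brick path, remaining args = the three node names.
uploads_volume_cmds() {
    volume=$1
    brick=$2
    shift 2
    bricks=""
    for node in "$@"; do
        bricks="${bricks} ${node}:${brick}"
    done
    printf 'gluster volume create %s replica 3%s\n' "$volume" "$bricks"
    printf 'gluster volume start %s\n' "$volume"
}

uploads_volume_cmds uploads /data/bricks/uploads web1 web2 web3
```

The rest of the application (code, static assets) would then stay on local disk on each node and be deployed the usual way, so only upload traffic goes through Gluster.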

Aref Riant