
I've seen many tutorials recommending GlusterFS for distributed web application hosting. Is this really the best practice for small-scale load balancing?

In our tests, latency seems to be what is slowing us down. Right now we have three nodes, each running nginx. All of them are GlusterFS servers with the volume mounted at /var/www. We don't mind keeping a copy of the PHP files on every server, but perhaps there is a better way to sync changes than Gluster?

Nate
  • Something like https://aws.amazon.com/efs/ perhaps? There are plugins to offload things like user-uploaded assets to S3 that might work, as well. – ceejayoz Feb 07 '19 at 21:47
  • Thanks for the suggestion @ceejayoz, but I'm looking to scale Wordpress itself horizontally (multiple servers to run PHP). EFS looks like it might work, but I was hoping for an open source solution I could implement on any cloud service. – Nate Feb 08 '19 at 00:26
  • I once ran a multi-server WP install with the WP files checked into a Git repository and deployed on multiple servers, and the `wp-uploads` folder as a mounted S3 bucket. Had to avoid the "edit the PHP files via WordPress's interface" thing, but otherwise it worked quite well. – ceejayoz Feb 08 '19 at 02:09

1 Answer


This is an old question, so maybe you've already solved this problem. In case you didn't, here's my view about this issue.

We're running about 40 websites (mostly WordPress, all PHP-based) on GlusterFS, and we've hit the same latency problems you mention. We've set up aggressive caching in both GlusterFS and PHP-FPM, but it's still slow. Very slow, compared to the performance you'd get from the VMs' local disks.

There is no silver bullet, I'm afraid. However, there are a few techniques that we've tried or plan to try, and that may help:

  • Use Redis as a cache. For this you need a plugin that supports it, such as W3 Total Cache for WordPress. I tried it in our lab and it's easy to configure, but I couldn't test its effect on the performance of the production systems.
  • Use Ansible, SaltStack or the like to deploy the code to the VMs [1], keeping GlusterFS only for the directories that need to be shared. For WordPress, this could be the uploads directory. This option needs more maintenance and some knowledge of those tools, but you get the best of both worlds: performance (because PHP is read from local disks) and shared data.
  • You can also use plain rsync to deploy the code to the VMs. It's a little primitive, but it works. However, I would recommend Ansible/SaltStack because they can also reload services or invalidate caches after the deployment.
  • Use a CDN, such as AWS CloudFront, to take static assets off the web servers.
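
As a rough illustration of the rsync route, here is a minimal Python sketch that builds one rsync command per node and leaves the shared uploads directory alone. The host names, the source path and the `wp-content/uploads/` location are assumptions made up for the example (only `/var/www` comes from the question), so adjust them to your setup. The block only prints the commands; `deploy()` is defined but not called.

```python
import shlex
import subprocess
from typing import List

# Hypothetical values for illustration: replace with your own hosts and paths.
NODES = ["web1.example.com", "web2.example.com", "web3.example.com"]
SOURCE_DIR = "/srv/release/current/"   # trailing slash: copy the contents, not the directory
TARGET_DIR = "/var/www/"               # mount point mentioned in the question
SHARED_DIRS = ["/wp-content/uploads/"]  # assumed to stay on the shared filesystem


def build_rsync_command(host: str, dry_run: bool = False) -> List[str]:
    """Return the rsync argument list that pushes the code to one node."""
    # -a keeps permissions and timestamps, -z compresses, --delete removes
    # files on the node that no longer exist in the release.
    cmd = ["rsync", "-az", "--delete"]
    for shared in SHARED_DIRS:
        # Excluded paths are neither transferred nor deleted on the receiver.
        cmd.append(f"--exclude={shared}")
    if dry_run:
        cmd.append("--dry-run")
    cmd += [SOURCE_DIR, f"{host}:{TARGET_DIR}"]
    return cmd


def deploy(dry_run: bool = True) -> None:
    """Run rsync against every node, stopping at the first failure."""
    for host in NODES:
        subprocess.run(build_rsync_command(host, dry_run=dry_run), check=True)


# Print the commands for review instead of executing them.
for host in NODES:
    print(" ".join(shlex.quote(part) for part in build_rsync_command(host, dry_run=True)))
```

Calling `deploy(dry_run=True)` first shows what rsync would change without touching the nodes. Reloading PHP-FPM or clearing caches afterwards is not covered here, which is the gap Ansible or SaltStack would fill.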

I've used EFS in the past and it had the same problems (maybe not as slow, but not much faster). Network filesystems are always prone to latency, and the kind of operations that PHP frameworks do (like searching for the same file in several directories or accessing a lot of small files) is hard on them. In a deployment years ago we ended up going the rsync route because the website was very slow when loading directly from EFS.
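
To make the small-files point concrete, here is a back-of-the-envelope sketch in Python. All the inputs are made-up, illustrative numbers, not measurements from our systems or from any benchmark, and the function name is hypothetical. It only shows how a per-lookup latency gets multiplied by the number of files and the number of directories searched.

```python
def lookup_overhead_us(files_per_request: int, dirs_searched: int, latency_us: int) -> int:
    """Worst-case microseconds spent on path lookups for one request.

    Assumes every file is probed in every candidate directory and that
    nothing is cached, which real setups (OPcache, realpath cache) improve on.
    """
    return files_per_request * dirs_searched * latency_us


# Illustrative, assumed inputs: not measurements from any real system.
FILES_PER_REQUEST = 200
DIRS_SEARCHED = 3
LOCAL_LATENCY_US = 10
NETWORK_LATENCY_US = 500

for label, latency in (("local disk", LOCAL_LATENCY_US), ("network fs", NETWORK_LATENCY_US)):
    total_ms = lookup_overhead_us(FILES_PER_REQUEST, DIRS_SEARCHED, latency) / 1000
    print(f"{label}: {total_ms:.1f} ms per request")
```

The exact figures don't matter; the point is that the overhead grows with the product of the three factors, so even a modest per-operation latency becomes visible on every page load.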

The biggest problem with a mixed deployment (code on the VMs, other files in shared storage) is that you have to provide a straightforward way for developers to deploy code on their own, or you'll have to babysit every deployment. It can be done, but it needs more work than just a shared filesystem for everything.

Sorry for not addressing your problem more directly. Hope this helps.

[1] Ansible has a "synchronize" module that uses rsync, and that's the one you want to use. The "copy" module takes much longer. The same applies to SaltStack: there is an "rsync" state that is faster than plain "file.managed".

rsuarez