
I have a website that is tracking towards an average of 50 million hits per day, and within the next 3 months it should be over 100 million hits per day. We are trying to use GlusterFS v3.0.0 (with the latest patches as of 1-17-2010).

We have just upgraded to a load-balanced environment with 3 physical hosts running 6 XenServer 5.5u1 VMs (2 on each host) to serve webpage traffic. Each machine has 6 local 7200 RPM SATA drives in RAID 6. The old machine we came from had a mirrored pair of 10k SAS drives.

We also set up GlusterFS with 3 bricks, one on each host, serving the 6 VMs as clients. In testing, everything seemed fine. However, when we went to production, there just weren't enough IOPS available to serve even 15M hits' worth of traffic. Weeks prior, our old server, maxed out, was able to handle 20M.

Are there any recommended configurations for such an application, or things to be aware of that aren't apparent in the documentation at gluster.org, for a site our size?

JamesRyan
  • I suspect this has more to do with moving from 10K SAS drives to 7.2K SATA drives than with GlusterFS. – Zypher Jan 23 '10 at 21:03
  • 1 10k SAS drive by no means offers more IOPS than 6 7.2k drives. – Jan 23 '10 at 21:38
  • What are you using GlusterFS for? What type/size of files are you storing on it? – James Apr 16 '10 at 13:11
  • Yeah. Nothing in your description says you need a distributed file system. They come with a cost and should normally be avoided. – TomTom Dec 19 '14 at 11:39

4 Answers


RAID-6 of 6x 7.2k rpm drives with no write cache (?) is going to have terrible write performance, so terrible that it will probably bog the disks down enough to really hurt read performance too if your app has a healthy read/write mix. Realistically you're looking at something like 250 random IOPS out of that array in an 80/20 read/write split. If you're doing several hundred HTTP requests per second, then something as trivial as the Apache access log is going to bog that down like a DoS attack.
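
To sketch the arithmetic behind that estimate (assuming roughly 80 random IOPS per 7.2k SATA drive and the usual RAID-6 penalty of 6 disk operations per random write):

    raw capacity:     6 drives x ~80 IOPS      = ~480 disk IOPS
    80/20 mix cost:   (0.8 x 1) + (0.2 x 6)    = 2 disk ops per host I/O
    effective:        480 / 2                  = ~240 random IOPS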

If you can, redo those arrays as RAID 10. It'll cost you some raw space but make a huge difference to I/O performance. And if you can get battery-backed write cache on the RAID cards, that makes a very large difference too.
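
Controller syntax varies by vendor, but just to illustrate the shape of it with Linux software RAID (the device names /dev/sdb through /dev/sdg are assumptions, adjust for your hardware):

    # create a 6-disk RAID 10 array and put a filesystem on it
    mdadm --create /dev/md0 --level=10 --raid-devices=6 /dev/sd[b-g]
    mkfs.ext3 /dev/md0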

I'm not familiar with GlusterFS in particular, but all distributed filesystems tend to share the same basic problem: network latency + complex locking = poor performance, especially on small files and especially on write-heavy workloads.

Slow disk I/O plus a slow filesystem: this cluster design simply does not fit the workload. Is it too late to return the servers, or at least the disk subsystems? If this is the primary platform of a company with substantial revenue, you really should engage a professional.

cagenut

What medium are you moving your GlusterFS traffic over? If it's Ethernet, your configuration will be severely limited by TCP/IP overhead; GlusterFS is not at its most efficient there. Where it really shines is over RDMA, which you can get with either InfiniBand or 10GigE.
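
As a rough sketch of where that lives in a 3.0-style client volfile (the volume, host, and brick names here are placeholders, not your actual config):

    volume remote1
      type protocol/client
      option transport-type ib-verbs   # 'tcp' is the usual default; ib-verbs needs InfiniBand hardware
      option remote-host server1
      option remote-subvolume brick
    end-volume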

I'm also a bit unclear on why you decided to put 2 VMs on each physical host if they're all doing the same duties. Why not just run them on the bare metal and avoid the virtualization overhead?

Kamil Kisiel
  • VMs are much more flexible and quicker to manage and back up than bare metal. InfiniBand and 10GigE are not a viable or cost-effective solution at this moment in time. Currently it is on 1GigE connections. – Jan 23 '10 at 20:21

What version of GlusterFS are you using? GlusterFS 3.0.0 is a major release with many improvements, including an increase in small-file performance.

There are many performance translators in GlusterFS that can be tuned for various workloads. For example, for increasing read performance there is the read-ahead translator, for write performance the write-behind translator, and io-cache is another performance translator that can be used for caching.
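
As a rough sketch of how these stack in a client volfile (the option values are illustrative starting points, not tuned recommendations, and 'remote1' is a placeholder subvolume):

    volume wb
      type performance/write-behind
      option cache-size 4MB            # buffer writes before flushing to the server
      subvolumes remote1
    end-volume

    volume ra
      type performance/read-ahead
      option page-count 4              # pages to prefetch on sequential reads
      subvolumes wb
    end-volume

    volume iocache
      type performance/io-cache
      option cache-size 64MB           # read cache
      option cache-timeout 1           # seconds before cached data is revalidated
      subvolumes ra
    end-volume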

What type of setup is yours? Are you using replicate, distribute, or both? What is your network backend? Have you benchmarked network and disk I/O on both the old and the new servers to rule out bottlenecks?
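
For a quick first pass at that benchmarking (paths and hostnames below are placeholders), something like this gives you numbers to compare between the old and new machines:

    # raw disk write throughput on a brick, bypassing the page cache
    dd if=/dev/zero of=/data/brick/testfile bs=1M count=1024 oflag=direct

    # network throughput between a client VM and a brick host
    iperf -s                # on the brick host
    iperf -c brick-host-1   # on the client VM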

If you can share your volume files with us, we can help you tune your configuration files for optimum performance for your workloads.

Just an FYI, we offer a 30-day free trial support subscription[1] where you can get your queries answered quickly and in depth.

Cheers, Sachi

[1] http://www.gluster.com/products/trial.php


Without more insight into your setup (e.g. is your website static or dynamic? Do database transactions take place on the servers using the same storage subsystem?) it's hard to be specific, but RAID 6 is generally a bad choice for write performance, never mind when you introduce even more complexity through Gluster. You potentially have two layers of write-stripe translation going on, one at the Gluster level and one at the controller level. Then you have two parity calculations, which slow things down and cause I/O blocking unless you have a large write cache and periods of low I/O activity.

I'd recommend you switch to RAID 10 and back it with either Fibre Channel or multiple bonded GigE links.
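
To illustrate the bonding side (the interface names and RHEL-style file layout are assumptions for whatever distro you run, and 802.3ad mode also needs LACP support on the switch):

    # /etc/modprobe.conf
    alias bond0 bonding
    options bond0 mode=802.3ad miimon=100

    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    IPADDR=10.0.0.10
    NETMASK=255.255.255.0
    ONBOOT=yes
    BOOTPROTO=none

    # /etc/sysconfig/network-scripts/ifcfg-eth0 (repeat for eth1)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none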