I've set up 4 distributed-replicated servers with GlusterFS on top of XFS partitions on Hyper-V Server (dynamic VHDX) virtual disks. The NICs are 6×1 Gbit (teamed on Hyper-V), and I share the volume to Windows clients through Samba. The problem I run into is really bad performance with lots of small files (read and write): with many ~10 KB files the transfer rate drops to around 300 KB/s, and it is not much faster on the native client either. Is there any solution to this problem, or is my configuration bad? Large file transfers are fine and utilize all the bandwidth.
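For reference, the volume layout is roughly the following (hostnames, brick paths and the volume name below are placeholders, and I'm assuming the usual 2×2 distributed-replicate arrangement across the four servers):

```
# Hypothetical recreation of the setup described above: 4 bricks, replica 2 (2x2 distributed-replicate)
gluster volume create datavol replica 2 \
    gfs1:/bricks/brick1 gfs2:/bricks/brick1 \
    gfs3:/bricks/brick1 gfs4:/bricks/brick1
gluster volume start datavol

# The FUSE mount that Samba then exports to the Windows clients
mount -t glusterfs gfs1:/datavol /srv/datavol
```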
2 Answers
At one point I attempted to use GlusterFS for web application deployment and for sharing a large base of user-uploaded files between several servers. I spent probably a good 4 months trying to get the speed reasonable, but I never could. You can tweak it for roughly a 25-40% speed increase if you really try, but it still won't be fast enough.
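To give an idea of the kind of tweaking I mean, it was mostly client-side performance translator options along these lines (the option names are standard GlusterFS 3.x ones, but the values and the volume name are purely illustrative):

```
# Illustrative small-file tuning on a GlusterFS 3.x volume ("datavol" is a placeholder)
gluster volume set datavol performance.cache-size 256MB
gluster volume set datavol performance.io-thread-count 32
gluster volume set datavol performance.write-behind-window-size 4MB
gluster volume set datavol performance.quick-read on
gluster volume set datavol performance.stat-prefetch on
```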
I forget the exact technical details, but the GlusterFS protocol is very verbose, even on read-only workloads. As Danila said, you are better off using the NFS protocol through Gluster if you want small-file sharing. The huge downside of that is that it is NFS.
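Concretely, that just means mounting the volume through Gluster's built-in NFSv3 server instead of the FUSE client; a rough sketch, with hostname, volume name and mount point as placeholders:

```
# Native FUSE client mount
mount -t glusterfs gfs1:/datavol /mnt/datavol

# Same volume via Gluster's built-in NFSv3 server instead
mount -t nfs -o vers=3,mountproto=tcp,nolock gfs1:/datavol /mnt/datavol
```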
One other option to look at is Ceph. It's developing quickly and is quite usable on the latest Ubuntu kernels.
To be honest though, I'd recommend ditching a shared FS if you can. You'll thank me later.

- But I need the data to be replicated. So what do you suggest instead? What do you think about ZFS with a ramdisk cache and DRBD for replication? – piotrektt Sep 04 '13 at 20:22
- You can try GFSv2, but you will need to deal with all the clustering. Lustre might be a solution as well. – Danila Ladner Sep 04 '13 at 20:23
- Do you have any experience with Lustre? Is it much faster? – piotrektt Sep 04 '13 at 20:42
- I have no experience with Lustre or ZFS. Another option, if you can use something other than a mountable FS, would be something along the lines of Riak. Riak and some other NoSQL databases can operate very well at scale and handle replication and failover very well. – Kyle Sep 04 '13 at 20:52
The GlusterFS native FUSE client is terrible with a large amount of small files. You can also try NFS with GlusterFS. I also do not think XFS partitions give you any advantage over native ext4 in this setup. You can read some more info in this article:
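A rough sketch of how you would check that the built-in NFS server is serving the volume before mounting over NFS (the volume name and hostname are placeholders):

```
# Built-in NFS is enabled by default in 3.x; this just makes it explicit
gluster volume set datavol nfs.disable off

# Check that the Gluster NFS server is running and exporting the volume
gluster volume status datavol nfs
showmount -e gfs1
```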

- XFS is better with previous Gluster versions because of the ext3/4 Gluster bug. – Jure1873 Sep 04 '13 at 21:35
- Well, I have only used 3.1. At that time there was no bug for ext4, at least not to my knowledge. – Danila Ladner Sep 04 '13 at 21:52
- I will give it a try tomorrow (ext4 and fewer virtual processors). I will post the result here. – piotrektt Sep 04 '13 at 21:58
- With small files, FUSE is your bottleneck, plus the replication process to all 4 nodes (or 2, depending on how you configured the bricks). – Danila Ladner Sep 04 '13 at 23:06
- I've also had a lot of problems with small files (mail storage) and FUSE; NFS is definitely faster. As for the ext4 bug, you should use ver. 3.3.2 or 3.4 (https://bugzilla.redhat.com/show_bug.cgi?id=838784) – Jure1873 Sep 06 '13 at 07:13