In any distributed filesystem, one of the most crucial aspects when we consider operations on small files is network latency - it should be as small as possible (like 0.1 ms) between such distributed filesystem components. The best way to achieve it is to use reliable switch and connect all machines to the same switch.
Also, in distributed filesystems (especially in MooseFS) the best thing is scalability - it means, that the more nodes you have (and the more your calculations are distributed, i.e. done simultaneously on more than one mount), the faster the cluster is.
If you use MooseFS, please check out MooseFS 3.0, because operations on small files are improved since 3.0 version. This is an easy way for now, because you don't have to make a "revolution" (before upgrade remember to backup the /var/lib/mfs on Master Server - i.e. metadata). MooseFS can handle small files well, so maybe there's a problem in configuration?
In MooseFS additionally (still considering small files operations), one of the most important things is to have high CPU clock (like e.g. 3.7 GHz) with small amount of CPU cores and disabled energy saving options in BIOS for Master Server (because Master Server is a single-threaded process). For Chunkservers and Clients situation is different - they are multi-threaded, so you'll get better results while using multicore CPUs.
Additionally, as stated in MooseFS Best practices in paragraph 4. "Virtual Machines and MooseFS":
[...] we do not recommend running MooseFS components (especially Master Server(s)) on Virtual Machines.
So if you run MFS on VMs, you in fact may have poor results.