
We have 3 folders on an Ubuntu 14.04 machine, each containing 250K pictures of 2KB-30KB, and we expect each directory to grow to 1M files.

While trying to scale the application to several servers, we are looking into GlusterFS for shared storage. While 250K files per directory are not a problem on ext4, they seem to be problematic for GlusterFS: trying to copy the files crashes the machine entirely.

I am looking to partition the files into two levels of directories:

mkdir -p {000..255}/{000..255}

/000/000/filename
/001/000/filename
/001/001/filename
...

Does this sound like a reasonable approach? The entire structure will contain millions of files later on. Would this allow GlusterFS to run reliably in production with good performance while hosting millions of files?
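
For illustration, here is a minimal shell sketch of one way to map a file into such a two-level layout, bucketing by an MD5 hash of the file name (the hash choice is just one option, not part of the setup above):

f="example.jpg"                          # example file name
h=$(printf '%s' "$f" | md5sum)           # hash the name to spread files evenly
d1=$(printf '%03d' $((16#${h:0:2})))     # first hex byte  -> 000-255
d2=$(printf '%03d' $((16#${h:2:2})))     # second hex byte -> 000-255
mkdir -p "/gluster/files/$d1/$d2"
cp -p "/data/files/$f" "/gluster/files/$d1/$d2/$f"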

merlin
  • How did you try to copy the files over? – Gene Aug 24 '15 at 08:00
  • cp -a /path/to/old/dir/ /path/to/new/dir/ – merlin Aug 24 '15 at 08:03
  • I also tried this: "cp /data/files/* /gluster/files/ &" which resulted in "-bash: /bin/cp: Argument list too long" – merlin Aug 24 '15 at 08:07
  • good idea, but still: rsync /data/files/* . -bash: /usr/bin/rsync: Argument list too long – merlin Aug 24 '15 at 08:19
  • Ah, sorry. Do `rsync -aHS --progress /data/files/ /gluster/files/` or you can use a find command like: `find /data/files/ -name '*name*.ext' -exec cp -p {} /gluster/files/ \;` – Gene Aug 24 '15 at 08:22
  • that seems to work! It started to copy the files. How much time do you think this might take for 250K files? It looks like it could take hours. – merlin Aug 24 '15 at 08:28
  • It depends on how fast your network, storage (drives), and servers are. Can't really say. You could run a `find` command directly on one of the storage servers to keep track (`find /path/to/brick -type f | wc -l`) – Gene Aug 24 '15 at 08:35
  • It runs on a 1G network with SSDs. Currently the sync does about 1,200 files per minute. – merlin Aug 24 '15 at 08:46
  • The sync failed after 50K files and node2 froze entirely. :-( – merlin Aug 24 '15 at 09:24
  • Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/27348/discussion-between-gene-and-merlin). – Gene Aug 24 '15 at 16:39
  • Rather than using 000-999 or whatever, you should use something that is naturally related to the data, such as the first part of its filename, its creation date, or some other metadata which will be easy and fast for your application to look up and insert into the path in order to store or retrieve the image. – Michael Hampton Aug 24 '15 at 16:58

1 Answer


Using GlusterFS to store and access lots and lots of very small files is a difficulty many deployments face, and it seems you're already on a good path to solving the problem: breaking the files up into separate directories.

You could implement a solution like that. Just create a bunch of directories, choose a limit for how many files can go in each directory, and hope you don't run out of room. In your example you're creating 65,536 directories, so that's not likely to be a problem any time soon.

Another option is to create directories based on the date a file is created. For example, if the file cust_logo_xad.png was created today, it would be stored here:

/gluster/files/2015/08/24/cust_logo_xad.png
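
A minimal shell sketch of that scheme (bucketing by today's date; the source path is just an example):

f="cust_logo_xad.png"
dir="/gluster/files/$(date +%Y/%m/%d)"   # e.g. /gluster/files/2015/08/24
mkdir -p "$dir" && cp -p "/data/files/$f" "$dir/$f"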

If you're hosting data for different entities (customers, departments, etc.), you could separate files based on ownership, assigning each entity a unique ID of some sort. For example:

/gluster/files/ry/ry7eg4k/cust_logo_xad.png
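
A sketch along the same lines, assuming the top-level bucket is the first two characters of a (made-up) customer ID:

id="ry7eg4k"                             # hypothetical customer ID
f="cust_logo_xad.png"
dir="/gluster/files/${id:0:2}/$id"       # -> /gluster/files/ry/ry7eg4k
mkdir -p "$dir" && cp -p "/data/files/$f" "$dir/$f"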

Beyond that, it would be a good idea to take a look at the GlusterFS documentation on tuning the storage cluster for small files. At the very least, make sure that:

  1. The file systems on the GlusterFS storage servers have enough free inodes available (an mkfs-time option; see the sketch after this list).
  2. The drives on the GlusterFS storage servers can handle lots of IOPS.
  3. You use an appropriate file system for the task (either ext4 or XFS).
  4. Your application and staff don't frequently scan directories that contain lots of small files.
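
To illustrate points 1 and 3, here is a rough sketch of checking inode headroom on an existing brick and formatting a new brick as XFS with a larger inode size (the device name is a placeholder; check the GlusterFS docs for the currently recommended options):

df -i /path/to/brick                     # shows used vs. free inodes
mkfs.xfs -i size=512 /dev/sdX1           # hypothetical device; 512-byte inodes leave room for Gluster's xattrs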

If you can (and if you haven't already), it's a good idea to create a database to act as an index for the files, rather than having to scan (e.g. `ls`) or search (e.g. `find`) for them all the time.
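
For example, a tiny SQLite-backed index might look like this (the schema, database path, and stored file path are hypothetical):

db=/var/lib/app/file_index.db            # hypothetical location for the index
sqlite3 "$db" "CREATE TABLE IF NOT EXISTS files (name TEXT PRIMARY KEY, path TEXT NOT NULL);"
sqlite3 "$db" "INSERT OR REPLACE INTO files VALUES ('cust_logo_xad.png', '/gluster/files/2015/08/24/cust_logo_xad.png');"
sqlite3 "$db" "SELECT path FROM files WHERE name = 'cust_logo_xad.png';"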

Gene
  • Thank you for the detailed answer. It currently syncs at about 1,300 files/min. Actually, I never have to do an `ls` or `find` on the directories. The files are all accessed by NGINX, which knows the correct file name. The main question is: will latency suffer under GlusterFS compared to serving directly from ext4? If not, I would not care much about sync time. – merlin Aug 24 '15 at 08:45
  • It will affect latency, but whether it's significant only you'll be able to tell. :) Plenty of companies and organisations use GlusterFS to host web-facing content just fine. Just make sure you perform lots of testing and have a firm understanding of how it works before you go into production with it. – Gene Aug 24 '15 at 08:49