
Suppose we have one level of subdirectories named with 2 hexadecimal digits and, inside each of those, another level named with 3 hex digits (or vice versa), with the files distributed roughly uniformly by a hash function.

So it should look like this

/uploads/c2/f4c/image.jpg

or maybe like this

/uploads/c2f/4c/image.jpg
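
To make the two layouts concrete, here is a minimal sketch. It assumes the directory digits are the leading hex digits of a SHA-256 digest of the file name; the hash choice, the `shard_path` name and its parameters are illustrative assumptions, not details given in the question.

```python
import hashlib
import posixpath


def shard_path(filename, first=2, second=3, root="/uploads"):
    """Return root/<first hex digits>/<next hex digits>/filename.

    Assumption: the digits come from the SHA-256 hex digest of the
    file name. Any roughly uniform hash would do for this purpose.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    level1 = digest[:first]                 # first directory level
    level2 = digest[first:first + second]   # second directory level
    return posixpath.join(root, level1, level2, filename)


print(shard_path("image.jpg", 2, 3))  # 2 digits, then 3 (256 x 4096)
print(shard_path("image.jpg", 3, 2))  # 3 digits, then 2 (4096 x 256)
```

Both calls consume the same five hex digits; only the point where they are split into two directory names changes.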

Mathematically, 256x4096 and 4096x256 give the same number of leaf folders (1,048,576) for distributing my files. In terms of filesystem performance, is there a difference between the two?
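
A quick check of that arithmetic, as a sketch that assumes the worst case where every possible directory ends up being created. `directory_counts` is a name made up for this illustration.

```python
def directory_counts(first_digits, second_digits):
    """Count directories in a two-level hex layout.

    Assumption: every possible directory exists (worst case).
    """
    top = 16 ** first_digits              # directories directly under the root
    leaves = top * 16 ** second_digits    # second-level (leaf) directories
    return {"top": top, "leaves": leaves, "total": top + leaves}


print(directory_counts(2, 3))
# {'top': 256, 'leaves': 1048576, 'total': 1048832}
print(directory_counts(3, 2))
# {'top': 4096, 'leaves': 1048576, 'total': 1052672}
```

The leaf count is identical. What differs is the fan-out per level: 256 entries at the top with up to 4096 in each, versus 4096 entries at the top with up to 256 in each.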

Thank you!

Edit: the filesystem in question is ext4.

  • It depends on whether you care more about keeping the total number of directories small or about access speed. Assume sparse usage, i.e. only “a few” directories will be hit. Also be aware that ext2fs has limits, such as the limit of 31998 subdirectories per directory. Other filesystems may have other limits. – mirabilos Dec 27 '13 at 13:48
  • I'd put the smaller of the two first; it's less overwhelming for someone casually looking into the /uploads directory. – Olivier Dulac Dec 27 '13 at 13:58
  • I'm interested in access speed and care less about keeping the number of directories small. I'm aware of the limits on subdirectories, and that the system gets slower with a huge number of subdirectories even when it's not near the limit yet, which is why I opted to split the tree into levels. Is this correct? – Etiene Dalcol Dec 27 '13 at 13:59
  • Also consider that each file (and that includes your directories!) in a partition takes up an inode [well, if you use a standard filesystem]. You'll probably need to create the partition that will hold that hierarchy (plus its contents) specially, so that it has more than the usual number of inodes. – Olivier Dulac Dec 27 '13 at 14:17
  • I don't think that one of the two setups will be **faster** than the other, but I prefer the "256x4096" one for the reason stated above. You may want to set "noatime" (or the equivalent on the filesystem you end up using), and also choose the partition type carefully [and its parameters, i.e. the number of allowed inodes, etc.]. You may also want to make sure it's not indexed by "slocate" (or whatever equivalent you're using), if that would save time and db space [at the cost, of course, of not being able to list its contents via "locate/slocate"]. – Olivier Dulac Dec 27 '13 at 14:19
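
On the inode point raised in the comments, here is a small sketch for checking how many inodes a filesystem offers before creating roughly a million directories on it. It relies on Python's `os.statvfs`, which is Unix-only; the `inode_usage` helper name and the `/` path are placeholders for wherever /uploads actually lives.

```python
import os


def inode_usage(path="/"):
    """Return (total, free) inode counts for the filesystem holding path.

    Note: some filesystems report 0 for both values, so treat the
    result as a hint rather than a guarantee.
    """
    st = os.statvfs(path)
    return st.f_files, st.f_ffree


total, free = inode_usage("/")
print(f"inodes: {total} total, {free} free")
```

If the free count is not comfortably above the expected number of directories plus files, the partition would need to be created with more inodes, as the comment suggests.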
