I manage an application with a filestore in which all files are stored under filenames equal to their md5 sums. All files live in a single directory. Currently there are thousands of files, but soon there should be millions on the server. The server runs Ubuntu 11.10 on an ext4 filesystem.
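To make the current layout concrete, here is a minimal sketch of how the store works; the function name and directory path are my own illustration, not the actual application code:

```python
import hashlib
import shutil
from pathlib import Path

STORE = Path("/var/lib/myapp/filestore")  # hypothetical location of the store

def store_file(source: str) -> Path:
    """Copy a file into the flat store, named after its md5 sum."""
    md5 = hashlib.md5(Path(source).read_bytes()).hexdigest()
    target = STORE / md5              # everything lands in one directory
    if not target.exists():
        shutil.copyfile(source, target)
    return target
```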
Someone told me that it is not wise to put many files in one directory, as it significantly increases lookup times and hurts reliability (he had a story about a maximum number of files a single directory could point to, which ended up as one big linked list). Instead, he suggested creating subdirectories based on, e.g., substrings of the filename, something like the sketch below. However, this would make some things in my application much more cumbersome.
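For reference, this is my understanding of the layout he proposed; the two-level split on the first characters of the md5 name is just one example of "substrings of the filename", not something he specified exactly:

```python
from pathlib import Path

STORE = Path("/var/lib/myapp/filestore")  # same hypothetical root as above

def sharded_path(md5: str) -> Path:
    """Nest files under subdirectories derived from the md5 name, e.g.
    d41d8cd98f00b204e9800998ecf8427e ->
    d4/1d/d41d8cd98f00b204e9800998ecf8427e"""
    return STORE / md5[:2] / md5[2:4] / md5

def store_sharded(md5: str, data: bytes) -> Path:
    target = sharded_path(md5)
    target.parent.mkdir(parents=True, exist_ok=True)  # extra bookkeeping vs. the flat layout
    target.write_bytes(data)
    return target
```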
Is this still true, or do modern filesystems (e.g. ext4) have more efficient ways to deal with this and scale naturally? Wikipedia has some details on filesystems, but it doesn't really say anything about the maximum number of files per directory or about lookup times.