A piece of software we developed generates more and more files, currently about 70,000 per day, 3-5 MB each. We store these files on a Linux server with an ext3 file system. The software creates a new directory every day and writes that day's files into it. Writing and reading such a large number of files is getting slower and slower (per file, I mean), so one of my colleagues suggested creating a subdirectory for every hour. We will test whether this makes the system faster, but the problem can be generalized:
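For illustration, this is roughly the hourly layout my colleague has in mind; it is only a sketch, and the base path and naming scheme are placeholders, not what our software actually uses:

    import os
    from datetime import datetime

    def target_path(filename, base="/data/output"):
        # One directory per day, one subdirectory per hour:
        # <base>/YYYY-MM-DD/HH/<filename>
        now = datetime.now()
        directory = os.path.join(base, now.strftime("%Y-%m-%d"), now.strftime("%H"))
        os.makedirs(directory, exist_ok=True)  # create the day/hour directories on demand
        return os.path.join(directory, filename)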
Has anyone measured the speed of writing and reading files as a function of the number of files in the target directory? Is there an optimal file count above which it is faster to put the files into subdirectories? What are the important parameters that may influence the optimum?
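To frame the question, this is the kind of measurement I have in mind; a rough sketch only, with made-up file counts and a payload size chosen to match our 3-5 MB files (shrink it for a quick run):

    import os
    import tempfile
    import time

    def seconds_per_write(directory, count, size=3 * 1024 * 1024):
        # Write `count` files of `size` bytes into `directory`,
        # return the average wall-clock time per file.
        payload = b"\0" * size
        start = time.time()
        for i in range(count):
            with open(os.path.join(directory, "file%06d.dat" % i), "wb") as f:
                f.write(payload)
        return (time.time() - start) / count

    base = tempfile.mkdtemp()
    for count in (1000, 10000, 70000):
        d = os.path.join(base, "n%d" % count)
        os.mkdir(d)
        print(count, seconds_per_write(d, count))

I realize page cache, journal mode and dir_index all affect such numbers, which is part of what I am asking about.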
Thank you in advance.