Is there an optimum number of image files to keep in a directory on a drive before grouping them into sub-directories?

For example, I have a collection of approximately 600,000 image files. I can logically sub-group these into several layers, but I'm not sure of the optimum for fastest retrieval. I don't need to search the disk because I will always know each file's absolute path.

My basic options are:

1. One directory with 600,000 files (my instincts tell me this is no good!), OR

2. One directory with 1,500 sub-directories, each with an average of 400 files (min 200, max 600), OR

3. One directory with 75 sub-directories, each with an average of 20 sub-directories, with an average of 400 files in each.

The second scenario would be my ideal choice, but I am concerned that this number of sub-directories will affect performance.

Discuss, please!

Roger

RogerDodge
  • 100
  • 1
  • 7

2 Answers

In my experience this depends on the filesystem (and even the storage vendor), with the exception that choice #1 ("Just dump everything in one place") is almost certainly going to be a poor performer.

We faced a similar problem and went with a variant of #2. In our case, we had tens of millions of users, each with somewhere between 10 and ~1000 files. We ended up with a structure that looked like this:

ab\cd\ef\all_the_files

The ab portion specified the mount point, and cd\ef were the two levels of sub folders underneath.
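
As a rough illustration of how a file could be mapped onto a layout like this, here is a minimal Python sketch. It is not the scheme we actually used: the function name `sharded_path`, the use of a user ID as the key, and deriving `ab`, `cd` and `ef` from an MD5 hex digest are all assumptions made for the example.

```python
import hashlib
from pathlib import PureWindowsPath


def sharded_path(key: str, filename: str) -> PureWindowsPath:
    """Build a three-level ab/cd/ef path for a key (hypothetical helper)."""
    # Assumption: MD5 is used only to spread keys evenly across folders,
    # not for security. Any stable, well-distributed hash would do.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    # Take three pairs of hex digits: up to 256 folders at each level.
    ab, cd, ef = digest[0:2], digest[2:4], digest[4:6]
    return PureWindowsPath(ab, cd, ef, filename)


if __name__ == "__main__":
    # The same key always maps to the same folder, so no search is needed.
    print(sharded_path("user-12345", "holiday.jpg"))
```

Because the path is computed from the key alone, the absolute location of any file is always known up front, which matches the requirement in the question.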

If you're going to be seeing significant IO load, I'd urge you to test out your configuration on the hardware and network you're going to be using at scale. And, of course, give thought to how you can do backups and restores of portions of the data, if required.

Will Chesterfield
  • 1,780
  • 12
  • 15

This previous question favours a flat layout on NTFS after experiments. This makes sense, since modern file systems store directory contents in a structure with logarithmic search times. For n files you get to choose between log(n) for one flat directory and, for a two-level split, something that is >= 2 log(sqrt(n)) = log(n), so nesting is at best equal.
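
To make that comparison concrete, here is a small Python sketch that applies the idealized model to the three layouts from the question. The helper `lookup_cost` is hypothetical, and it assumes each directory lookup costs log2 of the number of entries, with no extra overhead for opening another directory level.

```python
import math


def lookup_cost(*fanouts: int) -> float:
    """Sum of log2(entries) over each directory level (idealized model)."""
    return sum(math.log2(entries) for entries in fanouts)


# Layouts from the question, using the average entry counts given there.
flat = lookup_cost(600_000)             # option 1: one directory
two_level = lookup_cost(1_500, 400)     # option 2: 1,500 dirs x 400 files
three_level = lookup_cost(75, 20, 400)  # option 3: 75 x 20 dirs x 400 files

if __name__ == "__main__":
    for name, cost in [("flat", flat), ("two-level", two_level),
                       ("three-level", three_level)]:
        print(f"{name}: {cost:.2f} comparisons")
```

All three come out the same in this model, because log(a) + log(b) = log(a * b). Any real per-level cost, such as opening an extra directory, only adds to the nested layouts, which is why nesting is at best equal here. Actual behaviour depends on the filesystem, so measuring on the target hardware is still the safer guide.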

themel
  • 8,825
  • 2
  • 32
  • 31