I'm curious, from a performance standpoint, is there an advantage in storing all files in one directory versus having each file in a separate directory? I'm not concerned about organization.

Also, this assumes the files will be accessed often, so I/O usage will be high. No directory listing is involved; the files will be fetched by absolute path.

The system environment is Linux, CentOS 5.3.

locke92

4 Answers

Path resolution time grows with the number of files in the directory, though not linearly. This is true even for absolute paths, because the file system still has to scan the file names in each directory block to resolve the path. Different file systems have different lookup characteristics but, in general, you will start noticing the performance hit at around 10,000 files in a single directory.
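One common way to keep every directory well below that size is to spread files across subdirectories named after a hash of the file name. The following is only a minimal sketch under stated assumptions: the helper names sharded_path and store, the two-level layout, and the two-character width are all hypothetical choices for illustration, and MD5 is used purely to distribute names evenly, not for security.

```python
import hashlib
import os
import tempfile


def sharded_path(root, filename, levels=2, width=2):
    """Return root/ab/cd/filename, where 'abcd...' is a hex hash of filename.

    With levels=2 and width=2 there are at most 256 * 256 leaf directories.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, filename)


def store(root, filename, data):
    """Write data under its sharded path, creating subdirectories as needed."""
    path = sharded_path(root, filename)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as fh:
        fh.write(data)
    return path


if __name__ == "__main__":
    # Demo in a throwaway directory (hypothetical file name).
    demo_root = tempfile.mkdtemp()
    print(store(demo_root, "report-0001.txt", b"hello"))
```

Because the path is computed from the file name alone, a reader can rebuild the absolute path without listing any directory, which matches the access pattern described in the question.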

dvogel

Unless the directories are on different disks or RAID arrays, you won't see a noticeable difference whether the files are all in one directory or not, because the I/O operations for each disk go into a single queue. If the directories are on different RAID arrays, you'll see the noticeable advantage you are looking for.

Malnizzle

If you've got enough memory to hold all the files, have you considered caching them in RAM? http://www.linuxmaza.com/system-administration/how-to-mount-ramfs-tmpfs-in-linux/

Matt Simmons
  • if they're so frequently accessed, they'll be cached by the OS – Javier Jul 02 '10 at 14:25
  • As long as you have enough free memory, Linux will cache your most recently used files in RAM, giving performance similar to a RAM disk. – tylerl Jul 03 '10 at 09:51
  • Yes, but the cache contents can be evicted when the system does something else, for instance when updatedb runs and takes up memory to hold metadata. A RAM disk guarantees that the memory is used for what you want – Matt Simmons Jul 03 '10 at 13:28
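
The OS caching discussed in these comments happens as a side effect of ordinary reads, so a file set can be pre-read once to give the kernel a chance to cache it. This is a minimal sketch, assuming the files fit in free memory; the function name warm_page_cache is hypothetical. Nothing here pins the pages, so the kernel may still evict them under memory pressure, which is exactly the concern raised in the last comment.

```python
import os
import tempfile


def warm_page_cache(paths, chunk_size=1 << 20):
    """Read each file once so the kernel can keep its pages in the page cache.

    Returns the total number of bytes read. The data itself is discarded.
    """
    total = 0
    for path in paths:
        with open(path, "rb") as fh:
            while True:
                chunk = fh.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    return total


if __name__ == "__main__":
    # Demo on a throwaway file (hypothetical content).
    fd, demo_path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as fh:
        fh.write(b"x" * 4096)
    print(warm_page_cache([demo_path]))
    os.remove(demo_path)
```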

ext3 does some nice things:

http://www.ibm.com/developerworks/linux/library/l-fs8.html

See the section titled "Journaling options and write latency". It covers ext3's journaling modes (data=journal, data=ordered, data=writeback), which let you tune ext3 for your application.

jim mcnamara