6

What are the appropriate options to mkfs and mount for an ext4 filesystem with a folder containing >10 million files for read access?

What I have so far:

umount /media/dirsizetest
# combine the extended options into a single -E (some mke2fs versions honour only the last -E given)
mkfs.ext4 -L DIRSIZETEST -E lazy_itable_init=1,lazy_journal_init=1 -m 1 /dev/sda1
mount -t ext4 -o nodiratime /dev/sda1 /media/dirsizetest

Some context is in order. I'm doing a slightly (OK, very) crazy experiment involving seeing how different file systems perform with a single folder filled with millions of small files. Eventually I'll be filling a 1TB drive to capacity doing this (I told you it was a crazy experiment!).

The access pattern is something along these lines:

Recreate the volume from scratch (using mkfs) and mount it.
Create a sub-directory, fill it with N files in sequence (named 1...N)
    (where N can be up to 2^63)
Read all files in order
Read all files in random order
Print how long it takes

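A minimal bash sketch of that loop (the directory, payload and N below are placeholders; the real runs go far higher):

#!/bin/bash
set -e
DIR=/media/dirsizetest/sub
N=100000                                  # scale up towards the real target

mkdir -p "$DIR"

# create N small files in sequence, named 1..N
time for i in $(seq 1 "$N"); do
    echo "payload $i" > "$DIR/$i"
done

# read all files in order
time for i in $(seq 1 "$N"); do
    cat "$DIR/$i" > /dev/null
done

# read all files in random order
time for i in $(seq 1 "$N" | shuf); do
    cat "$DIR/$i" > /dev/null
done
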
My natural habitat is the Windows NTFS world, and the number of options to mount and mkfs is a bit daunting. So I'm looking for guidance on which options are likely to shoot my performance in the foot.

I'm working in a 64 bit Ubuntu 12.04 desktop environment.

ligos
  • you should think over your experimental setup given real-world time constraints - creating this number of files within a single directory is likely to take a ***very*** long time. Also note that ext4 can only hold 2^32 files per filesystem due to its 32-bit inode numbers (you can't have more than 2^32 inodes). – the-wabbit Jul 04 '12 at 07:15
  • 2^32 files is OK if that's the file system limit (FAT32 under Windows is limited to 2^16, but I could still see a pretty clear linear O(n) slowdown). And, quite strangely with ext4 and 5M files, creation takes ~0.3ms/file but random access takes ~0.7ms/file (guessing a sequential write vs many disk seeks for reads). And it's getting worse at 10M. Hence my question! – ligos Jul 04 '12 at 22:39

2 Answers

3

Attention: Security advice

These instructions are UNSAFE and should not be used in a production environment without precautions.

For example, a battery-backed RAID card can help to reduce the risks.

Use at your own risk.


If you just want this as a test environment, I'd recommend the ext4 mount options

noatime,data=writeback,barrier=0,nobh,errors=remount-ro

This

  • disables access-time updates on reads (noatime)
  • journals metadata only, so data may be written to disk after its metadata has been committed to the journal (data=writeback)
  • disables the enforcing of proper on-disk ordering of journal commits (barrier=0)
  • tries to avoid associating buffer heads (nobh)
  • remounts the filesystem read-only on error (errors=remount-ro)

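Putting these together, a sketch of the corresponding mount invocation (device and mount point taken from the question, so adjust for your setup):

mount -t ext4 -o noatime,data=writeback,barrier=0,nobh,errors=remount-ro /dev/sda1 /media/dirsizetest
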
For mkfs.ext4 I could only find one option that looked useful:

dir_index
    Use hashed b-trees to speed up lookups in large directories.

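dir_index is enabled by default in recent e2fsprogs, but a quick way to check, and a sketch of enabling it on an existing (unmounted) filesystem if it is missing, assuming /dev/sda1 from the question:

# check whether dir_index is already in the feature list
dumpe2fs -h /dev/sda1 | grep -i features

# if it is missing: enable it, then rebuild the directory indexes
tune2fs -O dir_index /dev/sda1
e2fsck -fD /dev/sda1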

Christopher Perrin
  • To anyone who happens to read the above, please understand that these options are NOT SAFE in normal production environments if you are not using storage hardware with battery-backed cache memory as found in most decent RAID cards or external storage array controllers. – ThatGraemeGuy Jul 05 '12 at 10:00
  • Thanks Chris. Most of those options look like they're geared toward improving write performance, while I'm primarily interested in read performance. But they should improve the file creation phase. I noticed `dir_index` reading through the `mkfs.ext4` man page. Wikipedia's ext4 page indicates it's enabled by default in kernel 2.6.23+, so (in theory) I'm already using it (kernel 3.2.0). – ligos Jul 05 '12 at 22:41
  • noatime should help with read performance because the access time is not written back on every read – Christopher Perrin Jul 05 '12 at 22:51
  • Yeah, I'd already picked up `noatime`. There was an equivalent in Windows and I wanted to replicate as much as I could, so I went looking for it. I'm waiting for my 10M file run to finish (only a couple more days to go!). Then I'll test your suggestions. – ligos Jul 06 '12 at 06:52
  • Some feedback Chris: I've used the options you suggested and performance improves by ~10% in read and write in my 5M test run. Given the crazy factor of my experiment, I'm pretty happy with 10%! – ligos Jul 09 '12 at 22:53
  • 1
    In conjunction with these great approaches above: Is there no way to hash on the filename, and create a tree of subdirectories? I usually take a sha256 of the filename, take the first 4-6 digits, use them to nest into 2 or 3 subdirectories, then put the file there. This way (1) no directory will ever contain more than 256 files (2) reads and writes are relatively fast, and (3) no listing commands will hang as they do on directories with massive numbers of files. Lots of stuff in the Linux kernel behaves much better when directories are not overpopulated. – cobbzilla Mar 23 '16 at 05:27
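
As an illustration of the directory-sharding scheme described in the comment above, a minimal bash sketch (the file name, hash length and fan-out are arbitrary choices):

name="somefile"                                    # example file name
h=$(printf '%s' "$name" | sha256sum | cut -c1-4)   # first 4 hex digits of the SHA-256 of the name
dir="${h:0:2}/${h:2:2}"                            # two nesting levels, at most 256 entries per level
mkdir -p "$dir"
touch "$dir/$name"                                 # place the file in its shard
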
1

Some research I did turned up the following links. Chris Perrin's answer provides a short list of options; these should provide additional reference material.

ligos