
For a large Apache disk_cache partition (50 GB, 500,000 entries), which Linux filesystem performs best?

In my case, the partition holds about 500,000 very small files (< 1 KB) and about 500,000 files of roughly 50 KB each.

File hierarchy is as deep as /htcache/B/x/i_iGfmmHhxJRheg8NHcQ.header.vary/A/W/oGX3MAV3q0bWl30YmA_A.header.

Typical actions are creating directories and files (by Apache), reading files (Apache) and removing files and directories (htcacheclean).

I'm currently using ext3 and I'm seeing poor performance (slow operations with high I/O wait) when purging outdated files and emptied directories from the cache.

  • the ext3 filesystem was created with "-T news" (i.e. blocksize = 4096, inode_size = 256 and inode_ratio = 4096).
  • filesystem features: "has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file"
  • the partition is mounted with "noatime"
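For a sense of scale, the inode_ratio above can be turned into an approximate inode count. This is a rough sketch, assuming the partition is exactly 50 GiB and ignoring rounding to block groups; the variable names are only illustrative.

```shell
# Assumption: the "50 GB" partition is exactly 50 GiB.
partition_bytes=$((50 * 1024 * 1024 * 1024))
inode_ratio=4096   # one inode per 4096 bytes of partition space
inode_size=256     # bytes per on-disk inode

inodes=$((partition_bytes / inode_ratio))
table_bytes=$((inodes * inode_size))

echo "inodes=$inodes inode_table_MiB=$((table_bytes / 1024 / 1024))"
# prints: inodes=13107200 inode_table_MiB=3200
```

With roughly 1,000,000 files plus their directories in the cache, most of these inodes stay unused.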
flight

1 Answer


I wonder whether a non-journalling filesystem is an option (or turning journalling off in ext3). You could also try some ext3 tuning options: mounting with noatime, for example, and tune2fs may help with others.

Btrfs is the new buzzFS at the moment... from the benchmarks I've seen, it is comparable to (or better than) ext4. If I were starting with a new system, I'd prefer (tuned) ext4, then maybe btrfs, then (tuned) ext3 for many small files. I'd hesitate about ext2: it is old, stable and has no journalling, but I haven't really seen it compared to the current ext3/ext4/btrfs.

You probably should not go with XFS for many small files.
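To compare candidate filesystems under a load shaped like this cache, one option is a small populate-and-purge test run on a scratch directory of each filesystem. The following is only a rough sketch: TARGET, COUNT, populate and purge are illustrative names, the directory layout is much shallower than a real htcache tree, and a meaningful run would need a far larger COUNT and a cold page cache.

```shell
#!/bin/sh
# Sketch: create many small files in a directory tree, then time their removal.
# TARGET should be a scratch directory on the filesystem under test (assumption).
TARGET="${TARGET:-$(mktemp -d)}"
COUNT="${COUNT:-200}"

populate() {
    i=0
    while [ "$i" -lt "$COUNT" ]; do
        d="$TARGET/$((i % 16))/$((i % 7))"
        mkdir -p "$d"
        # One ~1 KB "header" file and one ~50 KB "data" file per entry.
        head -c 1024 /dev/zero > "$d/$i.header"
        head -c 51200 /dev/zero > "$d/$i.data"
        i=$((i + 1))
    done
}

purge() {
    # Remove everything below TARGET, files first, then emptied directories.
    find "$TARGET" -mindepth 1 -delete
}

populate
created=$(find "$TARGET" -type f | wc -l)
start=$(date +%s)
purge
end=$(date +%s)
remaining=$(find "$TARGET" -mindepth 1 | wc -l)
echo "created=$created purge_seconds=$((end - start)) remaining=$remaining"
```

Running it as, say, `TARGET=/mnt/test/scratch COUNT=500000 sh bench.sh` (hypothetical mount point and script name) on each candidate would give a first, coarse comparison of purge times.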

chronos
  • I'm already using noatime (question edited, thanks!). I thought that ext3 without journal is almost identical to ext2? Regarding tune2fs, I looked into that, but saw nothing else to tune besides removing the journal (see my filesystem features in the edited question). Any hints? – flight Oct 13 '11 at 13:02
  • No, sorry - I lack experience beyond that. Yes, ext3 with no journal is ext2, but I don't know how journalling impacts performance in your case. You mentioned slow performance during purge - maybe you could use `ionice -c 3` on the purge cron job? That could really help, at the cost of slower purge. As you most likely understand yourself, testing a few options under your specific load would be best - provided you have time and resources to do that. – chronos Oct 13 '11 at 22:11
  • I removed the journal (`tune2fs -O ^has_journal`) and mounted the partition as ext2. Didn't help, the IOwaits even went further up. Luckily, `tune2fs -O has_journal` reinstalled the journal, so that I'm now back to ext3. – flight Oct 14 '11 at 09:07
  • Thanks for the update, I'm really curious what is the best FS option for your case. This post - http://permalink.gmane.org/gmane.comp.db.cassandra.user/8478 - suggests that ext4 might be superior to ext3 with high inode counts. Seems at least partially relevant. – chronos Oct 14 '11 at 22:32
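The `ionice -c 3` suggestion from the comments could be wrapped so that the purge job still runs on systems where ionice is missing or refuses the idle class. This is a minimal sketch; `run_idle` is a hypothetical helper name, and whether the idle class has any effect depends on the I/O scheduler in use.

```shell
# Run a command in the idle I/O class when possible, otherwise run it unchanged.
run_idle() {
    if command -v ionice >/dev/null 2>&1 && ionice -c 3 true 2>/dev/null; then
        ionice -c 3 "$@"
    else
        "$@"
    fi
}
```

A purge cron job might then call something like `run_idle htcacheclean -n -t -p /htcache -l 40000M`, where the size limit is an assumed value and not taken from the question.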