
Web servers typically have a document root denoting the filesystem subtree visible via the web. For example, if the document root is /home/foouser/public_html/, then the web server would map a request for http://www.foo.com/pics/foo.jpg to /home/foouser/public_html/pics/foo.jpg. This mapping results in a series of disk requests to obtain the inode number of foo.jpg.
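The mapping itself can be sketched as below. The names (DOCUMENT_ROOT, map_url_to_path) are illustrative, not from any particular server; note that each directory component of the resulting path is one name-to-inode lookup the kernel has to perform (or serve from its dentry/inode caches):

```python
import os.path

# Hypothetical document root; a real server reads this from its config.
DOCUMENT_ROOT = "/home/foouser/public_html"

def map_url_to_path(url_path):
    """Map a request path like /pics/foo.jpg onto the document root,
    normalizing the result so '..' segments cannot escape the root."""
    candidate = os.path.normpath(
        os.path.join(DOCUMENT_ROOT, url_path.lstrip("/")))
    if not candidate.startswith(DOCUMENT_ROOT + os.sep):
        raise ValueError("path escapes document root")
    return candidate
```

Every component in the returned path (home, foouser, public_html, pics, foo.jpg) is a separate directory-entry lookup during the filename-to-inode translation.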

Do web servers do any optimizations to reduce the number of disk accesses, or is it the role of the server admin to set the document root as close to "/" as possible, to reduce the number of disk accesses in the filename-to-inode-number translation?

Avinash
  • I strongly doubt it. First of all, the gain would be minimal - inode lookup is probably negligible compared to network latency and transfer times. Second, web servers usually have to be portable across platforms and who knows how many filesystems; I can't think of a single straightforward way to optimize this lookup that would be correct, and work across filesystems. (What if the file / directory holding it is deleted and replaced between requests?) There's much more to be gained from switching to SSDs, CDNs, and caching, than from fiddling with the FS at this low a level. – millimoose Dec 09 '12 at 22:50

1 Answer


I know this isn't directly the answer to your question, but by setting up a caching strategy you can drastically reduce disk reads, especially if your static content is not hosted on your server.

Options:

  • Host static content on a CDN:
    • Pros: Offloads all the traffic onto someone else's network. Cost?
    • Cons: Potentially less control. Cost?
  • Use Contendo/Akamai, which is also a CDN, but with some differences.
    • Pros: You host your content, but after the first read the CDN handles caching based on the headers you send with your content (static or not).
    • Cons: Headers can be really annoying to manage. Cache busting (invalidating your own cache) can be annoying to handle when you want to replace old content.
  • Cache things locally. If you are making a DB request, for instance, you can cache the result; the next time your code runs, check your in-memory cache first (as opposed to making a DB request immediately). You could also cache entire pages, then at an application controller/route level check whether there is a cached version of the page/asset and serve that.
    • Pros: Lots of control. You can cache almost anything.
    • Cons: A ton of work to set up caching on every little thing. You need a strategy for every part of your website.
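The "cache things locally" option above can be sketched as a minimal in-memory cache with a time-to-live. This is illustrative only (the Cache class and the compute callback are made-up names, not any framework's API):

```python
import time

class Cache:
    """Minimal in-memory cache with per-entry expiry."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        entry = self.store.get(key)
        now = time.time()
        if entry is not None and entry[0] > now:
            return entry[1]          # cache hit: no DB round trip
        value = compute()            # cache miss: hit the database once
        self.store[key] = (now + self.ttl, value)
        return value
```

You would wrap each expensive DB call in get_or_compute with a key derived from the query; the real work is choosing keys, TTLs, and invalidation rules for every part of the site, which is the "ton of work" mentioned above.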

My recommendation is to start out by moving your assets to Amazon S3, Rackspace, or something similar. Joyent has an offering for this as well. You could then enable CloudFront for S3, which turns on the CDN and caches things in various regions. This is a really cheap solution (depending on the number of files you have).
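Whichever CDN you pick, how long it caches your content is driven by the response headers you send, as noted above. A minimal sketch of generating a Cache-Control header plus a content-derived ETag (the function name is illustrative; hashing the body means the ETag changes whenever the content does, which sidesteps manual cache busting):

```python
import hashlib

def cache_headers(body, max_age=86400):
    """Return headers telling a CDN (or browser) how to cache a response.

    body: the response bytes; max_age: cache lifetime in seconds.
    """
    # A short hash of the body serves as a validator: same content,
    # same ETag; changed content, new ETag.
    etag = '"%s"' % hashlib.sha256(body).hexdigest()[:16]
    return {
        "Cache-Control": "public, max-age=%d" % max_age,
        "ETag": etag,
    }
```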

You could also go the Contendo route.

The application-side caching route takes quite a bit of work and depends completely on your server/language/DB/configuration.

Parris
  • If the OP cares about performance this much, they can probably afford the CDN for static assets. It's probably a better performance improvement than caching a static asset locally. (Of course for database data or dynamic output caching is invaluable.) – millimoose Dec 09 '12 at 22:41
  • @millimoose yea, just thought I'd point it out. If you are introducing something like this at a company, then you need to get approval for the spending and make estimates about monthly costs, which in the case of CDNs (especially Amazon) can be relatively surprising. The pricing scheme is extremely fine-grained: they charge for bandwidth, storage, and I think a few other things. – Parris Dec 09 '12 at 22:51