10

I have a website that is getting about 7k requests per second on an nginx server. This server both proxies rewrites to an Apache server and serves static files, images, etc. directly. Static files make up the biggest part of that, at about 5k requests per second.

As part of an architecture upgrade, I am thinking about using a central file server that exports a directory containing these static files via NFS. There will be no write access to these files, so the directory could be mounted read-only on the nginx machine. My main concerns are:

Is NFS fast enough for this? Is there a limit on how many requests NFS can handle? Are there some "must-have" options when going this way?

Bonus: are there other alternatives for this setup besides NFS?

Thanks!

j0nes
  • No read-access, or "only" read-access? – pauska Nov 15 '11 at 14:12
  • Thanks! Only read-access, no write access. – j0nes Nov 15 '11 at 14:14
  • 7k requests a second is ~604,800,000 requests a day. If you're running that on a single nginx server, we (and the companies that use clusters of servers for half that load) would love to know your setup. – Nov 15 '11 at 14:44

4 Answers

3

I use cachefilesd (and a recent Linux kernel, with cachefs) to cache NFS files to a local HD. This way, every read from the NFS share copies the file to a /var/cache/fs directory, and subsequent reads are delivered from there, with the kernel checking against the NFS server whether the content is still valid.

This way you can have a central NFS server without losing the performance of local files.

Cachefilesd will take care of cleaning out old files when the free space/inodes reach a configured level, so you can serve uncommon data from the NFS share and common requests from the local HD.
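A rough sketch of what that configuration could look like (the directory, tag and thresholds below are just example values to adjust for your disk, not settings copied from my setup):

```
# /etc/cachefilesd.conf -- illustrative values only
dir /var/cache/fscache        # local directory that backs the cache
tag mycache
brun  10%                     # culling stops once free space rises above 10%
bcull  7%                     # culling starts when free space falls below 7%
bstop  3%                     # no new cache entries below 3% free space

# /etc/fstab -- mount the export read-only with FS-Cache enabled ("fsc")
# fileserver:/export/static  /srv/static  nfs  ro,fsc,noatime  0 0
```

Start the cachefilesd daemon, remount the share, and files read over NFS should begin to appear under the cache directory.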

Of course, also use something like Varnish to cache the most common requests and save nginx/NFS from serving them.

Here is a small cachefilesd howto

higuita
2

By setting up a central NFS server you introduce a single point of failure into your design. That alone should be a deal breaker. If it isn't, NFS can be plenty fast enough for a load like this. The critical factors will be having enough RAM to cache files, low-latency interconnects (Gig-E or better), and tuning (less important than the first two).

You should also strongly consider using rsync or a similar tool to keep local copies of the static files updated on each individual web server. Another option might be a SAN or a redundant NFS server solution (both of which are going to be much more complicated and costly than the rsync idea).
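A minimal sketch of the rsync route (the hostname and paths here are placeholders for whatever your origin server and document root actually are):

```
# Pull the static document root from a central origin onto this web server
rsync -az --delete origin.example.com:/srv/static/ /srv/static/

# Run it periodically from cron, e.g. every 5 minutes:
# */5 * * * * rsync -az --delete origin.example.com:/srv/static/ /srv/static/ >/dev/null 2>&1
```

The --delete flag keeps files removed on the origin from lingering on the web servers; leave it out if you would rather clean up manually.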

Chris S
    NFS doesn't have to be a SPoF – gWaldo Nov 15 '11 at 14:35
  • @gWaldo How exactly can you setup "a central NFS server" and have it not be a SPoF? – Chris S Nov 15 '11 at 14:39
  • You do it by realizing that, as you said, a *central* NFS server is a SPoF, and instead choose to implement a NFS cluster. I'm not actually disagreeing with you.... – gWaldo Nov 15 '11 at 14:46
  • Thanks - accepting this solution because I think I will go the rsync route, avoiding the single point of failure thing (which should be my main concern). – j0nes Nov 18 '11 at 17:42
  • It's fairly simple to implement a high-availability dual replicated NFS server using GlusterFS and CTDB. In terms of performance, my cluster was receiving about 10k requests per second and it's cracking along just fine. The only problem is that the NFS servers will need to have lots of RAM. – Alpha01 Jul 31 '15 at 08:20
1

The speed depends on many factors:

  • How are your servers going to be connected to the NFS target? A single dual-port SAS disk can utilize 6 Gbit/s of transfer speed. Keep this in mind if you're planning to use 1 Gbit Ethernet (from which you can subtract roughly 20% for TCP overhead).
  • What kind of cache is the NFS server going to get? Are you using an enterprise-grade array controller with lots of cache? Read cache is key in this setup.
  • How many servers are going to access the same file simultaneously? NFS locking can hurt - badly

The limit on open files via NFS is a limitation of the host operating system. FreeBSD, for example, has many different tuning options to support a large number of open files, but it depends on the amount of RAM in your server.
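On a Linux nginx box the relevant knobs look roughly like this (the numbers are purely illustrative; FreeBSD has its own equivalents such as the kern.maxfiles sysctl):

```
# Raise the system-wide and per-process file descriptor limits
sysctl -w fs.file-max=500000       # system-wide ceiling
ulimit -n 65536                    # per-process limit for the service user

# nginx.conf: allow each worker process to hold enough open descriptors
# worker_rlimit_nofile 65536;
```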

An alternative to a central file server is to use synchronization/replication between your web servers (like Chris S suggests). rsync or DRBD might be a great and cost-effective choice.

pauska
1

I would advise against NFS unless you put some caching in front of it. The nginx cache is better than nothing, but Varnish is better.
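For the nginx side, one sketch of such a cache, assuming nginx proxies the static requests to the file backend over HTTP rather than reading the mount directly (the zone name, paths and sizes are only examples):

```
# http {} context: define a cache area on local disk
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=static:50m
                 max_size=10g inactive=60m;

# server {} context: serve static requests through the cache
location /static/ {
    proxy_pass        http://fileserver;   # hypothetical upstream holding the files
    proxy_cache       static;
    proxy_cache_valid 200 301 10m;
}
```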

That said, if your load shifts to be more dynamic content than static, it will become more important to serve application files from local disk.

If you put NFS in, make sure that you have redundancy.

gWaldo
  • This may require a bit of an architecture change. In addition to using a caching layer like Varnish, it is also a good idea to use an origin-pull CDN setup for all the static files that will be on the NFS share. This will alleviate the load hitting the NFS backend. – Alpha01 Jul 14 '15 at 17:14