I have the following architecture on Amazon EC2: one NFS server and one NFS client. The client serves PHP and Django websites (nginx, uwsgi, php-fpm), and everything works perfectly.

The problem starts when I spin up a second NFS client instance based on the image of the first client. When I load a PHP site (WordPress), the browser starts to time out. When I turn off one of the NFS client instances, things start to work again. I suspect a file locking problem. I have been at it all night, searching on Google, and I tried the nolock option, but I couldn't solve it.

The NFS-mounted folders seem fine and show all the files, but as soon as I attach the second EC2 instance, the NFS server and both clients show a high load average with very low CPU usage.

Here is the content of /etc/exports on the NFS server:

/export/www 172.0.0.0/8(rw,async,no_subtree_check)
/export/config/nginx/sites-available 172.0.0.0/8(rw,async,no_subtree_check)
/export/config/nginx/sites-enabled 172.0.0.0/8(rw,async,no_subtree_check)
/export/config/uwsgi/apps-available 172.0.0.0/8(rw,async,no_subtree_check)
/export/config/uwsgi/apps-enabled 172.0.0.0/8(rw,async,no_subtree_check)

And here is the content of /etc/fstab on the NFS clients:

LABEL=cloudimg-rootfs   /        ext4   defaults        0 0
/dev/xvdb       /mnt    auto    defaults,nobootwait,comment=cloudconfig 0       2
#172.31.0.62:/export/www        /var/www        nfs     auto    0 0
172.31.0.62:/export/www /var/www        nfs4    rw,noatime,nodev,async,hard,intr,rsize=32768,wsize=32768 0 2
172.31.0.62:/export/config/nginx/sites-available /etc/nginx/sites-available     nfs4    rw,noatime,nodev,async,hard,intr,rsize=32768,wsize=32768 0 2
172.31.0.62:/export/config/nginx/sites-enabled  /etc/nginx/sites-enabled        nfs4    rw,noatime,nodev,async,hard,intr,rsize=32768,wsize=32768 0 2
172.31.0.62:/export/config/uwsgi/apps-available /etc/uwsgi/apps-available       nfs4    rw,noatime,nodev,async,hard,intr,rsize=32768,wsize=32768 0 2
172.31.0.62:/export/config/uwsgi/apps-enabled /etc/uwsgi/apps-enabled   nfs4    rw,noatime,nodev,async,hard,intr,rsize=32768,wsize=32768 0 2

Thanks heaps.

UPDATE:

It looks like this is not specific to PHP-FPM: I can reproduce it just by refreshing a static HTML page. Whenever the server starts to get stuck, nfsstat shows the calls and authrefrsh counters climbing very quickly.

James Lin
  • Hmm... this post might be better on Server Fault. It doesn't seem to be related to programming but rather to server configuration. – Lix Nov 17 '13 at 07:07

1 Answer

There was a problem with NFSv4 on Amazon EC2. I don't know the root cause, but the sysadmin I hired told me he had heard of NFS problems on EC2 as well. What he found was that concurrent NFS read speed was extremely slow, something like 20 MB in 150 seconds, while write speed was acceptable at about 7 MB/s.

So the real fix was to drop back to NFSv3, after which everything worked fine again.
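
For anyone wanting to try the same thing, here is a rough sketch of what one client fstab entry might look like with NFSv3 forced. It is an assumption derived from the fstab in the question (same server address, export path and options, with the type changed to nfs and vers=3 added), not the exact configuration that was deployed:

```
# Assumed NFSv3 variant of the /var/www mount from the question.
# Filesystem type "nfs" plus "vers=3" asks the client to use NFSv3 instead of NFSv4.
172.31.0.62:/export/www /var/www        nfs     rw,noatime,nodev,async,hard,intr,vers=3,rsize=32768,wsize=32768 0 0
```

The other four mounts would presumably change in the same way. After remounting, running nfsstat -m on the client should list the options in effect for each mount, including the NFS version. Note that NFSv3 relies on rpcbind and mountd in addition to the NFS port, so the EC2 security group may need to allow those as well.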

I hope this helps someone with a similar issue.

James Lin