I have this architecture on Amazon EC2: one NFS server and one NFS client. On the client I am serving PHP and Django websites (nginx, uwsgi, php-fpm), and they work perfectly.
I run into a problem when I spin up a second NFS client instance based on the image of the first NFS client. When I load a PHP site (WordPress), the browser starts to time out. When I turn off one of the NFS client instances, things start to work again. I suspect a file locking problem. I have been trying all night, searching on Google, and I tried the nolock mount option, but I couldn't solve it.
What I saw was that the NFS-mounted folders seemed fine and showed all the files, but once I attached the second EC2 instance, the NFS server and both clients started to show a high load average with very low CPU usage.
Here is the content of /etc/exports on the NFS server:
/export/www 172.0.0.0/8(rw,async,no_subtree_check)
/export/config/nginx/sites-available 172.0.0.0/8(rw,async,no_subtree_check)
/export/config/nginx/sites-enabled 172.0.0.0/8(rw,async,no_subtree_check)
/export/config/uwsgi/apps-available 172.0.0.0/8(rw,async,no_subtree_check)
/export/config/uwsgi/apps-enabled 172.0.0.0/8(rw,async,no_subtree_check)
And here is the content of /etc/fstab on the NFS clients:
LABEL=cloudimg-rootfs / ext4 defaults 0 0
/dev/xvdb /mnt auto defaults,nobootwait,comment=cloudconfig 0 2
#172.31.0.62:/export/www /var/www nfs auto 0 0
172.31.0.62:/export/www /var/www nfs4 rw,noatime,nodev,async,hard,intr,rsize=32768,wsize=32768 0 2
172.31.0.62:/export/config/nginx/sites-available /etc/nginx/sites-available nfs4 rw,noatime,nodev,async,hard,intr,rsize=32768,wsize=32768 0 2
172.31.0.62:/export/config/nginx/sites-enabled /etc/nginx/sites-enabled nfs4 rw,noatime,nodev,async,hard,intr,rsize=32768,wsize=32768 0 2
172.31.0.62:/export/config/uwsgi/apps-available /etc/uwsgi/apps-available nfs4 rw,noatime,nodev,async,hard,intr,rsize=32768,wsize=32768 0 2
172.31.0.62:/export/config/uwsgi/apps-enabled /etc/uwsgi/apps-enabled nfs4 rw,noatime,nodev,async,hard,intr,rsize=32768,wsize=32768 0 2
Thanks heaps.
UPDATE:
It looks like this is not only related to PHP-FPM; I can even replicate it by refreshing a static HTML page. Whenever the server starts to get stuck, running nfsstat shows that calls and authrefrsh go up very quickly.
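To put numbers on "very quickly", here is a minimal Python sketch that samples the same client-side RPC counters and prints their rate per second. It assumes a Linux client where these counters are exposed in /proc/net/rpc/nfs on a line of the form "rpc <calls> <retrans> <authrefrsh>"; the function names (parse_rpc_counters, counter_rates, watch) are made up for illustration and are not part of any NFS tool.

```python
import time


def parse_rpc_counters(text):
    """Extract calls/retrans/authrefrsh from the 'rpc' line.

    Assumes the Linux /proc/net/rpc/nfs layout: 'rpc <calls> <retrans> <authrefrsh>'.
    """
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "rpc":
            calls, retrans, authrefrsh = (int(f) for f in fields[1:4])
            return {"calls": calls, "retrans": retrans, "authrefrsh": authrefrsh}
    raise ValueError("no 'rpc' line found")


def counter_rates(before, after, seconds):
    """Per-second increase of each counter between two samples."""
    return {name: (after[name] - before[name]) / seconds for name in before}


def watch(path="/proc/net/rpc/nfs", interval=1.0, samples=10):
    """Print counter rates once per interval (path is an assumption, see above)."""
    with open(path) as f:
        previous = parse_rpc_counters(f.read())
    for _ in range(samples):
        time.sleep(interval)
        with open(path) as f:
            current = parse_rpc_counters(f.read())
        print(counter_rates(previous, current, interval))
        previous = current
```

Calling watch() on each client, once with a single client attached and once with both, would give comparable rates to include in the question.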