On AWS EC2, I've set up a cluster of Linux virtual machines consisting of one NFS file server and many clients. When the number of clients exceeds ~20 and the I/O load is heavy, I see loss of file integrity: e.g. gzipped files written by a client to the server end up corrupted.
What is the best set of NFS parameters to increase the reliability of data transfer in this environment?
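For reference, a minimal sketch of how this kind of corruption can be detected after a write, assuming Python 3 is available on the client or the server. The function name `gzip_is_intact` is hypothetical; it reads the whole stream so that gzip's built-in CRC and length check runs at the end.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the gzip file at `path` decompresses fully and passes its CRC check.

    Hypothetical helper for illustration only. A missing or unreadable
    file is also reported as not intact.
    """
    try:
        with gzip.open(path, "rb") as fh:
            # Read to the end in chunks; the CRC32 trailer is verified
            # once the end of the compressed stream is reached.
            while fh.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # OSError covers bad headers and CRC mismatches, EOFError covers
        # truncated files, zlib.error covers damaged deflate data.
        return False
```

Running such a check on each file right after it is written would at least show which clients and which load levels produce the damaged files.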
For now the mount flags are:
Flags: rw,vers=3,rsize=262144,wsize=262144,hard,proto=tcp,timeo=600,retrans=2
The MTU is 1500, and the number of NFS daemons is 8.
Should I decrease rsize and wsize below the MTU, and increase the number of NFS daemons?
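A rough sketch of the arithmetic behind this question. Because the mount uses proto=tcp, each 262144-byte read or write request is split by TCP into segments that fit the MTU rather than sent as a single oversized packet. The numbers below assume IPv4 and TCP headers without options (40 bytes in total) and ignore RPC overhead; the function name is hypothetical.

```python
import math


def segments_per_request(request_bytes, mtu, header_bytes=40):
    """Estimate how many TCP segments one NFS request of `request_bytes` needs.

    Assumption: 40 bytes of IPv4 + TCP headers with no options,
    and RPC/NFS header overhead is ignored.
    """
    mss = mtu - header_bytes  # payload bytes carried per segment
    return math.ceil(request_bytes / mss)


if __name__ == "__main__":
    # Values taken from the mount flags and MTU quoted above.
    print(segments_per_request(262144, 1500))
```

Under these assumptions the payload per segment is 1460 bytes, so one full-size request spans roughly 180 segments.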
Is there anything else that could be improved?
Many thanks.