
After several hours of data transfer to a FAT partition through an NFSv3 server, the server has no free memory left.

What happens:

  • For each NFS Write command, the NFS daemon opens the file, writes the data received and releases the file.
  • When the FAT partition is mounted with the flush option, a function named congestion_wait is called on each file release. This function can wait for up to 100ms.
  • The kernel we are using is version 3.16, and we didn't have the problem with version 2.6.37. I discovered that one of the differences between them is that, from version 3.6 on, the fput function (called by nfsd) uses a work queue to release the file.

The problem is that the NFS daemon may have to process NFS Write commands faster than the FAT file system can release files. The work queue can then grow until the memory is full. In our case the memory fills at 100MB/hour with a transfer rate of 50 Mbit/s.
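
To put rough numbers on this, here is a minimal sketch of the backlog growth. It is a simplified model, not a description of what the kernel actually does: it assumes every NFS Write queues exactly one release, that releases are processed one at a time, and that each one blocks for the full congestion_wait timeout. The function name and the 32KiB write size are hypothetical; the real write size depends on the client's mount options.

```python
def pending_releases_per_hour(transfer_bits_per_s, write_size_bytes, release_wait_s):
    """Estimate how many deferred file releases pile up in one hour.

    Simplified model (assumptions, not measured kernel behaviour):
    - every NFS Write carries write_size_bytes and queues exactly one release,
    - releases are processed one at a time,
    - each release blocks for the full release_wait_s in congestion_wait.
    """
    writes_per_s = transfer_bits_per_s / 8 / write_size_bytes
    releases_per_s = 1.0 / release_wait_s
    # The backlog only grows when writes arrive faster than releases complete.
    growth_per_s = max(0.0, writes_per_s - releases_per_s)
    return growth_per_s * 3600


if __name__ == "__main__":
    rate = 50_000_000        # 50 Mbit/s, the transfer rate quoted above
    write_size = 32 * 1024   # hypothetical NFS write size in bytes
    for wait in (0.100, 0.010):
        backlog = pending_releases_per_hour(rate, write_size, wait)
        print(f"wait={wait * 1000:.0f} ms: about {backlog:.0f} releases queued per hour")
```

Under these assumptions the backlog stops growing only when fewer than 1/timeout Write commands arrive per second, i.e. below 10 per second with 100ms and below 100 per second with 10ms. With the hypothetical 32KiB writes, 50 Mbit/s corresponds to about 190 Write commands per second, so in this model a 10ms timeout would slow the growth but not stop it.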

I am looking for a way to avoid this issue, and I am thinking of reducing the congestion_wait timeout from 100ms to 10ms.

Does anybody know why 100ms was chosen, and whether it is safe to reduce this value?

FYI the FAT file system flush option was introduced by the commit https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/fs/fat/file.c?id=ae78bf9c4f5fde3c67e2829505f195d7347ce3e4.
