
I recently ran into an issue on my NFS clients related to one specific shared directory. Neither of the applications I use (Sonarr and Plex, which run on different VMs) can read the contents of a certain directory, and when they try, the contents of that directory (all subdirectories and files) disappear across all VMs.

If I am logged in via SSH and looking at one of the subdirectories in this troublesome directory, I get 'Stale file handle' warnings after trying to read a file. When the directory is working, I can view all the files with ls and navigate around the directories in bash without issue. The really odd thing is that all other directories on the NFS mount are completely fine, and both applications just work with those directories. The clients use NFS v4.

My setup is a host server (running CentOS 7, kernel 3.10.0-693.21.1.el7.x86_64) with 6 data drives, all pooled together using MergerFS. I am using NFS to export that merged directory. A variety of VMs on the same server mount the NFS share. The NFS exports are mounted on the clients via /etc/fstab with:

<ip_addr>:/ /media nfs4 rsize=32768,wsize=32768,intr,noatime,bg 0 0

I have tried a variety of things to debug this:

  • I ran lsof in repeat mode (lsof -Nr, I think) to monitor the NFS share, but saw no access from mono when I pressed the Update series button (Sonarr runs on mono).
  • I turned on NFS debugging on the client side but didn't get any useful information, though I am not 100% sure what I am looking for.
  • I turned on NFS debugging on the server side but only got one extra line in the messages file, which complained about a missing hostname.
  • I also tried strace but again couldn't see anything useful.
  • I used smartmontools to check my hard drives for errors; they all passed.
  • nfsstat reports no retransmitted data.
  • I moved the directory that the troublesome media is in, and have since completely removed the troublesome content and re-uploaded it.

When this odd disappearing act happens, unmounting and remounting solves it, and sometimes touching a couple of files in the same directory on the client and server makes the files show up again. I initially thought this issue was specific to Sonarr (which attempts to read a single specified directory), but as Plex in a separate VM is having the same trouble, I believe it is an issue in Linux rather than in the applications themselves.

Does anyone have an idea of what could be causing this odd behaviour, or can anyone offer any assistance in debugging it? If it is helpful, the contents of this troublesome folder are 50.6 GB spread across 283 files. I did just try deleting most of the files to see if that helped, but the issue still occurs.

Thank you

Richard Bale

2 Answers

0

Preliminary remark: I am not sure about your system, but on mine /media currently contains several directories. To avoid interference, I use a different mount point.

I would suggest doing the same for your server's mount point, and of course adapting the client machines accordingly.

Possible source of confusion: I once ran into problems because the mount instructions handled the mount point name inconsistently. Some had a trailing slash (e.g., /media/), others did not (e.g., /media), so they were not in fact looking at the same place, and I was confronted with a problem similar to the one you describe.

I guess you did not make the same mistake, but check just in case!
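One quick way to compare what each client has actually mounted is to print the NFS entries from its mount table. This is only a sketch: list_nfs_mounts is a hypothetical helper name, and it assumes a Linux client where /proc/mounts is available.

```shell
#!/bin/sh
# Hypothetical helper: print "source mountpoint fstype" for every NFS mount.
# Reads a mounts-style table; defaults to /proc/mounts (assumes Linux).
list_nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $1, $2, $3 }' "${1:-/proc/mounts}"
}

# Example: run on each VM and compare the output side by side.
# list_nfs_mounts
# list_nfs_mounts /etc/fstab
```

Running it against both /proc/mounts and /etc/fstab on every VM should show whether any client spells the source or the mount point differently.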

Fibo
  • Thanks for the comment, but this isn't the issue (I just worked it out and will post an answer in a moment). I only have a single mount point in the VMs, /media; they all use the same one, and I have not experienced any issues on any other systems. – Richard Bale Oct 28 '18 at 19:02
0

So I feel like a bit of a dunce, but I worked out what the issue was. It turns out to have nothing to do with NFS, at least to my knowledge.

The issue was that the files I was trying to read in Sonarr had the execute bit set. Running chmod 644 * solved my problems. The execute bit had been set by some mass renaming I had done previously.

I am not sure why the files being 755 caused the issue, but fixing the file permissions solved it.
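For anyone repeating this fix on a tree with subdirectories, note that chmod 644 * also strips the execute (search) bit from any directories the glob matches, which makes them impossible to enter. A more targeted variation, as a sketch only: the helper names below are hypothetical, and -perm /111 assumes GNU find (as shipped with CentOS 7).

```shell
#!/bin/sh
# Hypothetical helper: list regular files under a directory that have
# any execute bit set (user, group or other). Assumes GNU find.
list_exec_files() {
    find "$1" -type f -perm /111
}

# Hypothetical helper: clear the execute bits on regular files only,
# leaving directories (which need their execute bit) untouched.
clear_exec_bits() {
    find "$1" -type f -perm /111 -exec chmod a-x {} +
}

# Example (the path is a placeholder):
# list_exec_files /media/some_directory
# clear_exec_bits /media/some_directory
```

Listing first and clearing second makes it possible to review what will change before touching any permissions.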

Richard Bale