We have an NFS mount on a RHEL6 VM that supports our version control server - recently, one of the repositories went a bit crazy and this is what I found on the server:

ls -latri repo.git/refs/heads/

total 28
5551210 drwxr-xr-x. 2 git git 8192 Jun  1 21:21 .
5551210 drwxr-xr-x. 2 git git 8192 Jun  1 21:21
5551210 drwxr-xr-x. 2 git git 8192 Jun  1 21:21
5551209 drwxr-xr-x. 3 git git 4096 Jun  1 22:09 ..

When I run tree against the dir, it appears to be infinitely recursive - e.g.:

repo.git/refs/heads/
├──
│   ├──
│   │   ├──
│   │   │   ├──
│   │   │   │   ├──
│   │   │   │   │   ├──
│   │   │   │   │   │   ├──
│   │   │   │   │   │   │   ├──
│   │   │   │   │   │   │   │   ├──
│   │   │   │   │   │   │   │   │   ├──
│   │   │   │   │   │   │   │   │   │   ├──
│   │   │   │   │   │   │   │   │   │   │   ├──
│   │   │   │   │   │   │   │   │   │   │   │   ├──
│   │   │   │   │   │   │   │   │   │   │   │   │   ├──
│   │   │   │   │   │   │   │   │   │   │   │   │   │   ├──
│   │   │   │   │   │   │   │   │   │   │   │   │   │   │   ├──
│   │   │   │   │   │   │   │   │   │   │   │   │   │   │   │   ├──

I have attempted to delete the directory via its inode number:

[root@node repo.git/refs]# ls -latri
total 16
5551210 drwxr-xr-x. 2 git git 8192 Jun  1 21:21 heads

[root@node repo.git/refs]# find . -inum 5551210 -exec rm -rf {} \;
rm: cannot remove `./refs/heads': Directory not empty
find: `./refs/heads/': No such file or directory
find: `./refs/heads/': No such file or directory

I am a bit at a loss what to do here - the inode info on the ls -latri command seems to indicate that there are 2 directories in the 'heads' directory that are hardlinks to the heads directory?

Any ideas on how to clean this up would be most welcome - think I have solved the application issue it was causing but the bigger issue with the filesystem needs to get sorted.

Thank you!

Edit: bit of additional output:

no hidden characters:

[root@node repo.git/refs]# ls -latrib heads/
total 28
5551210 drwxr-xr-x. 2 git git 8192 Jun  1 21:21 .
5551210 drwxr-xr-x. 2 git git 8192 Jun  1 21:21
5551210 drwxr-xr-x. 2 git git 8192 Jun  1 21:21
5551209 drwxr-xr-x. 3 git git 4096 Jun  1 22:09 ..

but here is some fun output when I'm actually in the heads dir:

[root@node repo.git/refs/heads]# ls -latrib
ls: cannot access : No such file or directory
ls: cannot access : No such file or directory
total 12
      ? -?????????? ? ?   ?      ?            ?
      ? -?????????? ? ?   ?      ?            ?
5551210 drwxr-xr-x. 2 git git 8192 Jun  1 21:21 .
5551209 drwxr-xr-x. 3 git git 4096 Jun  1 22:09 ..
oldNoakes
  • Your `ls -latri` output is strange: the link count for inode 5551210 doesn't look right if those two extra directories exist. Could you try `ls -latrib`? What's the underlying filesystem type? – Paul Haldane Jun 01 '17 at 12:50
  • Hey, file system type is nfs4 - the output with the -b flag is the exact same as without - have added what info I could above – oldNoakes Jun 01 '17 at 13:05
  • Have you looked at the problematic directory on the NFS server (the server that your version control VM is mounting the filesystem from)? I think you need to see what it thinks is happening (and it was the filesystem type on the NFS server that I was asking about). – Paul Haldane Jun 01 '17 at 13:08
  • Unfortunately no access to it - corporate environment and such - will raise a request to get someone in there to have a look on their side. If I can get more info I will post it up, cheers! – oldNoakes Jun 01 '17 at 13:22
  • Is the file system intact? Those question marks in the `ls` output are suspect to me. Have you run fsck on the NFS server? – Lacek Jun 02 '17 at 07:57
  • We want to, but it is not that simple - the fileserver is serving our git setup for a large enterprise - we have to schedule some downtime to unmount it and verify the fs. Will update if we ever make it happen :( – oldNoakes Jun 02 '17 at 11:20
  • Have you tried mounting it on a second location as read-only? – cEz Jun 02 '17 at 11:31
  • I strongly recommend to do a **fsck**... in particular, before you see any further corruption. – Has QUIT--Anony-Mousse Jun 03 '17 at 11:31
  • @oldNoakes: After you've scheduled the downtime, schedule the setup of a backup server. What would you do if it broke down *now*? – Martin Schröder Jun 07 '17 at 18:16
  • still haven't gotten to the bottom of this (no one wants to allow the downtime on the system) but we have a regular backup of the relevant application data on the drive that continues to be successful (and is tested into a staging instance nightly). If I ever find out what is going on, I will be sure to update, just no idea when... – oldNoakes Jun 08 '17 at 11:02

1 Answer

First: Git can neither be the cause nor the solution of a problem that manifests as nonsensical output from ls. Stop using Git or other tools on the filesystem and unmount it to avoid harm.

This looks like either a broken filesystem or a broken mount. Try unmounting and remounting the filesystem on the client. Try fully rebooting the client. Try doing the same mount on another client. Each time, check the ls output to see if it becomes normal. This will help you diagnose whether the problem is on the NFS server side. If the ls output continues to look the same, investigation and repair of the filesystem (fsck or whatever) and/or the NFS service (restarting NFS-related daemons; reboot if nfsd is in-kernel) needs to take place on the server side.
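
As a rough sketch of the "check ls after each step" loop above, something like the following could be used on the client. The helper name `check_dir` and the mount point `/mnt/repos` are illustrative assumptions, not taken from the original setup. It relies on `ls` exiting non-zero when it cannot stat an entry (the `?????????` rows), and it changes into the directory first because, in the question, the errors only appeared when `ls` was run from inside `heads`.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if `ls -la`, run from inside the directory,
# can read every entry; returns 1 if cd or ls reports any error
# (e.g. "ls: cannot access ...: No such file or directory").
check_dir() {
    if (cd -- "$1" && ls -la) >/dev/null 2>&1; then
        echo "OK: $1 lists cleanly"
    else
        echo "SUSPECT: ls reported errors for $1"
        return 1
    fi
}

# Example sequence (left commented out: it needs root and an idle mount;
# /mnt/repos is an assumed mount point with an fstab entry):
#   check_dir /mnt/repos/repo.git/refs/heads    # before
#   umount /mnt/repos && mount /mnt/repos
#   check_dir /mnt/repos/repo.git/refs/heads    # after remount
```

If the helper still prints SUSPECT after a remount, a client reboot, and a mount from a second client, that would point at the server side rather than this client.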

ruief