I have a Git repository in ~/foo. I created a bar/ folder with some contents inside it, at ~/foo/public/bar/. Git correctly recognizes it as untracked:

~/foo git status -s
?? public/bar/

I've always used git clean -fd to delete untracked folders, but it doesn't work for some reason. When I run it, nothing happens:

~/foo git clean -fd
~/foo git status -s
?? public/bar/

Has something changed in Git, or am I missing something? I'm using Git 2.32.0.

Robo Robok
  • Could you [edit] in the output of `git status --untracked-files=all --ignored` in case there are any additional clues there. – IMSoP Jun 14 '21 at 17:31
  • @IMSoP mystery solved! There was another GIT repository inside `public/bar` and it looks like `git clean -fd` doesn't touch nested repositories. – Robo Robok Jun 16 '21 at 10:31

1 Answer

Folders themselves are entirely uninteresting to Git: only files can be untracked, because only files are ever tracked in the first place.¹ So when git status says:

?? public/bar/

it is hiding something. What it is hiding is the fact that there is at least one file underneath public/bar/ somewhere.

Running git clean -fd won't necessarily remove this untracked file. In particular, without -x or -X, git clean avoids removing any ignored-while-untracked files.

As IMSoP mentions in a comment, using git status --untracked-files=all --ignored would get us more information. We would see the names of the various files within public/bar/. What git status does here is note that there's no need to announce each file, one by one, when it can just summarize that there are multiple files by announcing the containing directory public/bar/ (with trailing slash).
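To make the difference between the summary and the detailed listing concrete, here is a minimal sketch in a throwaway repository. The file names (`notes.txt`, `debug.log`, `index.html`) and the `*.log` ignore rule are assumptions made up for the demonstration; they are not from the question.

```shell
#!/bin/sh
# Sketch only: builds a scratch repository in a temporary directory.
# All file names and the '*.log' rule are hypothetical.
set -e
cd "$(mktemp -d)"
git init -q foo
cd foo

mkdir -p public/bar
echo '*.log' > .gitignore
echo page > public/index.html
git add .gitignore public/index.html
git -c user.name=demo -c user.email=demo@example.com commit -q -m initial

# One untracked-and-not-ignored file, one untracked-and-ignored file.
echo keep  > public/bar/notes.txt
echo noise > public/bar/debug.log

# The default listing summarizes the directory ...
summary=$(git status -s)
# ... while these flags list each file, including ignored ones.
detail=$(git status -s --untracked-files=all --ignored)
echo "$summary"
echo "$detail"

# Without -x, git clean leaves the ignored file (and so the directory) alone.
git clean -fd -q
left_after_fd=$(ls public/bar)
echo "left after 'git clean -fd': $left_after_fd"

# With -x, the ignore rules are not applied and the directory goes away.
git clean -fdx -q
```

In this setup the short status reports only the directory, the detailed form names both files (the ignored one marked `!!`), and `git clean -fd` leaves the ignored file behind until `-x` is added.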


¹ This is not quite true, because of submodules, but it's close enough to let one think about the problem. Also, the git clean documentation talks about "untracked directories" under the description of the -d option, so what does this even mean? The clue is in the description of the -x and -X options:

-x
      Don’t use the standard ignore rules ...

There's something missing here in general, which is how the ignore rules work. Even the gitignore documentation, which this refers to in the section I snipped, doesn't cover the key detail, which is:

  • To find untracked files, Git must walk the trees of files within your working tree. That is, it has to peer into each directory (or folder, if you prefer that term) to see what files exist within that directory.
  • Walking file trees—opening and reading directories to get a list of files, then recursively opening and reading every sub-directory—is slow.
  • Gitignore rules—the lines in the .gitignore file that ignore particular patterns—can list directory names, either explicitly with trailing slashes, or implicitly because a directory name matches some gitignore line.
  • So, if Git can determine in advance that every file within a directory (say, public/bar/) would be ignored, Git can simply avoid opening that directory and reading it. There's no point: everything Git finds here would be ignored!

The shortcut in the last bullet point saves time. In a large build, on typical modern systems, it can save literally seconds in a git status run (sometimes tens of seconds, or even greater orders of magnitude). So both git status and git clean take advantage of this when possible. When they have to enumerate the actual files within some everything-will-be-ignored directory, though, they still have to open and read the directory.
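The effect of that shortcut is visible from the outside. The following is a small sketch, assuming a hypothetical ignore rule `public/bar/` and made-up file names, of how a fully ignored directory is reported as a single entry unless the individual files are asked for explicitly.

```shell
#!/bin/sh
# Sketch only: scratch repository with a directory-level ignore rule.
# The rule 'public/bar/' and the *.tmp file names are hypothetical.
set -e
cd "$(mktemp -d)"
git init -q foo
cd foo

mkdir -p public/bar
echo 'public/bar/' > .gitignore
echo page > public/index.html
git add .gitignore public/index.html
git -c user.name=demo -c user.email=demo@example.com commit -q -m initial

echo a > public/bar/a.tmp
echo b > public/bar/b.tmp

# Everything under public/bar/ is ignored, so plain status has nothing to say.
plain=$(git status -s)
# --ignored reports the directory as one entry.
collapsed=$(git status -s --ignored)
# Adding --untracked-files=all asks for the individual files inside it.
expanded=$(git status -s --ignored --untracked-files=all)

echo "plain:     [$plain]"
echo "collapsed: [$collapsed]"
echo "expanded:"
echo "$expanded"
```

Note that plain `git status -s` prints nothing for a directory whose contents are all ignored, so a `?? public/bar/` line means at least one entry in there is not covered by the ignore rules.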

torek
  • Hey torek, thank you for your detailed answer, and sorry for the late reply. So, I found out what was wrong. Let me start by saying that if all files were going to be ignored, `git status` wouldn't list that folder at all. In fact, after some trial and error I learned that `git clean -fd` didn't work because there was another `.git` folder in this folder! So it turns out that it doesn't remove nested repositories, but I'm not sure why. – Robo Robok Jun 16 '21 at 10:29
  • Aha. Usually you'd see the `public/bar/` thing when there exists one or more ignored files *and* one or more un-ignored files. You would also see it in some other special cases. The sub-repository case is one of those. Note that `git clean` has `-f -f` to force it to recurse into such sub-repositories, but using this is rarely a good idea. Look up submodules, which get rather complicated. – torek Jun 16 '21 at 14:17
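
The nested-repository case from these comments can be reproduced in a scratch directory. This is a minimal sketch, with `index.html` and `file.txt` as made-up file names; `git clean -ffd` is the combined spelling of `-f -f -d`. Run it only somewhere disposable, since the second clean really does delete the inner repository.

```shell
#!/bin/sh
# Sketch only: an outer repository containing an untracked inner repository.
# File names are hypothetical; the layout mirrors ~/foo/public/bar from the question.
set -e
cd "$(mktemp -d)"
git init -q foo
cd foo

mkdir public
echo page > public/index.html
git add public/index.html
git -c user.name=demo -c user.email=demo@example.com commit -q -m initial

# Create a second repository inside the first one.
git init -q public/bar
echo hello > public/bar/file.txt

before=$(git status -s)
echo "before:          [$before]"

# A single -f refuses to touch the nested repository.
git clean -fd
after_one_f=$(git status -s)
echo "after clean -fd:  [$after_one_f]"

# A second -f forces git clean to remove it.
git clean -ffd
after_two_f=$(git status -s)
echo "after clean -ffd: [$after_two_f]"
```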