
I'm maintaining some git pre-commit hooks, and I keep wanting to do something for all files that are or should be under revision control. Knowing the project structure lets me do a decent job of this, but all the info about build system output directories, test log files, and editor droppings is already in .gitignore.

Is there a simple way to filter file paths based on whether they match a pattern in .gitignore?

In other words, what can I substitute for WHAT GOES HERE in

find "$(git rev-parse --show-toplevel)" --my --filters | WHAT GOES HERE

so that I get all and only the un-gitignored files that match my filters?

I think I can get a negative filter that I might tee into comm by doing

... | xargs git ls-files -X .gitignore -i

but I was hoping for a single step.

Mike Samuel
  • Possible duplicate of [Git command to show which specific files are ignored by .gitignore](https://stackoverflow.com/questions/466764/git-command-to-show-which-specific-files-are-ignored-by-gitignore) – phd Jun 18 '18 at 14:37

1 Answer


UPDATE - As noted in an exchange of comments on this answer, the check-ignore command says that it lists ignored files, but in the event your ignore rules include exceptions (patterns that start with !), files matching those patterns are printed as well even though the file is not ignored. While some of the docs can be read as describing this behavior, other parts of the same docs strongly imply that it's not what's intended - so I regard it as a bug, but regardless of such interpretations, it's how the software works.

So... If you don't use ! patterns, the below works as advertised. If you do use ! patterns, then you could work around this by using --verbose output and post-processing to see if a matching pattern is an inclusion or an exclusion.
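The `--verbose` post-processing mentioned above might look like the following sketch. It is not part of the original answer: the function name `not_ignored` is made up for illustration, and it assumes one path per line on stdin, no tabs, newlines, or other characters in the paths that git would quote, and no colon in the path of the ignore file itself. It relies on the documented verbose output format (`<source>:<linenum>:<pattern>`, a tab, then the path) and on `--non-matching`, which emits a line with empty fields for paths that match no pattern.

```shell
# Hypothetical helper: read paths on stdin (one per line) and print only
# those that git does not actually ignore.
# A path is kept when no pattern matched it, or when the matching pattern
# is a "!" exception.
not_ignored() {
    git check-ignore --verbose --non-matching --stdin |
    awk -F '\t' '{
        rule = $1
        # Strip the "<source>:<linenum>:" prefix, leaving only the pattern.
        sub(/^[^:]*:[0-9]*:/, "", rule)
        if (rule == "" || substr(rule, 1, 1) == "!") print $2
    }'
}
```

It could be used as the tail of a pipeline run from inside the work tree, for example `find . -path ./.git -prune -o -type f -print | not_ignored`. Because git is started once for the whole list rather than once per file, this may also be faster on large trees.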


Getting the exact behavior you want with ls-files may not be as easy as it seems. To start, you probably don't mean -i since that would only list ignored files...

But anyhow, a different (more "one-step") approach would be:

In your find command, you can use an -exec action to call git check-ignore for each file matching your other filters.

find "$(git rev-parse --show-toplevel)" <filters> -not -exec git check-ignore -q {} \; <actions>

This will properly interpret the ignore rules from all sources.

By default that also means that if a file is in the index, it does not show up as "excluded" even if it's in .gitignore, which reflects how ignore rules really behave.

But if you want to not process files matching the ignore pattern even though they're in the index and therefore are not really ignored, you can modify the command to do that:

find "$(git rev-parse --show-toplevel)" <filters> -not -exec git check-ignore -q --no-index {} \; <actions>

Since you started from using find, I'm assuming you only care about files that are currently in your work tree in any case.
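To make the `<filters>` and `<actions>` placeholders concrete, here is a small sketch in which the filter is `-type f -name <glob>` and the action is `-print`. The function name `unignored_by_name` and its arguments are illustrative assumptions, not something from the answer. It has to be run from inside the work tree so that `git rev-parse` and `git check-ignore` can find the repository.

```shell
# Hypothetical helper: print regular files under the repository root whose
# names match the glob in $1 and that git check-ignore does not report.
# Pass --no-index as $2 to also drop tracked files that match an ignore pattern.
unignored_by_name() {
    find "$(git rev-parse --show-toplevel)" -type f -name "$1" \
        -not -exec git check-ignore -q ${2:+"$2"} {} \; -print
}
```

With `*.log` in .gitignore and a force-added `tracked.log`, `unignored_by_name '*.log'` would be expected to print `tracked.log` (it is in the index, so it is not really ignored), while `unignored_by_name '*.log' --no-index` would print nothing.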

You may also want to exclude the .git directory. If .git is the only "dot-file" in your top-level directory, you could say

find "$(git rev-parse --show-toplevel)"/* <filters> -not -exec git check-ignore -q --no-index {} \; <actions>

If you can't make that assumption, then you could

find "$(git rev-parse --show-toplevel)" -path "$(git rev-parse --show-toplevel)"/.git -prune -o <filters> -not -exec git check-ignore -q --no-index {} \; <actions>

which is a bit ugly due to the two calls to rev-parse. You could instead capture the rev-parse result in a shell variable before running find, but that may run afoul of your "one step" preference. Another option, if you can safely skip any directory named .git, is:

find "$(git rev-parse --show-toplevel)" -path '*/.git' -prune -o <filters> -not -exec git check-ignore -q --no-index {} \; <actions>
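
The variant that captures the rev-parse result in a variable could be wrapped in a small function, as in this sketch. The names `list_unignored` and `repo_top` are hypothetical, and the choice of `-type f` plus `-print` as the fixed filter and action is an assumption made for illustration. Extra find predicates are passed through as arguments; anything containing `-o` should be wrapped in `\( ... \)` by the caller.

```shell
# Hypothetical helper: list regular files under the repository root that
# git does not ignore, skipping the top-level .git directory.
# Any arguments are inserted as additional find predicates.
list_unignored() {
    # Run rev-parse once and reuse the result for both find arguments.
    repo_top=$(git rev-parse --show-toplevel) || return 1
    find "$repo_top" -path "$repo_top/.git" -prune -o \
        -type f "$@" -not -exec git check-ignore -q {} \; -print
}
```

For example, `list_unignored -name '*.sh'` would list the shell scripts that are not ignored, and `list_unignored` with no arguments would list every file that is not ignored. Adding `--no-index` after `-q` gives the pattern-only behavior described earlier.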
Mark Adelsberger
  • Thanks. `git check-ignore` is the piece I was missing and thanks for showing the particular flags that help it work as part of a negative `find` predicate. – Mike Samuel Jun 18 '18 at 15:34
  • Good answer, but to be more explicit: be wary that the command will display files that match the **ignore pattern** (which the answer says), not whether they are ignored by `git add` (which the OP seems to want); if the rules contain *.html and !foobar.html, foobar.html would still be displayed. – NoDataFound Jan 24 '19 at 17:03
  • @NoDataFound - That's interesting; I hadn't noticed it in my tests. It's also contrary to the documentation, so it's almost certainly a bug. – Mark Adelsberger Jan 24 '19 at 19:54
  • The documentation says it: https://git-scm.com/docs/git-check-ignore _By default, any of the given pathnames **which match an ignore pattern** will be output, one per line. If no pattern matches a given path, nothing will be output for that path; this means that path will not be ignored._ However the header is misleading: _For each pathname given via the command-line or from a file via --stdin, **check whether the file is excluded by .gitignore** (or other input files to the exclude mechanism) and output the path if it is excluded._ – NoDataFound Jan 24 '19 at 20:20
  • Note that I have tested the command with several files (because I wanted to remove them from my tree using `git filter-branch`) using `find -type f -print0|xargs -r0 git check-ignore --no-index -v`, not one as you are doing. This might differ. – NoDataFound Jan 24 '19 at 20:21
  • @NoDataFound - Based on that snippet from the docs (and the paragraph that follows it) I ran an additional test, and found that even if the *only* rule in the .gitignore is !foobar.html still check-ignore will output the path as "matching a pattern". This means that for all intents and purposes, if your ignore rules include exceptions, the command is practically meaningless unless you use `--verbose` (so you can tell whether the match is exclusive or inclusive). So I'm standing by my analysis that it's a defect. (The inconsistent docs suggest it is a design error.) – Mark Adelsberger Jan 24 '19 at 21:32
  • @NoDataFound - I would add that the fact (and stated reasoning) that by default nothing in the index is output (unless you say to ignore the index) further supports that the command's *intent* is what the top line of the documentation says - to show what is ignored (not what matches a pattern but isn't ignored anyway). – Mark Adelsberger Jan 24 '19 at 21:37
  • It would be really appreciated if someone could provide an example of what `<filters>` should be replaced with. I tried deleting it and I get no output; I also tried `-type f` and `-name "*"`. Both produce no output when combined with `-not -exec git check-ignore -q --no-index {} \;`, which runs check-ignore on each file. I understand that `$(git rev-parse --show-toplevel)` sets the search path for find to the root of the working copy, and that it can be replaced with `.` if your working directory is already the top level (or you want to filter lower down in the working copy). – Jason Harrison Mar 08 '22 at 20:33