phd's answer is correct but deserves some expansion.
If you look at the current documentation for git commit (this evolves over time), it should include the --include and --only options:
-i
--include
Before making a commit out of staged contents so far, stage the contents of paths given on the command line as well. This is usually not what you want unless you are concluding a conflicted merge.
-o
--only
Make a commit by taking the updated working tree contents of the paths specified on the command line, disregarding any contents that have been staged for other paths. This is the default mode of operation of git commit if any paths are given on the command line, in which case this option can be omitted. [snip]
As that last-quoted sentence says, the default action, when adding path names to your git commit command, is to behave as git commit --only. This particular action is achieved in a remarkably complex fashion, which can confound some pre-commit hooks.
The --include behavior is easier to describe, though this easy/simple description is slightly flawed (see below for a thorough and correct description). Using git commit --include with:
$ git add file1.txt
$ git commit --include file2.txt
for instance is essentially equivalent to doing:
$ git add file1.txt
$ git add file2.txt
$ git commit
That is, --include simply runs git add for you, though with the complication that if the commit fails, these files are magically "un-added".
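As a rough check of that simple description, here is a sketch that runs in a throwaway repository. The temporary directory, the identity variables, and the file contents are all made up for the demo. I keep both files tracked from the start, on the assumption that the path arguments to git commit are matched against files Git already knows about.

```shell
#!/bin/sh
# Throwaway-repository demo; identity values and file contents are placeholders.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q

# Start with both files tracked and committed.
echo one > file1.txt
echo two > file2.txt
git add file1.txt file2.txt
git commit -q -m "initial"

# Modify both, stage only file1.txt, and let --include pick up file2.txt.
echo more >> file1.txt
echo more >> file2.txt
git add file1.txt
git commit -q --include file2.txt -m "both changes"

# Collect what the new commit changed and what is still pending.
changed=$(git diff --name-only HEAD~1 HEAD | paste -sd ' ' -)
leftover=$(git status --porcelain)
echo "changed in commit: $changed"
echo "left over: ${leftover:-nothing}"
```

If the description holds, both paths are reported as changed by the new commit and nothing is left staged or modified afterwards.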
The --only option, however, is much more complicated. There's no simple way to describe it that is also fully correct. To describe both of these properly, we need to understand Git's index in some detail.
Technical details: the index
When Git makes a new commit, it always¹ does so from an index or staging area or cache. These are three words for the same thing. The index / staging-area / cache is the way Git keeps track of what you would like committed. Except for the special case of a conflicted merge,² the index holds your proposed next commit. When you first git checkout or git switch to some commit, Git fills in its index from that commit. So your proposed next commit matches your current commit.
You may have noticed here that I sometimes say the index, or Git's index, as if there is exactly one index, but I also sometimes say an index, as if there can be more than one. The tricky part here is that both are correct: there is one particular distinguished index—the index—but you can have more than one.
Technically, the distinguished index is per-work-tree: if you use git worktree add, you not only add another working tree, but also another distinguished index, for that particular working tree. You can find the file name of the distinguished index with:
git rev-parse --git-path index
which normally prints .git/index, but in an added work-tree, prints something else. If $GIT_INDEX_FILE is set in the environment, it prints this variable's value. This is how Git swaps to some alternate index file. More precisely, it's the externally available mechanism that you can use to point Git to some alternate index file, and a way for a pre-commit hook to detect a git commit --only invocation, for instance.
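To see the three situations side by side, here is a small sketch in a throwaway repository. The directory names, the alt-index path, and the identity variables are invented for the demo, and the alt-index file is never actually created, since the query only reports a path.

```shell
#!/bin/sh
# Throwaway-repository demo; all names and paths here are placeholders.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
echo hello > file.txt
git add file.txt
git commit -q -m "initial"

# The distinguished index of the main work-tree.
main_index=$(git rev-parse --git-path index)

# An added work-tree gets its own distinguished index.
wt="$repo-wt"
git worktree add --detach "$wt" >/dev/null 2>&1
added_index=$(cd "$wt" && git rev-parse --git-path index)

# With GIT_INDEX_FILE set, the same query reports the override instead.
override=$(GIT_INDEX_FILE="$repo/alt-index" git rev-parse --git-path index)

echo "main work-tree:  $main_index"
echo "added work-tree: $added_index"
echo "with override:   $override"
```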
When you run git add, Git finds the existing entry, in the index, for the file you're git add-ing:
- If there is no existing entry (that is, if this is a new file), Git copies the file into Git's index and now there is an existing entry: your proposed new commit has a newly added file, as compared to the current commit.
- Otherwise, there is some existing file in Git's index. Git boots this file out of its index, and copies the work-tree version of the file into its index. If this copy of the file is different from the copy in the current commit, git status will now say that the file is staged for commit.
So, git add simply updates your proposed next commit, which at all times (but see footnote 2) holds a copy of every file that Git will snapshot. The copy that's in the index is the one git commit will use.
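You can watch the index entry being replaced with git ls-files --stage, which prints the blob hash ID stored in the index for a path. The following is only a sketch in a throwaway repository; the file name, its contents, and the identity variables are placeholders.

```shell
#!/bin/sh
# Throwaway-repository demo; file name and contents are placeholders.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "line 1" > file.txt
git add file.txt
git commit -q -m "initial"

# Output format is "<mode> <blob-hash> <stage><TAB><path>"; keep the hash.
entry_before=$(git ls-files --stage file.txt | cut -d' ' -f2)

# Editing the work-tree file alone does not touch the index entry.
echo "line 2" >> file.txt
entry_unchanged=$(git ls-files --stage file.txt | cut -d' ' -f2)

# git add boots the old copy out and stores the work-tree version.
git add file.txt
entry_after=$(git ls-files --stage file.txt | cut -d' ' -f2)
worktree_blob=$(git hash-object file.txt)

echo "before edit: $entry_before"
echo "after edit:  $entry_unchanged"
echo "after add:   $entry_after"
echo "work-tree:   $worktree_blob"
git status --short
```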
Now that we know how the index works, and that Git can use extra, temporary index files that we can create, we can really see how git commit --include and git commit --only work.
¹ This is correct for git commit, but if you use git commit-tree you can bypass the need for the index. You must supply, to git commit-tree, the hash ID of the tree. Where will you get that tree? If you use git write-tree, that uses the index. But you can get a tree from somewhere else, by, e.g., just using some existing tree, or using git mktree. Note, however, that with git mktree you can build incorrect trees; the resulting commit will be impossible to check out.
² During a conflicted merge, Git expands the index. This expanded index cannot be written out: git write-tree complains and aborts. Using git add or git rm, you replace the expanded index entries with normal entries, or remove some entries entirely. Once there are no expanded, non-zero-stage entries left, the conflicts are all resolved, because git write-tree can now write out the index: committing becomes possible again.
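To illustrate footnote 1, here is a sketch that makes a commit object from an existing tree without consulting the index at all. It runs in a throwaway repository; the file name, commit messages, and identity variables are placeholders. Note that git commit-tree only creates the commit object. No branch name is updated here.

```shell
#!/bin/sh
# Throwaway-repository demo; names and messages are placeholders.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
echo base > file.txt
git add file.txt
git commit -q -m "initial"

# Stage a change, so that the index now differs from HEAD.
echo staged-only >> file.txt
git add file.txt

# Reuse the tree of the current commit instead of running git write-tree.
tree=$(git rev-parse 'HEAD^{tree}')
commit=$(git commit-tree "$tree" -p HEAD -m "same tree, index not consulted")

# The new commit holds the old tree; the staged change is still only staged.
new_tree=$(git rev-parse "$commit^{tree}")
still_staged=$(git diff --cached --name-only)
echo "new commit $commit uses tree $new_tree"
echo "still staged: $still_staged"
```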
Technical details: --include and --only
To implement git commit --include, Git does this, more or less:
- copies the index to a temporary one ("an" index);
- runs git add on the temporary index, with the files you're include-ing;
- attempts the commit.
An attempted commit can succeed (creating a new commit and updating the current branch name) or it can fail. The commit fails, for instance, if git commit runs your editor and then you choose to delete the entire commit message. Perhaps you were looking at something and realized you shouldn't commit yet, so you did that. Or, the commit fails if the pre-commit hook decides that this commit is not ready yet. Note that the pre-commit hook should look at the temporary index here! It should not look at the files in your working tree. That's not necessarily what will be in the commit. Your proposed next commit is now whatever is in the temporary index.
If the commit fails, Git simply removes the temporary index. The original index (the index) is untouched, so everything is now back the way it was. The git adds in step 2 are magically undone.
If the commit succeeds, Git simply replaces the index with the temporary index. Now the index and the current commit—which is the one we just made—match, so that nothing is "staged for commit". That's how we like it.
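The three steps can be imitated by hand with plumbing commands and $GIT_INDEX_FILE. This is only an emulation under my own assumptions, not what git commit literally runs: the temporary file name is invented, and git write-tree, git commit-tree, and git update-ref stand in for "attempts the commit". It runs in a throwaway repository.

```shell
#!/bin/sh
# Hand-rolled emulation of "git commit --include file2.txt"; a sketch only.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
echo one > file1.txt
echo two > file2.txt
git add file1.txt file2.txt
git commit -q -m "initial"

echo more >> file1.txt
echo more >> file2.txt
git add file1.txt                       # staged in the real index

index="$PWD/$(git rev-parse --git-path index)"
tmp_index="$index.demo-tmp"             # hypothetical name; Git picks its own

# Step 1: copy the index. Step 2: git add into the copy only.
cp "$index" "$tmp_index"
GIT_INDEX_FILE="$tmp_index" git add file2.txt

# The real index has not been touched: only file1.txt is staged there.
staged_in_real_index=$(git diff --cached --name-only | paste -sd ' ' -)

# Step 3: attempt the commit, building the tree from the temporary index.
tree=$(GIT_INDEX_FILE="$tmp_index" git write-tree)
commit=$(git commit-tree "$tree" -p HEAD -m "emulated --include")
git update-ref HEAD "$commit"

# Success: the temporary index becomes the index.
# (On failure we would simply remove "$tmp_index" instead.)
mv "$tmp_index" "$index"

changed=$(git diff --name-only HEAD~1 HEAD | paste -sd ' ' -)
leftover=$(git status --porcelain)
echo "real index before commit had staged: $staged_in_real_index"
echo "commit changed: $changed"
echo "left over: ${leftover:-nothing}"
```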
Implementing git commit --only is harder. There are still two cases: the commit can fail, or the commit can succeed. For the "fail" case, we want to have the same thing happen as for git commit --include: the index, the main distinguished one, is undisturbed, as if we didn't even attempt to run git commit. But, for the success case, git commit --only is tricky (and the documentation is, I think, slightly inadequate).
Suppose we do this:
$ git checkout somebranch # extract a commit that has files
$ echo new file > newfile.txt # create an all-new file
$ git add newfile.txt # stage the all-new file (copy into index)
$ echo mod 1 >> file.txt # append a line to an existing file
$ git add file.txt # stage the updated file (copy into index)
$ echo mod 2 >> file.txt # append *another* line to the file
$ git commit --only file.txt -m "test"
What would we like as the outcome, if this succeeds? We told Git to commit the two-line addition. Our working tree copy of the file is the two-added-lines version. Should the staged file, proposed for next commit after our test commit, have just the one added line? Or should it have both added lines?
Git's answer to this question is that it should have both added lines. That is, if the git commit works, git status should now say nothing about file.txt; it should only say that newfile.txt is a new file. The two-added-lines version of the file must therefore be the one in the proposed next commit, at this point. (You might agree with Git, or disagree with it, but that's what the Git authors chose to have as the result.)
What this means is that we need three versions of the index at the point where git commit --only attempts to make the commit:³
- One, the original index, will have the new file in it, and the one added line.
- One, the index to be used by git commit to make the new commit, will not have the new file in it, but will have the two added lines to file.txt.
- The last one will have the new file in it, and the two added lines to file.txt in it.
The middle one of these three is the one git commit will use when attempting to make the new commit. That has the two added lines, but not the new file: it's the git commit --only action, in action.
If the commit fails, git commit simply removes both of the temporary index files, leaving the original index (the index) undisturbed. We now have one added line in the proposed next commit's version of file.txt, and we have the newly added file in the proposed next commit as well, as if we never ran git commit --only file.txt at all.
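The failure case is easy to provoke with a pre-commit hook that always rejects. The sketch below, in a throwaway repository, installs such a hook (which also records whatever $GIT_INDEX_FILE it was handed, purely out of curiosity) and then checks that the original index survived. The hook body, the seen-index file name, and the identity variables are inventions for this demo.

```shell
#!/bin/sh
# Throwaway-repository demo of a rejected "git commit --only"; a sketch only.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
echo base > file.txt
git add file.txt
git commit -q -m "initial"

# A hook that notes which index file it was given, then rejects the commit.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
echo "${GIT_INDEX_FILE:-<unset>}" > .git/seen-index
exit 1
EOF
chmod +x .git/hooks/pre-commit

echo new file > newfile.txt
git add newfile.txt
echo mod 1 >> file.txt
git add file.txt
echo mod 2 >> file.txt

if git commit -q --only file.txt -m "test" 2>/dev/null; then
    outcome=committed
else
    outcome=rejected
fi

# The original index should be exactly as we left it.
staged=$(git diff --cached --name-only | paste -sd ' ' -)
staged_mods=$(git show :file.txt | grep -c mod)
echo "outcome: $outcome"
echo "hook saw GIT_INDEX_FILE: $(cat .git/seen-index)"
echo "staged paths: $staged"
echo "mod lines in staged file.txt: $staged_mods"
```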
If the commit succeeds, git commit makes the last index (the one with both the newly added file and the two-added-lines version of file.txt) become the (main / distinguished) index. The original index and the temporary index used for doing the commit both get removed.
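The success case can be checked the same way. This sketch repeats the experiment in a throwaway repository (there is no somebranch here; a single initial commit with file.txt stands in for it, and the identity variables are placeholders), then inspects the new commit and the index that is left behind.

```shell
#!/bin/sh
# Throwaway-repository demo of a successful "git commit --only"; a sketch only.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
echo base > file.txt
git add file.txt
git commit -q -m "initial"

echo new file > newfile.txt      # create an all-new file
git add newfile.txt              # stage it
echo mod 1 >> file.txt           # first appended line
git add file.txt                 # stage the one-line version
echo mod 2 >> file.txt           # second appended line, not staged
git commit -q --only file.txt -m "test"

# What went into the commit, and what is still proposed for the next one?
committed_files=$(git ls-tree -r --name-only HEAD | paste -sd ' ' -)
committed_mods=$(git show HEAD:file.txt | grep -c mod)
still_staged=$(git diff --cached --name-only | paste -sd ' ' -)
echo "files in commit: $committed_files"
echo "mod lines committed: $committed_mods"
echo "still staged: $still_staged"
```

The commit should contain only file.txt, with both appended lines, while newfile.txt remains staged for the next commit.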
This is what makes git commit --only so complicated. Suppose you're writing a pre-commit hook yourself, and in this pre-commit hook, you plan to do two things:
- Use a linter to make sure that there are no obvious bugs in any of the code that is to be committed (pylint, pep8, go vet, etc.).
- Use a formatter to make sure that the code conforms to the project's standard (black, go fmt, etc.).
(In my opinion, step 2 is a mistake: don't do it. But others like the idea.)
We now have three cases:
- You're doing a normal git commit. $GIT_INDEX_FILE is not set. There's just one index to worry about. You read the files out of the (normal, everyday, standard) index, into a temporary directory, and lint them there. If the linting fails, you stop and reject the commit. If the linting succeeds, you format the files and git add them back to the (single) index, and let the commit happen.

  There's still a big problem here because the files that just got committed are the ones that were staged, not the ones in the user's working tree. You can, perhaps, check the working tree files against the pre-updated, not-yet-formatted ones in the index, before git add-ing any formatting updates. If the working tree files match the index copies, it might be safe to reformat the working tree copies here too.

- You're doing a git commit --include. There are two index files to worry about, but for linting purposes, you simply read the ones out of the index file that Git is using now for this commit, which is in $GIT_INDEX_FILE (which generally names .git/index.lock at this point).⁴

  You can treat this as before, because any formatting you do will go into the proposed commit, and it's just as safe to wreck the user's working tree files as last time. You've already rejected the commit (and not done any formatting, presumably) if you're going to reject the commit; and if the commit succeeds, as you think it will, the user's --include files should be formatted too, after all. On success, any updates you make to the temporary index will be in the real index, because the temporary index becomes the real index.

- You're doing a git commit --only. There are now three index files to worry about. One of them, the one git commit is going to use, is in $GIT_INDEX_FILE. One of them, the one git commit plans to use to replace the main / distinguished index, is in a file whose name you don't know. The third one, the one that Git will drop back to on failure, is the standard main index.

  You can do your checking as usual: lint / vet the files that are in $GIT_INDEX_FILE. That's the commit the user is proposing to make, after all.

  But now, if you format those files and add them to $GIT_INDEX_FILE... well, the formatted files are the ones that will get committed. That's all well and good. But you also need to git add those formatted files to the temporary index file whose name you don't know! And, when it comes to checking the working tree files against some index copies, you probably should use the copies that are in that temporary index file whose name you don't know.
If you don't change any files, but simply lint / vet them all and check for the desired formatting, these problems go away. So it's best to just check stuff. If the user wants the working tree files formatted according to the project's rules, provide them with a working-tree-files-formatter. Let them run that, and then let them run git add on the updated files (or, if you really must, offer to add back the formatted files in the formatting script).
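A check-only hook along these lines might look roughly like the sketch below. The function name check_staged and its lint-command argument are hypothetical, and so is the choice to compare against HEAD (which assumes the repository already has at least one commit). The function exports the to-be-committed versions from whichever index Git is using (the Git commands honor $GIT_INDEX_FILE on their own) into a scratch directory, runs the given command there, and changes nothing.

```shell
#!/bin/sh
# Check-only pre-commit sketch: inspects staged content, never modifies it.
# check_staged and its argument are hypothetical names for this example.

check_staged() {
    lint_cmd=$1                                   # command run in the export dir
    export_dir=$(mktemp -d) || return 1

    # Export added/copied/modified paths from the index in use.
    git diff --cached --name-only --diff-filter=ACM -z |
        git checkout-index --stdin -z --prefix="$export_dir/"

    # Run the linter against the exported copies, not the work-tree.
    ( cd "$export_dir" && sh -c "$lint_cmd" ) && status=0 || status=$?

    rm -rf "$export_dir"
    return "$status"
}
```

An actual hook would end with something like check_staged 'your-linter-here .' || exit 1, where the linter command is whatever your project uses. Since nothing is written back to any index, it does not matter which of the three cases you are in.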
I've worked with a project where the pre-commit hook did check and then, if the formatting was wrong, checked $GIT_INDEX_FILE and would stop and do nothing for the tough cases, or offer to git add the reformatted files. That too is an option, but it's a little bit risky since it's possible that Git will change some behavior, making the $GIT_INDEX_FILE test fail somehow.
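Such a $GIT_INDEX_FILE test might be sketched as follows. The function name and the three labels are made up, and the classification rests on the observations above plus one assumption of mine: that a value naming the regular index file should be treated like an unset variable, in case some Git version exports it for every hook run. Anything else is lumped into the "tough" bucket rather than guessed at.

```shell
#!/bin/sh
# Classify the index file a hook was handed; a sketch under stated assumptions.
# classify_index_file and its labels are hypothetical names for this example.

classify_index_file() {
    # Look only at the last path component of the argument.
    case "${1##*/}" in
        ''|index)   echo normal ;;            # unset, or the regular index
        index.lock) echo include-like ;;      # what --include was seen to use
        *)          echo other-temporary ;;   # possibly --only: the tough case
    esac
}

# Inside a hook you might call it like this:
classify_index_file "${GIT_INDEX_FILE:-}"
```

A hook could then refuse to reformat anything unless the result is normal, which is the conservative choice if Git's behavior ever changes.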
³ There are no doubt other ways to achieve the desired result, but given the fact that the index file is actually a file (at least initially), plus a bunch of existing Git code, this three-index-files trick was the one to use.
⁴ This was the case the last time I tested all this, but that was quite a while ago, before the existence of git worktree add, which will clearly affect this, at least for added work-trees.