I have a single commit at the start of my master branch containing a .gitignore file.

When I run

git filter-branch -f --tree-filter 'git rm .gitignore' --prune-empty

The rewritten history still contains that commit, although it is now empty (the .gitignore file has been removed), so only half of the job has been done.

Why didn't --prune-empty prune the empty commit? Or did I misunderstand that switch?

Anders Lindén

1 Answer

2016: The documentation for the --prune-empty option of git filter-branch mentions:

this switch only applies for commits that have one and only one parent

If your modified commit was "at the beginning of the master branch", it is a root commit and has 0 parents.
That particular commit, even when empty, will not be pruned.
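
A minimal sketch of how one might check how many parents a commit has, assuming a throwaway repository built only for illustration (the file names, commit messages, and demo identity are hypothetical):

```shell
#!/bin/sh
# Sketch: build a scratch repository and count the parents of each commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

echo "*.log" > .gitignore
git add .gitignore
git commit -q -m "add .gitignore"          # root commit

echo "hello" > file.txt
git add file.txt
git commit -q -m "add file.txt"            # child commit

# `git rev-list --parents -n 1 <rev>` prints the commit id followed by its
# parent ids, so the parent count is the number of words minus one.
root_words=$(git rev-list --parents -n 1 HEAD~1 | wc -w)
head_words=$(git rev-list --parents -n 1 HEAD | wc -w)
root_parents=$((root_words - 1))
head_parents=$((head_words - 1))

echo "root commit parents: $root_parents"
echo "HEAD parents: $head_parents"
```

The root commit prints only its own id, so its parent count is 0, which is the case the "one and only one parent" wording excludes.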


If your modified commit was "at the beginning of a branch",

beginning of a branch b
     |
     v
--x--Y--z--z
      \
       b--b

it should be pruned only if its tree is identical to the tree of the previous commit.

As torek mentions here:

an "empty commit" is really one that has the same tree as the previous commit: it's not that it has no files at all, it's that it has all the same files, with the same modes, and the same contents, as its parent commit.
This is because git stores complete snapshots for each commit, not differences from one commit to the next.

As the doc says:

Some kind of filters will generate empty commits, that left the tree untouched.

So a commit with "0 files" is not an "empty commit" from the point of view of filter-branch, unless the parent commit also has "0 files" (i.e. the same empty "semi-secret" tree).
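
The distinction can be illustrated by comparing tree ids. This is only a sketch on a scratch repository; the three commits A, B, and C and the demo identity are hypothetical:

```shell
#!/bin/sh
# Sketch: "0 files" versus "same tree as the parent".
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

echo "*.log" > .gitignore
git add .gitignore
git commit -q -m "A: add .gitignore"

git rm -q .gitignore
git commit -q -m "B: remove .gitignore"         # 0 files, but differs from A

git commit -q --allow-empty -m "C: no changes"  # same tree as B

# Id of the tree that contains no entries at all.
empty_tree=$(git hash-object -t tree /dev/null)
tree_a=$(git rev-parse 'HEAD~2^{tree}')
tree_b=$(git rev-parse 'HEAD~1^{tree}')
tree_c=$(git rev-parse 'HEAD^{tree}')

echo "empty tree: $empty_tree"
echo "A: $tree_a"
echo "B: $tree_b"
echo "C: $tree_c"
```

Here B has no files (its tree is the empty tree) yet it is not an "empty commit", because its tree differs from A's. C is the "empty commit": its tree is identical to B's.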


Note: this has changed with Git 2.13 (Q2 2017): "git filter-branch --prune-empty" drops a single-parent commit that becomes a no-op, but did not drop a root commit whose tree is empty.

See commit 32da746, commit a582a82, commit 4dacc8f, commit 377a354 (23 Feb 2017) by Devin J. Pohly (djpohly).
(Merged by Junio C Hamano -- gitster -- in commit 5296357, 14 Mar 2017)

filter-branch: fix --prune-empty on parentless commits

Signed-off-by: Devin J. Pohly

Previously, the git_commit_non_empty_tree function would always pass any commit with no parents to git-commit-tree, regardless of whether the tree was nonempty.
The new commit would then be recorded in the filter-branch revision map, and subsequent commits which leave the tree untouched would be correctly filtered.

With this change, parentless commits with an empty tree are correctly pruned, and an empty file is recorded in the revision map, signifying that it was rewritten to "no commits." This works naturally with the parent mapping for subsequent commits.
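
A rough way to observe this change, assuming a scratch repository that mirrors the question (a root commit containing only .gitignore, followed by one more commit). The sketch uses `rm -f` in the tree filter rather than the `git rm` from the question, and all file names are hypothetical:

```shell
#!/bin/sh
# Sketch: remove .gitignore from every commit and prune commits left empty.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1     # skip the warning pause in newer Git
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

echo "*.log" > .gitignore
git add .gitignore
git commit -q -m "add .gitignore"          # root commit, only .gitignore

echo "hello" > file.txt
git add file.txt
git commit -q -m "add file.txt"

before=$(git rev-list --count HEAD)
git filter-branch -f --prune-empty --tree-filter 'rm -f .gitignore' HEAD >/dev/null
after=$(git rev-list --count HEAD)
names=$(git ls-tree -r --name-only HEAD)

echo "commits before: $before, after: $after"
echo "files in HEAD: $names"
```

With Git 2.13 or later, the emptied root commit is expected to be dropped, so the second commit becomes the new root. Older versions would keep the emptied root commit, as described in the question.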

git filter-branch now includes in its man page:

Some filters will generate empty commits that leave the tree untouched. This option instructs git-filter-branch to remove such commits if they have exactly one or zero non-pruned parents; merge commits will therefore remain intact. This option cannot be used together with --commit-filter, though the same effect can be achieved by using the provided git_commit_non_empty_tree function in a commit filter.
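
As a sketch of the commit-filter alternative mentioned in that paragraph, the following scratch repository (hypothetical file names and messages) contains one commit that leaves the tree untouched and rewrites history with git_commit_non_empty_tree:

```shell
#!/bin/sh
# Sketch: drop no-op commits through a commit filter instead of --prune-empty.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1     # skip the warning pause in newer Git
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

echo "a" > a.txt
git add a.txt
git commit -q -m "A: add a.txt"
git commit -q --allow-empty -m "B: no changes"   # same tree as A
echo "b" > b.txt
git add b.txt
git commit -q -m "C: add b.txt"

before=$(git rev-list --count HEAD)
# git_commit_non_empty_tree skips a commit whose tree equals its parent's tree.
git filter-branch -f --commit-filter 'git_commit_non_empty_tree "$@"' HEAD >/dev/null
after=$(git rev-list --count HEAD)

echo "commits before: $before, after: $after"
```

Commit B has the same tree as A, so it is skipped and C is re-parented onto A.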

VonC
  • Title changed, added ...at beginning of branch. – Anders Lindén Apr 27 '16 at 09:36
  • @AndersLindén I have edited the answer to address the case "beginning of a branch", which led me to revisit the notion of "empty commit" – VonC Apr 27 '16 at 11:52
  • It seems like the commit that only had the .gitignore file before the git filter-branch is now empty in all senses. But as you pointed out, "this switch only applies for commits that have one and only one parent". Shouldn't the wording be changed to "this switch only applies for commits that have one parent or are the initial commit"? – Anders Lindén Apr 27 '16 at 12:21
  • @AndersLindén since the initial commit has 0 parents, it would not be pruned. That is because of the definition of "empty commit" in Git, which actually is a "non-modified tree compared to the parent commit". – VonC Apr 27 '16 at 12:22
  • Yes, but the rephrasing proposal would include the initial commit. – Anders Lindén Apr 27 '16 at 12:23
  • @AndersLindén no, because an "empty commit" in Git is actually a "non-modified tree compared to the parent commit". Hence, there cannot be an "empty commit" for an initial commit. – VonC Apr 27 '16 at 12:24
  • I think an imaginary commit (before the initial commit) should be added to the git vocabulary. If you check out that commit, it will be like checking out orphan, but you do not have to create a new branch. Then initial commit/s could be seen as empty as well. – Anders Lindén Apr 27 '16 at 13:01
  • Has this behavior changed in newer git versions? With version 2.25, I ran a `filter-branch --index-filter` removing some files, and that empty initial commit is gone. I noticed because the same operation 3 years ago had left it. The manpage now says “remove such commits if they have exactly one or zero non-pruned parents”. – PlasmaBinturong Aug 09 '23 at 17:24
  • @PlasmaBinturong Yes, it has changed with Git 2.13 (Q2 2017). I have edited the answer to document that. – VonC Aug 09 '23 at 18:09