I'm writing a git filter-branch --tree-filter command that uses git log --follow to check if certain files should be kept or deleted during the filtering.
Basically, I want to keep commits that contain a filename, even if this file was renamed and/or moved.
This is the filter I'm running:
git filter-branch --prune-empty --tree-filter '~/preserve.sh' -- --all
This is the command I'm using inside preserve.sh:
git log --pretty=format:'%H' --name-only --follow --all -- "$f"
The result is that when I search for a file by its new path, the commit that originally created it at the old path is stripped out of history, which shouldn't happen. For example:
commit 1: creates foo/hello.txt;
commit 2: moves foo/hello.txt to bar/hello.txt;
using git filter-branch passing bar/hello.txt yields a history with only commit 2.
At first, I thought the problem was happening because I wasn't using --all in git log. That is, when analyzing commit 1 it wouldn't find foo/hello.txt because it was only looking at past history, where bar/hello.txt isn't mentioned anywhere. But then I added --all, which looks at all commits (including the "future" ones), and nothing changed.
I checked out the commit where the file is created, ran that log command, and it worked (it listed both foo/hello.txt and bar/hello.txt), so there's nothing wrong with the command itself. I also logged the output of the log command when it's run by filter-branch, and in that case I can see that in commit 1 the file is not found (only bar/hello.txt is listed).
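To make the manual check reproducible, here is a minimal sketch that builds the two-commit example in a throwaway repository and runs the same log command from a checkout of commit 1. The temporary directory, the demo identity, the fixed commit dates and the variable names are assumptions for illustration only; they are not part of my real repository or of preserve.sh.

```shell
#!/bin/sh
# Hypothetical reproduction of the foo/hello.txt -> bar/hello.txt example.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Throwaway identity so the commits succeed without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# commit 1: creates foo/hello.txt
# (fixed, distinct dates keep the order of the two commits deterministic)
mkdir foo
echo 'hello' > foo/hello.txt
git add foo/hello.txt
GIT_AUTHOR_DATE='2020-01-01 00:00:00 +0000' \
GIT_COMMITTER_DATE='2020-01-01 00:00:00 +0000' \
    git commit -q -m 'commit 1: create foo/hello.txt'

# commit 2: moves foo/hello.txt to bar/hello.txt
mkdir bar
git mv foo/hello.txt bar/hello.txt
GIT_AUTHOR_DATE='2020-01-02 00:00:00 +0000' \
GIT_COMMITTER_DATE='2020-01-02 00:00:00 +0000' \
    git commit -q -m 'commit 2: move hello.txt to bar/'

# Check out commit 1, then run the same log command used in preserve.sh,
# searching for the file by its new path.
git checkout -q --detach HEAD~1
f=bar/hello.txt
paths=$(git log --pretty=format:'%H' --name-only --follow --all -- "$f")
printf '%s\n' "$paths"
```

Run outside of filter-branch like this, the output should include both paths, matching what I saw by hand. The open question is why the same command reports only bar/hello.txt when filter-branch invokes it.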
I think this problem happens because git internally copies each commit into a "new repo" structure, so by the time it's analyzing commit 1 the newer commits don't exist yet.
Is there a way to fix this, or another way to approach the problem of re-writing history while preserving renames/moves?
I'm running a modified version of the script found in this answer.