I think this is easier to explain with pictures, although my ASCII-art ability runs out after just a few --amends. The trick is to realize that what git commit --amend does, vs git commit without --amend, is to change the parent hash stored in the new commit.
A normal git commit freezes the index contents into a tree and makes the new commit using the new tree, you as the author and committer, your log message, and the current commit as the parent of the new commit:
...--F--G <-- [you were here]
         \
          H <-- branch (HEAD) [you are here now]
Then we make new commit I, after straightening out the drawing:
...--F--G--H--I <-- branch (HEAD)
and new commit J:
...--F--G--H--I--J <-- branch (HEAD)
and so on.
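To make the "current commit becomes the parent" step concrete, here is a rough sketch that builds a commit by hand with plumbing commands in a throwaway repository. It is only an approximation of what git commit does (no hooks, no editor), and the file name, user name, and email are made up for the demonstration:

```shell
#!/bin/sh
# Sketch: do by hand roughly what an ordinary `git commit` does.
set -e
repo=$(mktemp -d)                 # throwaway repository, assumed for the demo
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "G"
hash_g=$(git rev-parse HEAD)

# Stage a change, then imitate `git commit -m H` step by step.
echo two > file.txt
git add file.txt
tree=$(git write-tree)                                  # freeze the index into a tree
hash_h=$(git commit-tree -p "$hash_g" -m "H" "$tree")   # parent = the current commit
git update-ref HEAD "$hash_h"                           # move the current branch forward

parent_of_h=$(git rev-parse "$hash_h^")
echo "parent of H: $parent_of_h"
```

The only thing --amend changes in this recipe is which hash goes after -p.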
Using git commit --amend, however, we make the new commit H as usual except that the parent of H is F instead of G:
       G [you were here]
      /
...--F--H <-- branch (HEAD)
Then, making I with another --amend, we make it as usual except that the parent of I is F instead of H:
       G [you were here]
      /
...--F--H [you were here too]
      \
       I <-- branch (HEAD)
and so on.
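You can check the parent claim yourself in a scratch repository. This sketch (file name and identity are placeholders) makes F and G, amends G into H, and then asks Git for the parent of H:

```shell
#!/bin/sh
# Sketch: show that --amend gives the new commit the *old* commit's parent.
set -e
repo=$(mktemp -d)                 # throwaway repository, assumed for the demo
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo f > file.txt
git add file.txt
git commit -q -m "F"
hash_f=$(git rev-parse HEAD)

echo g > file.txt
git commit -q -a -m "G"
hash_g=$(git rev-parse HEAD)

# Amend G: the result is a brand-new commit H, not a modified G.
echo h > file.txt
git commit -q -a --amend -m "H"
hash_h=$(git rev-parse HEAD)

parent_of_h=$(git rev-parse "$hash_h^")
echo "F:           $hash_f"
echo "parent of H: $parent_of_h"
```

The two hashes printed at the end should be identical, while G is left dangling off to the side as in the drawing.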
If you imagine running git log right at this point, with commit I pointing back to commit F, you will see commit I, then commit F, then E, and so on. Commits G and H will be invisible to gitk or git log (but will show up in git reflog output, since that does not follow parent chains).
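Here is a small sketch of that visibility difference, again in a scratch repository with made-up names. It uses git rev-list for the parent-chain walk and git log -g (the reflog walk that git reflog is built on) for the reflog side:

```shell
#!/bin/sh
# Sketch: an amended-away commit vanishes from the parent chain but not the reflog.
set -e
repo=$(mktemp -d)                 # throwaway repository, assumed for the demo
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo f > file.txt
git add file.txt
git commit -q -m "F"
echo g > file.txt
git commit -q -a -m "G"
hash_g=$(git rev-parse HEAD)
echo h > file.txt
git commit -q -a --amend -m "H"

# Walk parent links from HEAD: G should not appear.
if git rev-list HEAD | grep -q "$hash_g"; then in_log=yes; else in_log=no; fi
# Walk the HEAD reflog instead: G should appear.
if git log -g --format=%H HEAD | grep -q "$hash_g"; then in_reflog=yes; else in_reflog=no; fi

echo "G reachable from HEAD: $in_log"
echo "G present in reflog:   $in_reflog"
```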
After 100 such operations, I've run out of commit letters and cannot possibly draw the crazy fan of commits all pointing back to F, but you can imagine them; or I can draw just 9 such commits, G through O:
     GH
     | I
     |/ J
...--F==K
     |\ L
     | M
     ON
Each of these various commits has the tree and the message that you want. What is wrong with each of these commits, except for G itself, is that it has the wrong parent: you'd like G to have F as its parent (which it does), but then you would like H to have G as its parent (which it does not).
This means you must copy the wrong commits. Let's start by copying H to a new commit H' that has G as its parent, but otherwise uses the same tree and message and other metadata as H:
      H'
     /
     GH
     | I
     |/ J
...--F==K
     |\ L
     | M
     ON
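A single copy step can be sketched with git commit-tree. This builds the F, G, H situation in a scratch repository (names are placeholders) and then makes H' from H's tree and message with G as the parent. Note that this simple version lets the author and committer fields be rewritten to whoever runs it:

```shell
#!/bin/sh
# Sketch: copy one "wrong" commit H to H' whose parent is G.
set -e
repo=$(mktemp -d)                 # throwaway repository, assumed for the demo
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo f > file.txt
git add file.txt
git commit -q -m "F"
echo g > file.txt
git commit -q -a -m "G"
hash_g=$(git rev-parse HEAD)
echo h > file.txt
git commit -q -a --amend -m "H"
hash_h=$(git rev-parse HEAD)      # parent is F, which is what we want to repair

# Reuse H's tree, and its message (everything after the first blank line).
tree=$(git rev-parse "$hash_h^{tree}")
hash_h2=$(git cat-file -p "$hash_h" | sed '1,/^$/d' |
    git commit-tree -p "$hash_g" "$tree")

echo "H' is $hash_h2, parent $(git rev-parse "$hash_h2^")"
```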
Now we need to copy I to a new commit I'. The parent of I' is not G but rather H':
      H'-I'
     /
     GH
     | I
     |/ J
...--F==K
     |\ L
     | M
     ON
We repeat for J to J', using I' as the parent for J', and so on until we have copied every "wrong" commit to a "right" one. Then we can set a branch name to point to the last such copied commit:
      H'-I'-J'-K'-L'-M'-N'-O' <-- repaired
     /
     GH
     | I
     |/ J
...--F==K
     |\ L
     | M
     ON
Running git log while on repaired, or gitk --all, will now show commit O' leading back to N' leading back to M' and so on. Remember that git log (and gitk) follow the parent linkages backwards, without looking at the reflog at all.
If you're willing to let some of the metadata (author and committer name, email, and timestamp) be clobbered, it's easy to make each of these commits with a shell script loop using git commit-tree. If you want to preserve that metadata, it's harder: you need to set a series of Git environment variables before calling git commit-tree each time, setting:
GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, GIT_AUTHOR_DATE: for the author
GIT_COMMITTER_NAME, GIT_COMMITTER_EMAIL, GIT_COMMITTER_DATE: similar but for the committer
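One way to fill in those six variables is to read them back out of the commit being copied with git log format placeholders. This is only a sketch: the scratch repository, the "Original Author" identity, and the fixed timestamp are invented for the demonstration, and the "wrong" commit H is built directly with git commit-tree as a stand-in for an amended commit:

```shell
#!/bin/sh
# Sketch: copy a commit while preserving author and committer metadata.
set -e
repo=$(mktemp -d)                 # throwaway repository, assumed for the demo
cd "$repo"
git init -q
git config user.name "Repair Script"
git config user.email "repair@example.com"

echo f > file.txt
git add file.txt
git commit -q -m "F"
hash_f=$(git rev-parse HEAD)
echo g > file.txt
git commit -q -a -m "G"
hash_g=$(git rev-parse HEAD)

# Stand-in for a "wrong" commit: parent F, distinctive author data.
echo h > file.txt
git add file.txt
tree=$(git write-tree)
hash_h=$(GIT_AUTHOR_NAME="Original Author" GIT_AUTHOR_EMAIL="author@example.com" \
    GIT_AUTHOR_DATE="1500000000 +0000" git commit-tree -p "$hash_f" -m "H" "$tree")

# The command substitution runs in a subshell, so the exports do not leak out.
copy=$(
    GIT_AUTHOR_NAME=$(git log -1 --format=%an "$hash_h")
    GIT_AUTHOR_EMAIL=$(git log -1 --format=%ae "$hash_h")
    GIT_AUTHOR_DATE=$(git log -1 --format=%ad --date=raw "$hash_h")
    GIT_COMMITTER_NAME=$(git log -1 --format=%cn "$hash_h")
    GIT_COMMITTER_EMAIL=$(git log -1 --format=%ce "$hash_h")
    GIT_COMMITTER_DATE=$(git log -1 --format=%cd --date=raw "$hash_h")
    export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE
    export GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL GIT_COMMITTER_DATE
    git cat-file -p "$hash_h" | sed '1,/^$/d' |
        git commit-tree -p "$hash_g" "$tree"
)

git log -1 --format='%an <%ae> %ad' --date=raw "$copy"
```

The --date=raw form prints the timestamp and time zone offset in Git's internal format, which the date variables accept as is.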
The log message can be copied directly from the incorrect commits, using sed or similar to chop off everything up to and including the first blank line (note that this discards any i18n encoding data, and sed may behave badly with unterminated final text lines in commit messages, but these may be tolerable; if not, extract the relevant code from git filter-branch).
Use git reflog to obtain the hash IDs of all commits to be copied, in the correct order (oldest-to-copy first, last-to-copy = newest last). Place these in a file, one entry per line. Then the following untested shell script will probably suffice:
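Building that file might look something like the sketch below. It assumes the branch's reflog contains nothing but the commits and amends in question, and that you already know the hash of G; in a real repository you should inspect the list by hand before trusting it. The file names all-entries.txt and commits-to-copy.txt are arbitrary:

```shell
#!/bin/sh
# Sketch: turn a branch reflog into an oldest-first list of commits to copy.
set -e
repo=$(mktemp -d)                 # throwaway repository, assumed for the demo
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo f > file.txt
git add file.txt
git commit -q -m "F"
echo g > file.txt
git commit -q -a -m "G"
hash_g=$(git rev-parse HEAD)
echo h > file.txt
git commit -q -a --amend -m "H"
hash_h=$(git rev-parse HEAD)
echo i > file.txt
git commit -q -a --amend -m "I"
hash_i=$(git rev-parse HEAD)

branch=$(git symbolic-ref --short HEAD)

# The reflog lists newest first; reverse it with awk to get oldest first.
git log -g --format=%H "$branch" |
    awk '{ line[NR] = $0 } END { for (i = NR; i >= 1; i--) print line[i] }' \
    > all-entries.txt

# Drop everything up to and including G, leaving only the commits to copy.
sed "1,/^$hash_g\$/d" all-entries.txt > commits-to-copy.txt

cat commits-to-copy.txt
```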
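parent=$hash   # hash ID of commit G, onto which the new chain will be built
while read -r tocopy; do
    # reuse the tree (snapshot) of the commit being copied
    tree=$(git rev-parse "$tocopy^{tree}")
    # feed the old log message (everything after the first blank line) to
    # commit-tree, which takes the tree as its positional argument
    parent=$(git cat-file -p "$tocopy" | sed '1,/^$/d' |
        git commit-tree -p "$parent" "$tree")
done < "$file_of_commits"
git branch repaired "$parent"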
This creates a new branch name repaired to hold the newly built chain.
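Before running anything like this on a real repository, it may be worth trying the whole procedure on a scratch one. The following sketch (with an invented file name, identity, and commits.txt list) builds F and G, amends three times to get H, I, and J, runs the copy loop, and prints the subjects along the repaired chain:

```shell
#!/bin/sh
# Sketch: end-to-end trial of the copy loop in a throwaway repository.
set -e
repo=$(mktemp -d)                 # throwaway repository, assumed for the demo
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo f > file.txt
git add file.txt
git commit -q -m "F"
echo g > file.txt
git commit -q -a -m "G"
hash_g=$(git rev-parse HEAD)

# Three amends in a row: H, I, and J all end up with F as their parent.
: > commits.txt
for name in H I J; do
    echo "$name" > file.txt
    git commit -q -a --amend -m "$name"
    git rev-parse HEAD >> commits.txt     # oldest first, newest last
done

# Rebuild the chain on top of G, one copied commit at a time.
parent=$hash_g
while read -r tocopy; do
    tree=$(git rev-parse "$tocopy^{tree}")
    parent=$(git cat-file -p "$tocopy" | sed '1,/^$/d' |
        git commit-tree -p "$parent" "$tree")
done < commits.txt
git branch repaired "$parent"

git log --format=%s repaired
```

If everything worked, the log lists J, I, H, G, F from newest to oldest, and the tip of repaired has the same tree as the current HEAD.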