As matt said, you can't actually change a commit. What git commit --amend
does is make a new commit that doesn't extend the current branch.
I think more people get this when they can see, visually, what this means. But this means you first need to learn to draw commit graphs. Since you are using an IDE, perhaps you have already learned this—but you don't mention which IDE (and I don't normally use any IDE anyway myself). Also, it's very hard to type graphics into a StackOverflow posting. :-) So consider this text version of a drawing of a commit graph:
... <-F <-G <-H
What we have here is a series of commits, all in a nice neat line, with the commit whose hash is H (H here stands in for some big ugly Git hash ID) being last in the chain. Some IDEs draw this vertically, with commit H at the top:
H <commit subject line>
|
G <commit subject line>
|
:
but for StackOverflow purposes I like the horizontal drawings.
Note how each commit "points back" to its immediate parent commit. Internally, this means that each commit stores the full hash ID of its parent, because Git actually finds commits by their hash IDs.
To find commit H specifically (and quickly), Git needs to have its hash ID stored somewhere too. Git can use H to find G's hash ID, and then can use G to find F's hash ID, and so on, but Git needs to start with H's hash ID. The place Git finds H's hash ID is your branch name:
...--F--G--H <-- branch (HEAD)
The branch name lets Git find H easily, because the branch name itself contains the raw commit hash ID. Then commit H lets Git find G easily, which lets Git find F easily, and so on. That's our backwards-looking chain of commits.
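If you'd like to poke at this yourself, here is a minimal sketch, assuming a POSIX shell and an installed Git. The throwaway directory, the demo identity set through environment variables, and the one-letter commit messages are all made up for illustration.

```shell
set -e
# Demo identity (an assumption for this sketch, so that commits work anywhere).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch    # call the initial branch "branch"

for msg in F G H; do
    git commit -q --allow-empty -m "$msg"  # three commits in a row: F <- G <- H
done

tip=$(git rev-parse branch)       # the hash ID stored in the branch name
head=$(git rev-parse HEAD)        # HEAD resolves through the attached branch name
parent=$(git rev-parse 'HEAD^')   # the parent hash ID stored inside commit H

echo "branch -> $tip"
git cat-file -p "$tip"            # raw commit object: tree, parent, author, committer, message
```

The `parent` line in the `git cat-file -p` output is the backwards-pointing arrow from the drawings.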
Note how when you have multiple branches, they tend to rejoin somewhere in the past:
          I--J <-- branch1
         /
...--G--H <-- master
         \
          K--L <-- branch2 (HEAD)
Here, we made branch1 have two commits that aren't on master yet, and made branch2 have two commits that aren't on master yet either. The commits up through H are on all three branches. The name master specifically means commit H, but also, whenever necessary, means all commits up to and including H. The name branch1 means commits up to and including J, or just J, depending on context; and the name branch2 means commits up through H plus K-L, or just L, depending on context.
I've drawn the name HEAD in here attached to branch2 to show that we have branch2, and hence commit L, checked out at the moment. The general idea with Git is this:
- Git will resolve HEAD to a commit, by finding the branch name and following the branch name's arrow to the commit.
- That's the current commit and is the one you have checked out.
- Each commit finds a previous (parent) commit, which defines the chain of commits that we call "the branch".
- But the word branch is ambiguous, because sometimes it means the branch name, sometimes it means the commit at the tip of the branch as pointed to by the name, and sometimes it means the tip commit plus some or all previous commits.
When people use the word branch, sometimes they mean a series of commits up to and including a commit that's not identified by a branch name, too. Besides regular branch names, Git has tag names, and remote-tracking names like origin/master and origin/branch1 and so on. All of these names ultimately just point to one specific commit, just like a branch name, but only branch names have the feature of letting HEAD attach to them.
Consider now how adding a commit normally works
Suppose we have:
...--G--H <-- branch (HEAD)
We have the files from commit H checked out, because H is the current commit. We make some changes to those files, git add the updated files to copy them over the top of the copies of those files in Git's index or staging area, and run git commit. Git now freezes the files that are in its index, adds the right metadata (our name and email address, our log message, the parent hash ID H, and so on), and thereby creates a new commit, which gets a new unique big ugly hash ID, but we'll just call it I:
...--G--H <-- branch (HEAD)
         \
          I
Since I is now the last commit, Git now does the one trick that makes branch names different from any other kind of name: Git stores I's hash ID into the name branch, because that's the name to which HEAD is attached. We get:
...--G--H
         \
          I <-- branch (HEAD)
which we can straighten out now:
...--G--H--I <-- branch (HEAD)
So this is how git commit works:
- It freezes, for all time, all the files that are in Git's index (aka the staging area). The fact that Git makes commits from its ready-to-freeze copies in its staging area, rather than the ordinary files you have in your work-tree, is why you have to git add files all the time. These frozen files become the new snapshot in the new commit.
- It adds the appropriate metadata: your name, your email address, the current date-and-time, and so forth.
- It sets the new commit's parent hash ID to the current commit's hash ID.
- It writes out the actual commit (which gains a new, unique hash ID).
- Last, it writes the new commit's hash ID into the current branch name so that the name continues to point to the last commit in the chain.
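The steps in this list can be walked through by hand with Git's lower-level commands. This is only a sketch of roughly what git commit does, not its actual implementation (the real command also runs hooks, writes reflog messages, and so on). It assumes a POSIX shell and an installed Git; the file name, demo identity, and temporary directory are invented for the example.

```shell
set -e
# Demo identity (an assumption for this sketch).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch
echo one > file.txt
git add file.txt
git commit -q -m "H"                      # ordinary commit H to start from

echo two > file.txt
git add file.txt                          # copy the updated file into the index

old=$(git rev-parse HEAD)                 # the current commit, H
tree=$(git write-tree)                    # freeze the index into a snapshot
new=$(echo "I" | git commit-tree "$tree" -p "$old")  # add metadata, with parent = H
git update-ref refs/heads/branch "$new" "$old"       # store I's hash ID in the branch name

git log --oneline --graph                 # shows I on top of H
```

Because HEAD is attached to branch, moving the branch name is all it takes for I to become the current commit.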
Having written out the new commit, the current commit changes (the hash ID that HEAD means is now commit I instead of commit H), but once again the current commit snapshot matches the files in Git's index, which, if you git add-ed everything, also match the files in your work-tree.
Now we can see how git commit --amend works
When you use git commit --amend, Git goes through all the same steps as for any commit, with one exception: the new commit's parent (or parents, plural, if the current commit is a merge commit) is copied from the current commit, instead of being the current commit itself. That is, rather than doing:
...--G--H--I--J <-- branch (HEAD)
with new commit J pointing back to existing commit I, Git does this:
          I ???
         /
...--G--H--J <-- branch (HEAD)
In effect, the then-current commit I has now been "shoved out of the way" to place the new commit at the end of the chain without making the chain any longer.
The existing commit I still exists. It just no longer has a name.
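A quick way to check this claim is to compare parents before and after an amend. The sketch below assumes a POSIX shell and an installed Git; the repository location, the file name, and the demo identity are made up for illustration.

```shell
set -e
# Demo identity (an assumption for this sketch).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch
echo one > file.txt
git add file.txt
git commit -q -m "H"
echo two > file.txt
git add file.txt
git commit -q -m "I"

old=$(git rev-parse HEAD)             # commit I
old_parent=$(git rev-parse 'HEAD^')   # I's parent, H

git commit -q --amend -m "J"          # "change" I: really, make a new commit J

new=$(git rev-parse HEAD)             # commit J
new_parent=$(git rev-parse 'HEAD^')   # J's parent

echo "parent of I: $old_parent"
echo "parent of J: $new_parent"       # the same hash ID: J sits beside I, not after it
git cat-file -t "$old"                # I is still in the repository, just no longer named
```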
When you involve another Git repository, you exchange commits with them
Git is, at its heart, really all about commits. You make new commits, and then you have your Git call up another Git and send it your commits. Or, you call up that other Git repository—whether or not you yourself have made any new commits—and get any new commits that they have, that you don't.
One of these two commands is git fetch. That's the one that calls up their Git and finds which commits they have that you don't: it fetches their commits into your own Git repository.
The other command is git push: with git push you have your Git call up their Git and send commits. These two are not quite symmetric, though. Let's look at git fetch first, because it's where remote-tracking names like origin/master and origin/branch come from.
We've already seen that Git finds commits by taking a name (maybe a branch name) to find the last commit, and then working backwards. Meanwhile, your Git is calling up some other Git. That other Git has branch names B1, B2, B3, ..., each of which specifies the hash ID of the last commit on one of that Git's branches.
Those are their branches, not your branches. You may or may not have branches with the same name, but those are their names, pointing to their last commits. Your git fetch doesn't touch your branch names.
Suppose, for instance, that we start with:
...--G--H <-- master (HEAD)
but that we got commit H from origin. Then we really have:
...--G--H <-- master (HEAD), origin/master
That is, in their Git repository, their name master also selects commit H. So our Git has recorded their Git's name master as our origin/master; then we made our master from our origin/master, and now both names point to existing commit H.
If we now make our own new commit I, our master now points to commit I. Our origin/master still points to H as before:
...--G--H <-- origin/master
         \
          I <-- master (HEAD)
Meanwhile, suppose that they (whoever they are) make their own new commit. It will get some big ugly unique hash ID; we'll just call it J. Their commit J is in their repository:
...--G--H--J <-- master [in their Git]
We run git fetch, and our Git calls up their Git and finds that they have a new commit that we have never seen before. Our Git gets it from their Git and puts it into our repository. To remember J's hash ID, our Git updates our own origin/master:
          I <-- master (HEAD)
         /
...--G--H--J <-- origin/master
(I put ours up top just for aesthetics: I like the letters to be more in alphabetical order here.)
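To watch git fetch move origin/master while leaving master alone, you can stage the whole thing with local repositories. This is a sketch under assumptions: a POSIX shell, an installed Git, and three made-up directories (seed, origin.git as the shared repository, and the clones ours and theirs); the file names and demo identity are invented too.

```shell
set -e
# Demo identity (an assumption for this sketch).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo base > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "H"
git clone -q --bare seed origin.git     # stands in for the shared "origin"
git clone -q origin.git ours
git clone -q origin.git theirs

echo theirs > theirs/theirs.txt         # they make commit J and publish it
git -C theirs add theirs.txt
git -C theirs commit -q -m "J"
git -C theirs push -q origin master

echo ours > ours/ours.txt               # we make commit I locally
git -C ours add ours.txt
git -C ours commit -q -m "I"

before=$(git -C ours rev-parse master)
git -C ours fetch -q origin             # picks up J, updates origin/master only
after=$(git -C ours rev-parse master)
remote=$(git -C ours rev-parse origin/master)

git -C ours log --oneline --graph --all # I and J both sit on top of H
```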
Now we have a problem, of sorts. Our commit I and their commit J form two branches, depending on just what we mean by the word branch:
          I <-- master (HEAD)
         /
...--G--H
         \
          J <-- origin/master
We'll need to combine these somehow, at some point. We can do that with git merge, or we can use git rebase to copy our existing commit I to a new and improved commit (let's call it I') that extends their J:
          I ??? [abandoned]
         /
...--G--H--J <-- origin/master
            \
             I' <-- master (HEAD)
We abandon our I in favor of our new-and-improved I', which adds on to their existing commits. We can now git push origin master. Or, we use git merge to combine work into a new commit, with a snapshot made by a slightly complicated process involving comparing commit H's snapshot to each of the two snapshots in I and J:
          I
         / \
...--G--H   M <-- master (HEAD)
         \ /
          J <-- origin/master
Once again we can now git push origin master.
Why push is not symmetric with fetch
Let's say we have just this:
          I <-- master (HEAD)
         /
...--G--H
         \
          J <-- origin/master
In other words, we have not yet rebased or merged. Our name master points to commit I; our name origin/master, representing the master over on origin, points to commit J. We can try to run:
git push origin master
which will call up their Git, send them our commit I (they don't have it yet because we have not given it to them before), and then ask them to set their master to point to commit I.
Remember that their master currently points to (shared, copied into both Gits) commit J. If they do what we ask, they will end up with:
          I <-- master
         /
...--G--H
         \
          J ??? [abandoned]
That is, they will lose commit J entirely. Git finds commits by starting from a branch name like master and working backwards. Their master used to find J; and if they take our request, to set their master to point to I instead, they won't be able to find J any more.
This is why they just refuse our polite request, saying "not a fast forward". We fix this problem by using git rebase or git merge, to make I' or some merge commit. Then we either send them I' and ask them to set their master to point to I', which is OK because I' comes after J and therefore keeps commit J in the picture; or, we send them M (and I again, if they dropped it), and ask them to set their master to point to M, which is OK because both I and J come before M, so that they can still find J.
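Here is the refusal and the rebase fix in runnable form. As before this is a sketch, assuming a POSIX shell and an installed Git; origin.git, ours, theirs, the file names, and the demo identity are all invented for the demonstration.

```shell
set -e
# Demo identity (an assumption for this sketch).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo base > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "H"
git clone -q --bare seed origin.git
git clone -q origin.git ours
git clone -q origin.git theirs

echo theirs > theirs/theirs.txt          # their commit J, already on origin
git -C theirs add theirs.txt
git -C theirs commit -q -m "J"
git -C theirs push -q origin master

echo ours > ours/ours.txt                # our commit I
git -C ours add ours.txt
git -C ours commit -q -m "I"
git -C ours fetch -q origin

# The polite request: refused, because accepting it would lose J.
if git -C ours push -q origin master 2>/dev/null; then
    first_push=accepted
else
    first_push=rejected
fi
echo "first push: $first_push"

git -C ours rebase -q origin/master      # copy I to I', which comes after J
git -C ours push -q origin master        # now a fast-forward, so it is accepted
```

Swapping the rebase line for a git merge of origin/master would produce the merge commit M instead, and the final push would be accepted for the same reason.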
Sometimes we really want them to throw out a commit
When we use git commit --amend, we take a chain like this:
...--H--I <-- branch (HEAD)
and turn it into this:
       I ??? [abandoned]
      /
...--H--J <-- branch (HEAD)
which makes commit I appear to go away. It actually sticks around for a while (at least a month or so) in case we want it back, through a mechanism that Git calls reflogs. But it's gone from the everyday view, as there's no name that points directly to it, and no other name that points to some commit that eventually points back to I either.
But what if we sent commit I to some other Git? What if, in particular, we ran:
git push origin branch
so that we now have:
       I <-- origin/branch
      /
...--H--J <-- branch (HEAD)
where our origin/branch represents origin's branch, which now points to our old commit I?
If we just run:
git push origin branch
this tells their Git: Here: have a new commit J. Now please, if it's OK, set your branch to remember commit J. They will say no, for the same reason they said no to our other example: this will lose commit I, in their Git repository.
But that's exactly what we want. We want them to lose commit I off their branch branch. To make that happen, we send the same sort of operation (another git push), but we change our last polite request into a more forceful command.
We have two options:
We can say: Set your name branch to point to commit J! This just tells them to drop all commits that might be dropped this way, even if, by now, that's I and some newer commit K too.
Or, we can say: I think your branch identifies commit <hash-of-I>. If so, change it to identify commit J instead. In any case, let me know what happened.
The first is a simple git push --force. The second is git push --force-with-lease. Our Git will fill in the "I think" commit I hash part from our origin/branch, and of course get commit J's hash ID in the same way as always.
The danger of any git push --force, with or without the -with-lease part, is that we're telling some other Git to throw out some commits. That's what we want, of course, so it's not that dangerous, as long as we know that we're asking for commits to be thrown out. But if we're git push-ing to a GitHub repository, are there other people who use that GitHub repository to git fetch from? Maybe they have picked up our commit I and are using it. They could put commit I back. Or, maybe we are making extra work for them, such that they'll have to rework their commits to use commit J instead of commit I.
We should arrange in advance with other users of this origin Git, so that they know which branches might have commits removed like this.
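The polite request and the forceful one can be compared side by side in a scratch setup. This sketch assumes a POSIX shell and an installed Git; the origin.git and ours directories, the file names, and the demo identity are made up, and nobody else is using this "origin", which is what makes the force safe here.

```shell
set -e
# Demo identity (an assumption for this sketch).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo base > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "H"
git clone -q --bare seed origin.git
git clone -q origin.git ours
cd ours

git checkout -q -b branch
echo work > work.txt
git add work.txt
git commit -q -m "I"
git push -q -u origin branch            # origin's branch now names commit I

git commit -q --amend -m "J"            # locally, J takes I's place
new=$(git rev-parse HEAD)

# Polite request: refused, since it would drop I from their branch.
if git push -q origin branch 2>/dev/null; then plain=accepted; else plain=rejected; fi
echo "plain push: $plain"

# "I think your branch is at I (per our origin/branch); if so, make it J."
git push -q --force-with-lease origin branch
```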
Your own case
In your case, you did a git push that failed, then a git pull. The git pull command means run git fetch, then run a second Git command. That second command is git merge by default.
So, let's say you started with:
...--G--H <-- master, origin/master, branch (HEAD)
then added commit I:
...--G--H <-- master, origin/master
         \
          I <-- branch (HEAD)
You then ran (successfully) git push -u origin branch which resulted in:
          I <-- branch (HEAD), origin/branch
         /
...--G--H <-- master, origin/master
(again I just put I on top this time for aesthetics).
Next, you used git commit --amend, which made a new commit J that doesn't have I as its parent:
          I <-- origin/branch
         /
...--G--H <-- master, origin/master
         \
          J <-- branch (HEAD)
You tried a regular git push, which failed with "not a fast forward": their Git told your Git that this push would lose commits (I in particular).
Then you ran git pull:
- This ran git fetch, which did nothing because you already have commits H and I and there are no changes to make to any of your origin/* names.
- Then it ran git merge to merge I and J into a new merge commit.
I'll stop drawing the names master and origin/master as they get in the way, but this did just what we'd now expect:
          I <-- origin/branch
         / \
...--G--H   M <-- branch (HEAD)
         \ /
          J
and then you ran git push, which sent them commits J and M to add on to their branch. They said OK to that, so your Git updated your origin/branch:
          I
         / \
...--G--H   M <-- branch (HEAD), origin/branch
         \ /
          J
and this is what you now see in your repository.
You can, if you like, force your name branch to point to commit J directly again, then use git push --force-with-lease to ask the other Git to discard both commits M and I.
To force your current (HEAD) branch to point to one specific commit, use git reset. In this case, you might first make sure you have nothing else that git reset --hard will destroy, and use git reset --hard HEAD~1 to move to the first parent of M. See the side note on first parent below.
(To move a branch that you're not on, use git branch -f, which needs two arguments: the branch name, and the commit to move to. Since git reset operates on the branch that you are on, git reset just takes the commit specifier.)
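To rehearse the whole sequence before trying it on a real repository, something like the following should reproduce your situation and then undo it. It is a sketch under assumptions: a POSIX shell, an installed Git, a scratch origin.git that only you use, and invented file names and identity. I pass --no-rebase and --no-edit to git pull so that it merges without stopping to ask anything; your own git pull presumably did the merge with its defaults.

```shell
set -e
# Demo identity (an assumption for this sketch).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo base > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "H"
git clone -q --bare seed origin.git
git clone -q origin.git ours
cd ours

git checkout -q -b branch
echo work > work.txt
git add work.txt
git commit -q -m "I"
git push -q -u origin branch                     # origin/branch names I
git commit -q --amend -m "J"                     # J replaces I locally
j=$(git rev-parse HEAD)

git pull -q --no-rebase --no-edit origin branch  # fetch + merge: makes M from J and I
m=$(git rev-parse HEAD)
git push -q origin branch                        # accepted: M keeps I findable

# The fix: step back to M's first parent (J), then ask origin to drop M and I.
git reset -q --hard 'HEAD~1'
git push -q --force-with-lease origin branch
git log --oneline --graph                        # just J on top of H again
```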
Side note: --first-parent
There's a tricky bit that is not shown well in my horizontal graph drawings. Whenever you make a new merge commit like M, Git makes sure that the first of the multiple parents coming out of M points back to the commit that was the tip of your branch before. In this case, that means the first parent of M is J, not I.
You can have git log, and other Git commands, only look at the first parent of each merge when viewing commits. If you do this, the picture looks like this:
...--G--H--J--M <-- branch (HEAD), origin/branch
In fact, M still points back to I too, as its second parent.
This --first-parent option is mainly useful to look at a branch like master when features are always developed on their own branches:
                 o--o--o <-- feature2
                /       \
...--●---------●---------●--... <-- master
      \       /
       o--o--o <-- feature1
Looking at master with --first-parent drops all those incoming side connections, so that one sees only the solid bullet commits. But the notion itself matters whenever you are dealing with a merge commit: M^1 means the first parent of M and M^2 means the second parent of M. The tilde suffix notation counts backwards through first-parent links only, so that M~1 means step back one first-parent link.
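The parent notation is easy to try out on a small merge. This sketch assumes a POSIX shell and an installed Git; the branch name side, the file names, and the demo identity are invented, and the commit messages reuse the letters from the drawings.

```shell
set -e
# Demo identity (an assumption for this sketch).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "H"

git checkout -q -b side
echo side > side.txt
git add side.txt
git commit -q -m "I"
i=$(git rev-parse HEAD)

git checkout -q master
echo main > main.txt
git add main.txt
git commit -q -m "J"
j=$(git rev-parse HEAD)

git merge -q -m "M" side                  # M: first parent J, second parent I

git rev-parse 'HEAD^1' 'HEAD^2' 'HEAD~1'  # J, then I, then J again
subjects=$(git log --first-parent --format=%s | tr '\n' ' ')
echo "$subjects"                          # first-parent walk skips I
```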