After playing around with rm
etc etc, my git reflog
no longer shows the past commits.
However, git log --reflog
is still able to show them.
How does git log --reflog
show dangling commits if it doesn't rely on git reflog
?
Reflogs do not hold commit ancestry: reflogs hold name-update history. It's the commit graph that holds commit ancestry. git log
would like some set of starting points, but after getting them, it looks at the commit graph. git log --reflog
normally alters the set of starting points, but if you've removed all the reflogs (or never had any, e.g., in a bare clone), it doesn't: you get the standard single starting point, HEAD
.
Note that reflog entries eventually expire, after 90 days by default,* but the graph never expires. A new clone has no reflog history, but has all the graph linkage—and git log --reflog
still shows multiple commits.
*The default expiry is 30 days for some entries, 90 for most, and for the special reflog for refs/stash
, never. The details are beyond the scope of this answer.
Experimentally:
rm .git/logs/HEAD
removes the HEAD reflog, but git reflog
still shows some data. On the other hand:
rm -r .git/logs
removes all reflogs after which git reflog
shows nothing.
At this point, one might expect git log --reflog
not to find anything. However, apparently this uses the same "add HEAD
itself as a default" behavior. So now that there are no reflogs, git log --reflog
does the equivalent of:
git log HEAD
which shows the current commit and its ancestors. Using:
git log --no-walk --reflog
you'll see just the one commit identified by HEAD
.
This means that the answer to:
How does
log --reflog
work if it doesn't rely on reflog?
is that, with the reflogs gone, it no longer does anything that plain git log
does not do:
When you supply particular starting commits to git log
, Git shows those commits, and commits reachable from those commits, without using HEAD
as a starting point. (Adding --no-walk
makes this clearer.)
When you don't supply any particular starting commit, git log
uses HEAD
as its starting point. (Again, adding --no-walk
makes this clearer.)
(When you do have some reflogs—which is the normal case—the --reflog
argument supplies the reflog values as starting points, which disables the "use HEAD
as starting point" action. If everything now makes sense, you can stop here!)
It's important, when using Git, to know what a commit does for you, vs what a branch name like master
, or a reflog entry like master@{3}
, does for you.
Each Git commit holds a full snapshot of all of your files, but that's not all that it holds. Each commit also holds some metadata. Much of this metadata—information about the commit—is pretty obvious as it shows up in git log
output. This includes the name of whoever made the commit, an email address, and a date-and-time-stamp, along with the log message they provided.
Each commit itself has a unique hash ID, too. This hash ID is, in essence, the "true name" of the commit. It's how Git looks up the actual commit object, in its big database of all commits and other supporting Git objects.
A branch name like master
simply holds the hash ID of one particular commit. This one commit is, by definition, the last commit in the branch. But a commit, such as the last one in the master
branch, also can hold a commit hash ID. Each commit, in its metadata, has a list of hash IDs. These are the parents of the commit.
Most commits have just one parent hash ID. This forms these commits into simple backwards-looking chains. We can draw such a chain like this:
... <-F <-G <-H <-- master
if we use uppercase letters to stand in for commit hash IDs. Here H
is the hash ID of the last commit on master
. Commit H
itself contains, in its metadata, the actual hash ID of earlier commit G
. So given commit H
, Git can use this hash ID to look up commit G
. That in turn supplies the hash ID of commit F
.
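As a rough illustration of this backwards chain, here is a toy Python model. The single-letter IDs, the `parents` and `branches` dictionaries, and the `walk_back` helper are all stand-ins invented for this sketch; none of them is something Git itself exposes.

```python
# Toy commit graph: each "hash ID" maps to the list of its parent hash IDs.
# The letters stand in for real hash IDs, as in the drawing above.
parents = {
    "F": [],      # assumption: pretend F is the very first commit
    "G": ["F"],
    "H": ["G"],
}

# A branch name just holds one hash ID: the last commit on the branch.
branches = {"master": "H"}


def walk_back(start):
    """Follow first-parent links from start back to the root."""
    chain = []
    current = start
    while current is not None:
        chain.append(current)
        ps = parents[current]
        current = ps[0] if ps else None
    return chain


print(walk_back(branches["master"]))  # ['H', 'G', 'F']
```

The name `master` contributes only the starting point `H`; everything else comes from the links stored in the commits.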
Git can, in effect, walk this chain backwards. That's what git log
does, normally. Using --no-walk
tells git log
: show me commits, but do not walk backwards through their chains; show me only the commits I specifically select via the command line. So with --no-walk
, you will see only the commits you selected, and not their ancestry.
Reflogs, like branch names, hold hash IDs. The reflogs are organized into one log per name (branch name, tag name, and so on) plus one log for the special name HEAD
. These are, at least currently, stored in plain files in the .git/logs
directory. Each log has entries (one line of the file per entry, in this case), and each entry corresponds to the hash ID that the name resolved to at some earlier time. You can use these to access the earlier values, so master@{1}
tells Git to use the one-step-earlier value: before the most recent update to the name master
, it resolved to some hash ID; now it resolves to some (probably different) hash ID; we want the one from one step back. The name master@{2}
tells Git that we want the value from two steps back.
Note that these are name-update steps, not commit-backwards-arrow steps. Sometimes master@{1}
is the same as master~1
, master@{2}
is the same as master~2
, and so on—but sometimes these are different. The suffix syntax master~2
or master^2
operates on the commit graph. The suffix syntax master@{number}
operates on the reflog for master.
(The current value of the branch name, master@{0}
, isn't in the master
reflog because it is in master
itself. Updating master
will take the current value and add it to the log, and then set the new value.)
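To make the difference between name-update steps and graph steps concrete, here is a small sketch in Python. It is only a toy model under stated assumptions: the letters are made-up hash IDs, `master_reflog` is a plain list that (for convenience) keeps the current value at index 0, and `tilde` and `at` are hypothetical helpers that mimic `~n` and `@{n}`.

```python
# Toy model: the commit graph and the reflog for master are separate things.
# Commit X is a child of G that master used to point to.
parents = {"F": [], "G": ["F"], "H": ["G"], "X": ["G"]}

# Name-update history for master, newest first. Index n plays the role of
# master@{n}. The assumed history: master pointed to X, was then reset to G,
# and then H was committed on top of G.
master_reflog = ["H", "G", "X"]


def tilde(commit, n):
    """Model of commit~n: follow the first parent n times."""
    for _ in range(n):
        commit = parents[commit][0]
    return commit


def at(reflog, n):
    """Model of name@{n}: the value the name had n updates ago."""
    return reflog[n]


print(tilde("H", 1), at(master_reflog, 1))  # G G  (these agree)
print(tilde("H", 2), at(master_reflog, 2))  # F X  (these differ)
```

In this made-up history, commit `X` is reachable only through the reflog: no walk along parent links from `H` ever gets to it.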
You can have Git spill out the contents of some or all reflogs using git reflog
. If there are no reflogs at all—which will be the case if you remove them all—nothing will come out here as there are no longer any saved values. However, all the names still have their values, and HEAD
still exists and contains a branch name such as master
.
Because of the way git log
works, it can only really show one commit at a time. To handle this, it uses a priority queue. You can, for instance, run:
git log <hash1> <hash2> <hash3>
using three actual hashes, or:
git log master develop feature/tall
which uses names to find hash IDs, or:
git log master master@{1} master@{2}
which uses two reflog entries (plus the branch name) to find hash IDs.
In all cases, Git inserts all the hash IDs into a priority queue.
Using --reflog
as a command-line argument tells git log
to take all the values from the reflogs and insert those into the queue.
If nothing goes into the queue, Git inserts the result of resolving HEAD
to a hash ID instead.
At this point, the queue is presumably not empty, because if nothing else, we got a hash ID by resolving the name HEAD
.[1]
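The seeding rules just described can be sketched as a small Python function. This is a simplified model, not Git's actual code: `starting_points` and its parameters are names invented here, and the commit IDs are placeholder letters.

```python
def starting_points(explicit, reflog_values, use_reflog, head):
    """Return the commits used to seed the queue (toy model).

    explicit      -- hash IDs named on the command line
    reflog_values -- every hash ID found in the reflogs (may be empty)
    use_reflog    -- True when --reflog was given
    head          -- the hash ID that HEAD resolves to
    """
    queue = list(explicit)
    if use_reflog:
        queue.extend(reflog_values)
    if not queue:
        # Nothing was supplied, so fall back to the default: HEAD.
        queue.append(head)
    return queue


# Normal case: the reflogs supply the starting points, so HEAD is not added.
print(starting_points([], ["X", "G"], True, "H"))  # ['X', 'G']

# All reflogs removed: --reflog adds nothing, so the HEAD default kicks in.
print(starting_points([], [], True, "H"))          # ['H']
```

The second call is the situation from the question after `rm -r .git/logs`: the result is the same as for a plain `git log`.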
The git log
command now enters a loop, which runs until the queue is empty. This loop works as follows:
First, take the highest-priority commit out of the queue.
Next, use the selection options given to git log to decide whether to display this commit. If so, display the commit. (For instance, git log --grep selects for display commits whose log message contains the given string or pattern.)
Finally, if --no-walk is in effect, we're done with this commit. Otherwise, choose some or all of the parents of this commit to put into the queue, based on the --first-parent flag and any History Simplification selected.
(Note that if a commit is now, or ever has been, in the queue, git log won't put it back into the queue, so you will not see the same commit twice. The priority within the queue is affected by git log's sorting options.)
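The loop can be sketched in Python with `heapq`. This is a minimal model rather than Git's implementation: the `commits` table, the integer timestamps, and the `log` function with its `no_walk` and `show` parameters are all assumptions made for illustration, and priority is simply "newest timestamp first". It ignores `--first-parent` and History Simplification.

```python
import heapq

# Toy commits: hash ID -> (commit timestamp, list of parent hash IDs).
commits = {
    "F": (1, []),
    "G": (2, ["F"]),
    "H": (3, ["G"]),
    "X": (4, ["G"]),  # a commit that no branch name points to any more
}


def log(starts, no_walk=False, show=lambda c: True):
    """Return the commits a simplified 'git log' would display, in order."""
    seen = set()  # everything that is, or ever was, in the queue
    heap = []

    def push(c):
        if c not in seen:
            seen.add(c)
            # Negate the timestamp so the newest commit pops first.
            heapq.heappush(heap, (-commits[c][0], c))

    for c in starts:
        push(c)

    shown = []
    while heap:
        _, c = heapq.heappop(heap)  # take the highest-priority commit
        if show(c):                 # selection options decide on display
            shown.append(c)
        if not no_walk:             # otherwise queue up the parents
            for p in commits[c][1]:
                push(p)
    return shown


print(log(["H"]))                     # ['H', 'G', 'F']
print(log(["H", "X"]))                # ['X', 'H', 'G', 'F']
print(log(["H", "X"], no_walk=True))  # ['X', 'H']
```

Starting from `H` alone never reaches `X`. Adding `X` as an extra starting point, which is the kind of thing reflog values do, makes it show up, and the walk from there on uses only the parent links.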
So, with --reflog
, we give git log
multiple starting points from the reflog entries, if there are reflogs. If there aren't any reflogs, git log
uses its standard default: start with HEAD
.
Regardless of whether we used --reflog
or not, git log
now walks commits using the parent linkage in the commits themselves. This does not depend on the arguments we supplied, except of course for --no-walk
.[2]
[1] If there are no commits at all, or we're on an "unborn branch" created by git checkout --orphan
, the queue would be empty at this point, but git log
will have errored out while trying to resolve the name HEAD
.
[2] Also, with the -g
or --walk-reflogs
argument, git log
will not walk the commit graph. Instead, it walks the reflog entries.
The difference between --walk-reflogs
and --reflog
is that with --walk-reflogs
, the whole priority queue thing is tossed out entirely: Git looks only at the reflogs. This also changes some output formats. In fact, git reflog
really just runs git log -g
.