Git - how to remove local history entry?

Question

I was experimenting with Git, staging and committing locally. For some reason, even though I've already deleted the test branch, the history entry with the message "INIT" still shows. When I click it, I can see the file I modified, along with the commit details, in Visual Studio 2019.

I tried:

git rebase -i SHA_value OR git rebase -i HEAD~1 and many other commands, but this "INIT" history is still showing.

What other command should I use?

If you are working on a different branch that was created from the branch you deleted, those revisions are still part of the history of your current branch. — eftshift0, Jul 08 '21 at 00:55
If I create a branch initially, then commit with message "INIT", then edit a few more files on the same branch and commit with message "new init", I'll have 2 entries in the local history: INIT and NEW INIT. If I issue git reset HEAD~ twice, both of those history entries will be gone. — Laila Villa, Jul 08 '21 at 02:05
I issued git reset --hard to restore the INIT code in my working directory. The NEW INIT disappeared, but the INIT remained, and I can't remove it. — Laila Villa, Jul 08 '21 at 02:08
What do you want to do? Start from a new branch with no previous history? Try `git checkout --orphan some-branch` — eftshift0, Jul 08 '21 at 04:24

score 0 · Answer 1 · answered Jul 08 '21 at 17:40

It's not entirely clear what your question is, but I think, at the bottom of your puzzlement, you're missing a critical element in how Git works. So this answer is a description of just that: how Git works, in terms of the commit graph.

Git is, in some ways, just a graph-manipulation program. When I say graph here, I mean the computer-science-y graph: a mathematical device defined as G = (V, E). Here G is the graph, V is a set of vertices or nodes—which in Git, exist in the form of commits—and E is the set of edges that connect the vertices together. The Git graph is more specifically a directed graph, and in fact is a Directed Acyclic Graph or DAG. That means the connections only allow you to move one way, like driving on one-way streets in a city.

Now, a graph is only "interesting", in some sense, once it actually has some nodes or vertices. That is, this empty graph:

is pretty boring. Here's a non-empty graph:

  B--C
 /    \
A------D

I think you'll probably agree that this one is more interesting than the empty one. :-)

What does this have to do with Git?

In Git, the nodes in the graph (the A, B, C, etc., that are in the less-boring one above) are the commits. The dashed lines connecting them are the edges. The graph itself, in a sense, does not even exist until there's at least one commit. The very first commit we ever make, in a Git repository, just stands alone, like this:

A

There are no other commits—which means no other nodes—so there is nowhere for A to connect to.¹ Only once we add a second commit B do we get a connection:

A--B

This still doesn't explain anything, because we haven't yet added the notion of branch names to this.

¹In a non-DAG, A could connect to itself, but the Git commit graph is constrained to be a DAG.

Git branch names

Branch names, in Git, live outside this graph, but have a particular function: Each branch name selects exactly one commit. We say that the branch name points to the commit, and I tend to draw these like this:

     C--D   <-- branch1
    /
A--B   <-- main
    \
     E--F--G   <-- branch2

Here we have a Git repository with exactly seven commits in it. These seven commits connect up, and exist on three branches. The branch names, branch1, branch2, and main, select commits D, G, and B respectively.

These connections are, as I mentioned earlier, one-way only: they go backwards, from newer commits to older ones. Commit G links backwards to earlier commit F, which links backwards to still-earlier commit E, which links backwards to B, which links backwards to A.

What this means in practical terms is that commits A-B-E-F-G are the commits that are "on" branch2. Commits A-B are also on main, and commits A-B-C-D are the commits that are on branch1. Commits, in Git, are often on more than one branch.

Again, note that the names exist outside of the graph. The graph itself is independent of the names. This means we can add more names at any time: for instance, we can add a name branch3 that also points to B, like this:

     C--D   <-- branch1
    /
A--B   <-- main, branch3
    \
     E--F--G   <-- branch2

Commits A and B, which used to be on three branches, are now suddenly on four branches! Or, we can remove some or all names at any time as well, like this:

     C--D   <-- branch1
    /
A--B
    \
     E--F--G   <-- branch2

Commits A and B are no longer on main as main no longer exists at all.
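To see that names really do live outside the graph, here is a minimal sketch using a throwaway repository. The temporary directory, the identity values, and the empty commits A, B, C are assumptions made up for illustration; only the branch names follow the drawings above.

```shell
# Throwaway repository; path, identity, and commit messages are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the unborn branch "main" whatever the local default is
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
git branch branch1                      # a second name for commit B
git checkout -q branch1
git commit -q --allow-empty -m C        # A--B--C, with main still at B

before=$(git rev-list --all --count)    # commits reachable from any name
git branch branch3 main                 # add a name pointing at B
git branch -d main                      # remove a name (allowed: we are on branch1)
after=$(git rev-list --all --count)

echo "commits before: $before, after: $after"
git branch --contains branch1~1         # the names whose history includes B
```

The commit count is the same before and after: adding branch3 and deleting main changed which names can find commit B, not the commits themselves.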

How Git finds commits: how commits and branch names interact

Each of these pieces, by itself, gives you a little bit of insight into how Git works, but it's missing a couple of important items that make it all crystallize. I think the biggest two are these:

Branch names move. Not only are branch names things that move about, but when you make a new commit, Git will automatically move one branch name. Meanwhile, the git reset command lets you move one yourself.
Branch names are how Git finds commits.

Let's take these in the opposite order.

Git uses branch names to find commits

The reason Git does this is that each commit has a random-looking name. The names of real commits aren't simple uppercase letters A, B, C and so on. Their actual names are big ugly hash IDs, like 670b81a890388c60b7032a4f5b879f2ece8c4558. These are unpredictable and impossible for humans to remember. Branch names, on the other hand, are easy for humans to remember: you make up your own branch names, and they have some meaning for you.

For Git to be able to move backwards, from newer commits to older ones, each commit stores, along with the files you need, the raw hash ID of the next-earlier commit. These are the one-way connecting lines between the commits: the edges, or E, in our G = (V, E). This means that Git just needs a quick way to find the last commit in the branch. From there, Git can hop backwards, one commit at a time: commit G points backwards to commit F, which points back to commit E, which points back to commit B, which points back to commit A.

(Commit A, being the very first one, doesn't point back anywhere at all. This lets Git stop going backwards.)
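The backwards-pointing links can be inspected directly. This is a small sketch, assuming a throwaway repository with three empty commits named A, B, and C (the path, identity, and messages are made up for illustration):

```shell
# Throwaway repository; path, identity, and commit messages are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
git commit -q --allow-empty -m C

git cat-file -p HEAD                    # raw contents of commit C, including a "parent" line

# The hash stored in C's "parent" line is the hash ID of commit B.
parent_in_c=$(git cat-file -p HEAD | sed -n 's/^parent //p')
hash_of_b=$(git rev-parse HEAD~1)

# The root commit A has no "parent" line at all, so the walk stops there.
parent_in_a=$(git cat-file -p HEAD~2 | sed -n 's/^parent //p')

git rev-list HEAD                       # the full backwards walk: C, then B, then A
walked=$(git rev-list --count HEAD)
```

Here `git rev-list HEAD` does the hopping described above: it starts at the commit the name selects and follows parent hashes until it reaches a commit that has none.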

Branch names move

When you have a repository with just one branch and one commit, the one branch name necessarily points to the one commit. There is only one commit—only commit A exists—so all branch names must point to that one commit. (At this point, there's not really very much reason to create a lot of branch names, is there?) So we have:

A   <-- main

Now, Git normally requires that you be on some branch.² You pick the branch to be on, using git checkout or git switch. With just one branch, the branch you're going to be on is that one branch. Still, let's draw in the special name HEAD that Git uses to remember which branch name you're using:

A   <-- main (HEAD)

Git attaches the special name HEAD to one of your branch names. Since we only have the one branch name, that's the one we'll attach it to.

Now let's make a new commit, in the usual way that you already know how to do. This new commit will, internally, get some random-looking hash ID, but we'll call it B and connect it back to existing commit A:

A   <-- main (HEAD)
 \
  B

For one brief moment, just after Git creates new commit B, it really is just kind of floating in space like this, with no name to find it. But before git commit finishes—after it creates B, but before it lets you do your next Git command—Git writes B's real hash ID, whatever that is, into the name main, where HEAD is attached. So now we have:

A
 \
  B   <-- main (HEAD)

and there's no reason to bother putting B on a separate line, so we will just draw this as:

A--B   <-- main (HEAD)

Now, before we go on, let's create a new branch name, branch1, by running:

git branch branch1

This gives us:

A--B   <-- main (HEAD), branch1

Now let's use git checkout or git switch to switch to branch1:

A--B   <-- main, branch1 (HEAD)

Nothing else has changed. We're still using commit B here. But now we're on branch branch1, as git status will say.

Now that we have done this, let's make another new commit, which we'll call C. Git will create new commit C in the usual way, giving it some random-looking hash ID and making it point back to existing commit B:

     C
    /
A--B

This time, though, git commit won't write C's hash ID into main. The special name HEAD is now attached instead to the name branch1. So we get:

     C   <-- branch1 (HEAD)
    /
A--B   <-- main

If we make another new commit D now, we get:

     C--D   <-- branch1 (HEAD)
    /
A--B   <-- main

This looks a lot like our earlier example, with our three branches and seven commits.

Let's run git checkout main or git switch main now, so that main becomes our current branch. When we do this, Git has to change which commit we have checked out too. This is different from last time! Git will remove, from our work area (where we work on files), the versions of the files that go with commit D. It will put in the versions of the files that go with commit B instead. And then we'll be in this state:

     C--D   <-- branch1
    /
A--B   <-- main (HEAD)

We're back to commit B, with the files from that commit. The graph now has four commits and two branch names.
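The walk-through above can be replayed as a short script. It is only a sketch: the temporary directory, the identity values, and the use of empty commits are assumptions for illustration, and `git switch` could stand in for `git checkout` on Git 2.23 or later.

```shell
# Throwaway repository; path, identity, and commit messages are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git commit -q --allow-empty -m A
git commit -q --allow-empty -m B        # A--B   <-- main (HEAD)

git branch branch1                      # new name, same commit
git checkout -q branch1                 # HEAD now attached to branch1
attached_to=$(git symbolic-ref --short HEAD)

main_before=$(git rev-parse main)
git commit -q --allow-empty -m C
git commit -q --allow-empty -m D        # only branch1 moves
main_after=$(git rev-parse main)
ahead=$(git rev-list --count main..branch1)

git checkout -q main                    # re-attach HEAD to main, back at commit B
now_on=$(git symbolic-ref --short HEAD)

git log --graph --oneline --decorate --all
```

The name main holds the same hash before and after the two commits, because HEAD was attached to branch1 while they were made.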

Suppose we now create a new branch name, branch2, pointing to commit B, and switch to it. Git will switch from commit B to ... commit B, again: no real change here! But now we'll have:

     C--D   <-- branch1
    /
A--B   <-- main, branch2 (HEAD)

If we now make three new commits E-F-G, we'll need to draw them on a third line:

     C--D   <-- branch1
    /
A--B   <-- main
    \
     E--F--G   <-- branch2 (HEAD)

²You can be on no branch at all, in what Git calls detached HEAD mode, but this is not a normal way to do new work. Git mostly uses it internally for doing git rebase, though it's also useful in other special situations. For instance, you can use it to look at any old, historical commit, any time you need to do that, for any reason. When you're done with that, you will use git checkout or git switch to get back on a branch.
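For the curious, detached HEAD mode from footnote 2 looks like this in practice. This is a minimal sketch in a throwaway repository; the path, identity, and the two empty commits are assumptions for illustration.

```shell
# Throwaway repository; path, identity, and commit messages are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git commit -q --allow-empty -m A
git commit -q --allow-empty -m B

git checkout -q --detach HEAD~1         # look at historical commit A, on no branch at all

# symbolic-ref fails (quietly, with -q) when HEAD is not attached to a branch name.
if git symbolic-ref -q HEAD >/dev/null; then mode=attached; else mode=detached; fi
seen=$(git log -1 --format=%s)
echo "HEAD is $mode, looking at commit: $seen"

git checkout -q main                    # get back on a branch
back_on=$(git symbolic-ref --short HEAD)
```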

With all this in mind, we can finally explain `git reset`

More precisely, we can explain git reset --hard. The git reset command is big and full of tricky things and we'll deliberately avoid tackling all of them at once. Just remember that git reset in general is a danger-zone command: it can and will destroy work. If you've committed the work, there are usually ways to get it back, even after git reset (although some of them are somewhat painful). If you haven't committed, though, and you use git reset—especially git reset --hard—you may not be able to get the work back at all. Be careful with this command.

What git reset does—one of the things it does—is move the current branch name. The commits themselves do not change! Let's say we have this as our entire repository: four commits, and two branch names. One of the branch names, main, selects commit A. The remaining name, develop, selects commit D. We're "on" develop, using commit D.

A   <-- main
 \
  B--C--D   <-- develop (HEAD)

If we now run:

git reset --hard <hash-of-C>

this is what happens:

A   <-- main
 \
  B--C   <-- develop (HEAD)
      \
       D

Commit D still exists. It's out there, floating around in space. But there's no name by which Git can find it. The name develop now points to commit C, not commit D.

If you memorized the hash ID of commit D (or wrote it down, or still have it on your screen, or something), you can run git reset --hard hash-of-D and get commit D back easily. If not, and you want commit D back, you need to resort to special "help, find some commit I accidentally threw out" tricks (these do exist; the commits are usually find-able for at least 30 days, depending on many factors). But if this is a practice repository, or commit D really was useless, you can just let Git eventually expire commit D: without any name by which to find it, Git will really remove it, later, someday, after which it will be gone forever.³
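Here is a sketch of that reset and recovery in a throwaway repository. The path, identity, and empty commits are made up for illustration. The answer above does not name a specific recovery trick; the reflog lookup used here (`develop@{1}`, meaning "where develop pointed one move ago") is one option, and it assumes reflogs are enabled, which is the default for an ordinary non-bare repository.

```shell
# Throwaway repository; path, identity, and commit messages are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git commit -q --allow-empty -m A        # A   <-- main
git checkout -q -b develop
git commit -q --allow-empty -m B
git commit -q --allow-empty -m C
git commit -q --allow-empty -m D        # B--C--D   <-- develop (HEAD)

hash_of_c=$(git rev-parse HEAD~1)
hash_of_d=$(git rev-parse HEAD)

git reset -q --hard "$hash_of_c"        # move the name develop back to C
tip_after_reset=$(git rev-parse develop)
type_of_d=$(git cat-file -t "$hash_of_d")   # D is still in the repository
from_reflog=$(git rev-parse 'develop@{1}')  # previous value of develop, per the reflog

git log --format=%s                     # C, B, A: D is no longer found from develop

git reset -q --hard "$hash_of_d"        # put the name back, using the saved hash
tip_restored=$(git rev-parse develop)
```

Nothing was deleted at any point: the two resets only changed which commit the name develop selects.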

But there's a hitch. No matter where you move the name develop, it must point to some existing commit. Suppose commit A has the commit-message TEST or init 1 or whatever. When you run git log and look at the commits that are on branch develop, the list will include this first-ever commit.

Commit A exists. The shortest possible chain, starting from some commit and working backwards, is ... commit A. A branch name is required to point to some existing commit. Commit A has the log message it has, and the files that it has, and cannot be changed.
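This is, I think, the situation from the comments under the question. The sketch below reproduces it under assumptions: a throwaway repository with two empty commits whose messages, INIT and NEW INIT, are borrowed from those comments, plus a made-up path and identity.

```shell
# Throwaway repository; path and identity are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git commit -q --allow-empty -m "INIT"
git commit -q --allow-empty -m "NEW INIT"

git reset -q --hard HEAD~1              # the branch name moves back; NEW INIT drops out of view
remaining=$(git log --format=%s)

# A second step back is impossible: the root commit has no parent, so HEAD~1 names nothing.
if git reset -q --hard HEAD~1 2>/dev/null; then
    second_reset=moved
else
    second_reset=refused
fi

root=$(git rev-list --max-parents=0 HEAD)   # commits with no parent reachable from HEAD
tip=$(git rev-parse HEAD)
echo "log shows: $remaining; second reset: $second_reset"
```

After the first reset the branch tip is the root commit, and there is nowhere earlier for the name to go.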

³Unless, that is, you ever sent commit D to some other Git repository, which may hang on to it forever, and might someday send it back to you. Once a commit gets out of your Git repository, usually by you running git push to send a copy somewhere else, it may keep coming back to haunt you. Git is greedy for new commits and will take them from anywhere, adding them to branches, or to remote-tracking names, whenever it possibly can. Git is loth⁴ to give them up: you have to force it, with git reset or similar.

⁴Loth is a less-common spelling of loath; loathe, with an e on the end, is the verb form of this word, with loath-sans-e as the adjective form, but these are really easy to confuse, so I like the "loth" spelling for the adjective. Note that M-W allow loath for both verb and adjective: a lot depends on whether you're a descriptivist or prescriptivist, here.

Root commits are special

Commit A is, in Git's terms (which are ~~stolen~~ borrowed straight from graph theory here) a root commit. That is, there is no commit before commit A. That means that no matter how many commits there are after A, whenever we work backwards, we'll always end at A, and we can't go any further backwards.

New, totally-empty repositories are special too

Remember our empty graph? (If not, scroll back up and look at it.) It has no commits in it. To which commit(s) will you point any branch names you make?

Because there are no commits in this empty graph, there cannot be any branch names either. An empty repository, in Git, has no commits, and therefore has no branches.

Still, when you first create a new empty repository, if you run git status, you will see that you are "on" a branch:

$ mkdir empty
$ cd empty
$ git init
Initialized empty Git repository in .../empty/.git
$ git status
On branch master

No commits yet

nothing to commit (create/copy files and use "git add" to track)

How can you be on a branch that doesn't exist? (For additional proof that it does not exist, run git branch, to list out your branch names.)

Git has a trick here. The non-existent branch that you're on? Git calls this an orphan branch or an unborn branch (Git is not consistent about which one to use here). Git stores the name of this unborn branch in the special name HEAD, but because it's unborn-as-yet, git branch does not show the name, and you can't use git branch -m (the rename option) to change its name.
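The unborn state can be checked from a script as well. This is a minimal sketch, assuming a freshly created throwaway repository (the path and identity are made up); the branch name it reports depends on the local Git version and configuration, so the script does not assume master.

```shell
# Throwaway, completely empty repository; path and identity are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

head_target=$(git symbolic-ref HEAD)    # HEAD stores a branch name, e.g. refs/heads/master
branches_before=$(git branch --list)    # prints nothing: no branch exists yet

# HEAD cannot be resolved to a commit, because there are no commits.
if git rev-parse -q --verify HEAD >/dev/null; then state=born; else state=unborn; fi
echo "HEAD names $head_target, which is $state"

git commit -q --allow-empty -m A        # the first commit brings the branch into existence
branches_after=$(git branch --list)
```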

When you're in this state, the next commit you create—in this case, the very first commit in the repository—causes a new root commit to spring into being. At this time, Git creates the branch name, pointing to that new commit. So if you're on an unborn branch named master, and you make a new commit, that becomes commit A, and the name master now exists and points to it:

A   <-- master (HEAD)

Git does, however, let you get back into this special state at any time, using git checkout --orphan or git switch --orphan. These commands are different in a subtle but crucial way, which I won't cover here as this answer is quite long already. I haven't made my "commit A" yet, so I can use either one right now to rename my unborn branch:

$ git branch -m main
error: refname refs/heads/master not found
fatal: Branch rename failed
$ git checkout --orphan main
Switched to a new branch 'main'
$ git status
On branch main

No commits yet

nothing to commit (create/copy files and use "git add" to track)

Note how git branch -m failed, but git checkout --orphan worked. I still have no commit yet, but now I'm on the unborn branch named main. Let's make one commit now:

$ echo example > README.md
$ git add README.md
$ git commit -m 'initial commit for example repository'
[main (root-commit) 65b8259] initial commit for example repository
 1 file changed, 1 insertion(+)
 create mode 100644 README.md

Note the (root-commit) notation here. A git log command will now show my one commit, with its one-line summary.

If I now make a second commit:

$ echo text > file
$ git add file
$ git commit -m '2nd example commit'
[main dc02c86] 2nd example commit
 1 file changed, 1 insertion(+)
 create mode 100644 file

Using git log --graph, I can show how the second commit connects back to the first one:

$ git log --graph --format='%h%d %s%n'
* dc02c86 (HEAD -> main) 2nd example commit
| 
* 65b8259 initial commit for example repository

Now let's use git checkout --orphan (in some ways, the git switch variant is "better", but Git 2.23 has not been out long enough for everyone to have it yet, so I will use the older command here) to create a new "orphan" or "unborn" branch. Then I'll make one commit. This will re-use the same files that are in my second commit.

$ git checkout --orphan newroot
Switched to a new branch 'newroot'
$ git status
On branch newroot

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
    new file:   README.md
    new file:   file

$ git commit -m 'new root'
[newroot (root-commit) 2e1518b] new root
 2 files changed, 2 insertions(+)
 create mode 100644 README.md
 create mode 100644 file

Now let's look at the log. We'll need to run with --all to see both branches. Let me run it twice, once without --all first:

$ git log --graph --format='%h%d %s%n'
* 2e1518b (HEAD -> newroot) new root

There's just the one commit on this new branch. It is a second root commit! We saw that in the git status ("no commits yet") and in the result of committing.

$ git log --graph --format='%h%d %s%n' --all
* 2e1518b (HEAD -> newroot) new root
  
* dc02c86 (main) 2nd example commit
| 
* 65b8259 initial commit for example repository

Here, we see how the commit found by main connects back to the initial commit, but the commit found by newroot doesn't connect back to the initial commit.

What I have in this repository, in other words, is this:

A--B   <-- main

C   <-- newroot (HEAD)

I have two disconnected sub-graphs.
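The disconnection can also be confirmed without reading the graph output by eye. The following sketch rebuilds the same shape in a throwaway repository, using empty commits rather than the README.md and file from the session above; the path and identity are assumptions for illustration.

```shell
# Throwaway repository; path, identity, and commit messages are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git commit -q --allow-empty -m A
git commit -q --allow-empty -m B        # A--B   <-- main

git checkout -q --orphan newroot        # back into the unborn state, under a new name
git commit -q --allow-empty -m C        # C   <-- newroot (HEAD), a second root commit

root_count=$(git rev-list --max-parents=0 --all --count)

# merge-base looks for a common ancestor; it exits non-zero when there is none.
if git merge-base main newroot >/dev/null; then related=yes; else related=no; fi
echo "root commits: $root_count, shared history: $related"
```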

Disconnected subgraphs are not particularly useful

This example repository I just made has no real use.

You can put this principle to work to store more than one project in a single Git repository. Or, you can store the documentation in a separate commit graph, for instance. Some people may sometimes find this somewhat useful. So it's not useless. But the documentation for a project probably should be maintained as part of updating the project, so this idea of separating the documentation and source into separate subgraphs isn't a particularly good idea. I would not encourage anyone to do it.

The fact is, this is just a tool. Git itself is just a tool. Use it to do good things. You can use it to do bad things. Don't. But do be aware of how it works. Few abilities, if any, are good or bad on their own: they're just tools.

I appreciate the full detail, but if you look closely at the title of this thread and my images pasted, there is some clear question right about there. All I want is to remove something I have created by issuing a single line of code (or three), I don't need to be an expert to do that by reading this long article, or do I? — Laila Villa, Jul 09 '21 at 01:25
@LailaVilla: you might. It's not at all clear to *me* what it is you want "removed". Commits literally *cannot* be removed. (*Branch names* can be moved or removed, quite easily, though there is the minor hitch that moving the one you're on is different from moving one you're *not* on, and you can't remove the one you're on.) — torek, Jul 09 '21 at 17:32
"It's not at all clear to me what it is you want "removed". It's simply that last commit in that history panel on the first image. But don't worry, ignore this. cheers — Laila Villa, Jul 10 '21 at 03:29
OK, well, you really should read through all of the above, but again: you literally cannot remove a commit. You can stop viewing *some* commits, but you can't stop viewing *any commits at all* on some branch name, because a branch name can only exist if there's at least one commit on it. — torek, Jul 10 '21 at 04:15
