How to resolve a Git conflict where I want to keep a change and reject the file deletion in the latest branch

Question

CONFLICT (modify/delete): src/a.js deleted in aa1e4d and modified in HEAD. Version HEAD of src/a.js left in tree. I want to keep what is there in the HEAD and reject the deletion in aa1e4d

score 1 · Accepted Answer · answered Nov 19 '20 at 20:42

[Git said]
CONFLICT (modify/delete): src/a.js deleted in aa1e4d and modified
in HEAD. Version HEAD of src/a.js left in tree.
I want to keep what is there in the HEAD and reject the deletion in aa1e4d

Git has already done that—or more precisely, done that in your working tree. All you need to do now, then, is to tell Git that this is in fact the correct resolution:

git add src/a.js

Once you've told Git about all resolutions that are necessary, you can finish whatever operation you are doing (git merge, git rebase, or whatever it may be—all of these invoke Git's merge machinery). In a modern version of Git, run git merge --continue, git rebase --continue, or whatever it was you were doing with --continue to tell Git to proceed further with the suspended operation.
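As a minimal sketch of that sequence, the script below rebuilds the same kind of modify/delete conflict in a throwaway repository and resolves it by keeping the modified file. The branch names branch1 and branch2, the file contents, and the identity settings are assumptions made for illustration; GIT_EDITOR=true is only there to accept the default merge message so the script can run unattended.

```shell
# Throwaway repository; every name and file content here is illustrative.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git config user.name "Example" && git config user.email "example@example.com"
mkdir src && echo 'console.log("v1");' > src/a.js
git add src/a.js && git commit -q -m "add src/a.js"
git branch branch2
git checkout -q -b branch1
echo 'console.log("v2");' > src/a.js
git commit -q -am "modify src/a.js"
git checkout -q branch2
git rm -q src/a.js && git commit -q -m "delete src/a.js"
git checkout -q branch1

git merge branch2 || echo "merge stopped with a conflict"
git status --short                     # src/a.js is listed as unmerged
git add src/a.js                       # keep the HEAD version left in the tree
GIT_EDITOR=true git merge --continue   # finish the suspended merge
kept=$(git show HEAD:src/a.js)         # the file as stored in the merge commit
echo "$kept"
```

In your real repository only the last few commands matter: git add src/a.js, then the --continue form of whatever operation was interrupted.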

Understanding what is going on

It's important that you understand what you are doing here: you will need this kind of information for future merges, rebases, and so on.

In Git, a merge operation—what I like to call merge as a verb—is any of various Git commands that take three versions of some set of input files, and use those three versions to come up with a single version of each file. A common source of this kind of merge, and one that is easy to understand—well, easier at least—is the git merge command itself, often as run by git pull.

Merge has three inputs

What are these three inputs? Well, remember that Git is really all about commits. It's not about branches, although branch names help you find commits. It's not even about files, although commits contain files. Git is all about commits. So the three inputs are three specific commits.

When we use git merge, we pick two of these commits ourselves. Let's draw a pair of branches, each of which has some commits private to that particular branch and some commits in common (shared across both branches). We'll put older commits on the left, and newer commits on the right, and instead of actual commit hash IDs (like aa1e4d) we'll just use a single uppercase letter to stand in for each commit:

          I--J   <-- branch1 (HEAD)
         /
...--G--H
         \
          K--L   <-- branch2

Here, we're on branch branch1, and we're going to run git merge branch2 to merge commit L with commit J.

If all goes well, in the end, we will produce a new merge commit. This uses the same word, merge, but as an adjective modifying the word commit. Git also calls the result a merge, using the word merge as a noun. The result looks like this, assuming we get a result and keep it:

          I--J
         /    \
...--G--H      M   <-- branch1 (HEAD)
         \    /
          K--L   <-- branch2

This new merge commit M is only on branch branch1, but has, as its two parents (instead of the usual single parent) two of the three input commits, namely J—the commit we were using when we started—and L, the commit we told Git to merge.
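If you want to see the two parents for yourself, here is a small sketch that builds a cut-down version of the graph above out of empty commits and then merges. The commit messages H, J, and L just mirror the letters in the drawing, and the identity settings are placeholders.

```shell
# Build H, then J on branch1 and L on branch2, using empty commits.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git config user.name "Example" && git config user.email "example@example.com"
git commit -q --allow-empty -m "H"
git branch branch2
git checkout -q -b branch1
git commit -q --allow-empty -m "J"
J=$(git rev-parse HEAD)
git checkout -q branch2
git commit -q --allow-empty -m "L"
L=$(git rev-parse HEAD)
git checkout -q branch1

git merge -q --no-edit branch2       # creates merge commit M on branch1
parents=$(git log -1 --format=%P)    # full hash IDs of M's parents, in order
echo "parents of M: $parents"
git log --graph --format='%s'        # draws the same shape as the diagram
```

The first parent is J (where we were) and the second is L (what we merged).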

Like any commit, the new merge M will hold a full snapshot of every file that Git knows about. The files that Git knows about will be those that came from the three input commits. Those three input commits each hold a full snapshot of every file that Git knew about at the time you, or whoever, made those commits.

I keep mentioning three commits. What are the three commits here? We only showed two commits so far as inputs: J, our current or HEAD commit from branch1, and L, the last commit on branch2 as branch2 now stands. The third commit is the best common ancestor.

Note that if we start at commit J and work backwards, we will find commit I, then commit H, then commit G, and so on. Meanwhile, if we start at commit L and work backwards, we will find commit K, then commit H, then commit G and so on. Our two paths join up at commit H, where the two branches branch off from a common ancestry.

This is how Git works. It's like a big genealogy or family-tree: commit J has commit I as its (one, single) parent. Commit I has commit H as its parent. Commit H has commit G as its parent. We work backwards, back through time, starting from the latest commit as our current or HEAD commit. Most commits have just one parent, but occasionally, we hit a merge commit like M, where there are two parents. (We won't worry about this right now, but when we do hit a merge as we work backwards, we usually have to follow both parents.)

Anyway, looking at the diagram, it's pretty obvious that both G and H are candidates to use as a shared common ancestor. The best one is defined as commit H in this case. For simple graphs like this one, the best one is obvious: it's the latest, and H is newer than G, so that's the best common, shared commit. The technical term for this is the merge base.
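You can ask Git which commit it considers the merge base with git merge-base. The sketch below recreates the drawing with empty commits (the one-letter messages and the branch names are just the ones from the diagram, nothing Git requires) and checks that the answer is H.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git config user.name "Example" && git config user.email "example@example.com"
for c in G H; do git commit -q --allow-empty -m "$c"; done
H=$(git rev-parse HEAD)              # remember H's hash ID
git branch branch2
git checkout -q -b branch1
for c in I J; do git commit -q --allow-empty -m "$c"; done
git checkout -q branch2
for c in K L; do git commit -q --allow-empty -m "$c"; done
git checkout -q branch1

base=$(git merge-base branch1 branch2)
git log -1 --format='merge base: %s' "$base"   # prints "merge base: H"
```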

How merge works

To perform a merge, git merge will:

- compare the snapshot in the merge base commit with our current, or HEAD, commit, to see what we changed; and
- compare the same merge base snapshot with their commit, to see what they changed.

Since each of these three commits has a set of files in its snapshot, those files are the ones that get compared.
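You can approximate these two comparisons by hand with git diff against the merge base. This is only a sketch: the real merge code does more (rename detection, for instance), and the repository, branch names, and file contents below are invented for the example.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git config user.name "Example" && git config user.email "example@example.com"
mkdir src && echo 'console.log("v1");' > src/a.js
git add src/a.js && git commit -q -m "add src/a.js"
git branch branch2
git checkout -q -b branch1
echo 'console.log("v2");' > src/a.js
git commit -q -am "modify src/a.js"
git checkout -q branch2
git rm -q src/a.js && git commit -q -m "delete src/a.js"
git checkout -q branch1

base=$(git merge-base HEAD branch2)
ours=$(git diff --name-status "$base" HEAD)        # what we changed
theirs=$(git diff --name-status "$base" branch2)   # what they changed
printf 'ours:   %s\ntheirs: %s\n' "$ours" "$theirs"
```

Here the first diff reports src/a.js with status M (modified) and the second reports it with status D (deleted).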

In your particular case, the merge base has a file named src/a.js.¹ Your own commit also has a src/a.js file, and you have changed that file so that it does not match the merge base copy of src/a.js:

modified in HEAD

What they did, in their commit, though, was to leave out src/a.js entirely. Hence when comparing the merge base src/a.js to their commit, they have deleted the file:

... deleted in aa1e4d and ...

The job for git merge is to combine their changes with your changes. If you'd changed line 42 of the file, and they had not, Git would take your change to line 42. If they had added three lines at the front of the file, Git would take their change to keep the extra three lines (so that your change would be on line 45 now). But they did not just modify the file. They deleted the file entirely.

Git is not sure how to combine these correctly, so it picks something to do and does that, but then makes sure to stop and get help from you. It tells you what it picked to do:

Version HEAD of src/a.js left in tree.

The left in tree part is important here.

¹Note that the file name has an embedded slash in it. That's how Git names files: these aren't folders-and-files, they're just file names with slashes. There is some ongoing work in Git to make it recognize renames better, and Git now does understand, in a limited way, that these represent folders-and-files and that a folder-rename causes a lot of file-renames when doing merges. But it's still kind of messy and ad-hoc.

When things go wrong, `git merge` leaves a lot of parts behind

Again, there are three inputs to a merge. Those three inputs are commits, so each one has a lot of files. The way Git manages these files is to put all three copies of each file into what Git calls its index or staging area.

The index, or staging area, is how Git knows about files. (It sometimes has a third name, the cache, although these days that third name is almost gone: you mostly see it as a flag now, as in git rm --cached.) I mentioned earlier that each commit has, saved as a permanent snapshot, all the files that Git knew about at the time. Git knew about them because they were listed in its index.

The contents of the index change as you move from commit to commit. The act of checking out some commit fills Git's index from the commit you checked-out. The act of making a new commit makes the new commit from whatever is in Git's index. So in between these two steps, your job is to update Git's index.

This is a source of a lot of confusion, in Git. The index contains the files that Git is actually using.² Your working tree, where you can see files and get work done, holds copies of these files. These copies are for you to use. They are not Git's copies! They are yours. The copies that Git is going to use are in Git's index. When you run git commit or otherwise make a new commit, the index copies are the ones that Git will use.

When you change a working-tree copy of some file, you have to run git add file all the time. The reason for this is simple: git add means make the index copy match my updated working-tree copy. If the file was in the index before, well, now it's updated. If the file wasn't in the index at all before, now it is. So either way, after git add file, the index copy is updated.
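Here is a small demonstration that the commit comes from the index copy and not from your working tree copy. The file name file.txt and its contents are made up for the example.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git config user.name "Example" && git config user.email "example@example.com"
echo "v1" > file.txt
git add file.txt && git commit -q -m "first commit"

echo "v2" > file.txt
git add file.txt                 # index copy now holds v2
echo "v3" > file.txt             # working-tree copy moves on; index still has v2
git commit -q -m "second commit" # commits whatever is in the index

committed=$(git show HEAD:file.txt)
working=$(cat file.txt)
echo "committed: $committed, working tree: $working"
# prints "committed: v2, working tree: v3"
```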

What this all boils down to is a nice neat simplification: Git's index holds your proposed next commit. If you want to remove a file entirely, so that it is not going to be in the next commit, you simply run git rm file. This removes the file from both places: your working tree, where you can see and work with the file as a regular ordinary file, and Git's index, where a copy is being kept to use in the next commit.
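To see git rm act on both places at once, here is a quick sketch in a scratch repository (old.txt is just a placeholder name):

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git config user.name "Example" && git config user.email "example@example.com"
echo "data" > old.txt
git add old.txt && git commit -q -m "add old.txt"

git rm -q old.txt                      # remove from the index and the working tree
in_index=$(git ls-files old.txt)       # empty when the index has no such entry
if [ -e old.txt ]; then on_disk=yes; else on_disk=no; fi
echo "in index: '$in_index', on disk: $on_disk"
```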

The git merge command messes with this nice simple picture. Instead of the index holding only one copy of each file, now the index holds up to three copies of each file. These three copies come from the three commits that are being merged.

²Technically, the index holds file names (complete with forward slashes, such as src/a.js) and corresponding blob hash IDs. It also holds a bunch of cache data that help make Git go fast. The internal Git blob objects are all de-duplicated, so that files get shared across commits that re-use the same file content. This means that the index doesn't really hold files per se. But you can think of it as holding copies of files, in Git's internal format. This illusion only breaks down if you start using git update-index or git ls-files --stage to look directly at the content in the index.

How `git merge` uses Git's index

As a simplified picture—the actual one is more complicated—you can think of git merge as working this way:

1. Make sure the index and working tree are "clean", i.e., the HEAD copy of every file is the copy that's in both Git's index and your working tree.
2. Expand the index. Normally, every file in the index is in "slot zero". There are four slots per file, but normally slots 1, 2, and 3 are unused. Git now moves all the files from slot zero to slot 2, which Git sometimes calls --ours.
3. Copy the files from the merge base into slot 1, and copy the files from the other commit into slot 3. The index now has in it all three versions of each file from all three commits. Slot 1 is the merge base slot (it doesn't have a --base name, though perhaps it should) and slot 3 is sometimes called --theirs. (Slot 2 is of course --ours, as noted in step 2.)
4. When all the files in all three slots match, the merge for this file is super-trivial. Just drop the (single) file back down to slot zero, erasing the remaining slots: all three were the same anyway.
5. We didn't have all three files match, but what if two of them match? There are three cases here:
   - Ours and theirs match (but are different from the slot-1 copy): we both made the same changes to the file, so use whichever one of these is convenient. Drop that one to slot zero and erase the other slots, and we're done.
   - Theirs and the merge-base copy match: they didn't change the file, and we did. Use our version of the file: drop it from slot 2 to slot zero. Erase the other slots, and we're done.
   - Ours and the merge-base copy match: we didn't change the file, and they did. Use their version of the file: drop it from slot 3 to slot zero. Copy it out to the working tree too, so that we can see the new file. Erase the other slots, and we're done.
6. The remaining cases are hard.

All the remaining cases require some real work, and when they strike, a lot more work has to happen. (The above cases get handled by special index-only code in Git. This is actually the source of one particular annoyance, having to do with low level merge drivers: they don't get run at all in any of the easy cases.)
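The easy cases can be sketched as a tiny decision function. To be clear, trivial_merge is a hypothetical helper I made up for illustration, not a Git command, and it works on opaque content IDs (think blob hashes) rather than on a real index.

```shell
# Hypothetical helper: decide one file's outcome from three content IDs,
# given in the order base, ours, theirs.
trivial_merge() {
  if [ "$2" = "$3" ]; then
    echo "use ours"            # both sides identical (includes all three equal)
  elif [ "$3" = "$1" ]; then
    echo "use ours"            # they left the file alone, we changed it
  elif [ "$2" = "$1" ]; then
    echo "use theirs"          # we left the file alone, they changed it
  else
    echo "real merge needed"   # both sides changed it, in different ways
  fi
}

trivial_merge aaa aaa bbb      # prints "use theirs"
trivial_merge aaa bbb aaa      # prints "use ours"
trivial_merge aaa bbb ccc      # prints "real merge needed"
```

Only the last branch corresponds to the hard cases, where Git has to look inside the files.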

Most commonly, for case 6, both we and they made some changes to one particular file. Git will attempt to combine these two sets of changes, and apply the combined changes to the merge base copy (in slot 1). If Git is able to do this combining on its own, Git will write the combined file to our working tree, move the combined file into slot zero, and erase the three higher-numbered slots. This merge conflict is now resolved: Git merged the file on its own.³

There are some cases, however, where Git either cannot or will not resolve the conflict on its own. This includes cases where both we and they changed the same lines, but made different changes. In these cases, Git will write, to the working tree copy of the file, a marked-up version with <<<<<<< HEAD and >>>>>>> theirs lines (where theirs stands for the name of the other branch or commit) added around the places Git was not able to resolve on its own. But it also includes what I like to call high-level conflicts: the opposite of a low-level conflict. Others call these tree conflicts. These high-level conflicts include cases like yours, where you changed a file, but they deleted the file.
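For comparison with your modify/delete case, this sketch produces an ordinary low-level conflict, where both sides change the same line. The file f.txt, its contents, and the branch names are invented; the exact marker layout can also vary with settings such as merge.conflictStyle, so the script only prints the file rather than promising its exact contents.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git config user.name "Example" && git config user.email "example@example.com"
echo "original line" > f.txt
git add f.txt && git commit -q -m "base"
git branch branch2
git checkout -q -b branch1
echo "our line" > f.txt && git commit -q -am "ours"
git checkout -q branch2
echo "their line" > f.txt && git commit -q -am "theirs"
git checkout -q branch1

git merge branch2 || echo "merge stopped with a conflict"
cat f.txt        # working-tree copy, now containing conflict markers
```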

For these cases, Git leaves as many copies in Git's index as make sense. In this case, it will leave the merge base copy of src/a.js in Git's index, and your HEAD copy of src/a.js in Git's index. It's important to read the message that Git printed:

CONFLICT (modify/delete): src/a.js deleted in aa1e4d and modified
in HEAD. Version HEAD of src/a.js left in tree.

This tells you what Git left in your working tree: the copy from the HEAD commit, i.e., from your current branch.
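You can check both halves of that claim, the index slots and the working tree copy, with git ls-files -u. The script below is a sketch that reproduces a modify/delete conflict in a scratch repository; the branch names and file contents are assumptions standing in for yours.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git config user.name "Example" && git config user.email "example@example.com"
mkdir src && echo 'console.log("v1");' > src/a.js
git add src/a.js && git commit -q -m "add src/a.js"
git branch branch2
git checkout -q -b branch1
echo 'console.log("v2");' > src/a.js
git commit -q -am "modify src/a.js"
git checkout -q branch2
git rm -q src/a.js && git commit -q -m "delete src/a.js"
git checkout -q branch1
git merge branch2 || echo "merge stopped with a conflict"

git ls-files -u src/a.js       # one line per occupied slot: mode, hash, slot, path
# Collect just the slot numbers (third field of each line).
stages=$(git ls-files -u src/a.js | awk '{printf "%s%s", sep, $3; sep=" "}')
echo "slots in use: $stages"   # prints "slots in use: 1 2"
cat src/a.js                   # the HEAD version, left in the working tree
```

There is no slot 3 entry because their commit has no src/a.js at all.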

³When Git has resolved such a file, it has done so with simple text rules, on a line-by-line basis. Git has no understanding of files' contents. This means that even though Git resolved the conflict on its own, the result could be total nonsense. In practice, this actually works in a surprisingly large number of cases. When it doesn't, you can sometimes help Git out by writing your own low-level merge driver, although this is nontrivial.

The fact that Git is doing this with no actual understanding of the files is why it is important to test the results of a merge. Even if Git thinks it all went well, it might not have done so.

Your job is to fix up Git's index

Regardless of what conflict arose and what Git left behind in your working tree and Git's index, your job is now to fill in Git's index with the correct final merge result. Of course, in this case, Git's index is already partly or even mostly full of correct stuff. Here, Git's index has a copy of src/a.js in two slots: slot 1 (the merge base) and slot 2 (the --ours slot). You must adjust Git's index to have the correct file in slot zero, or to have no file in any slot if the correct result is to omit that file from the new commit.

The rest of Git's index is probably already all correct. If so, you need not do anything with any of those entries. It's just the entry for src/a.js that is messed-up and conflicted. It has a slot 1 entry and a slot 2 entry. Git itself does not care how you do this, but you must erase the higher numbered slots and put in a slot-zero entry, or erase all slots so that the file does not exist. The main two tools that Git gives you for this are git add and git rm.

Remember, git rm means erase from the index and remove the working-tree version. So if the right answer is "delete src/a.js entirely", you can just run git rm src/a.js. In your case, though, that's not the right answer:

I want to keep what is there

This means you want some version of src/a.js to be in slot zero. If you have the correct version in your working tree, remember that git add means make the index copy match the working tree copy. So:

I want to keep what is there in the HEAD

That's what Git left behind in the working tree. All you have to do, then, is run git add src/a.js, to take the working tree copy (which is also in slot 2 right now) and have Git write that to slot zero. Filling in, or erasing, slot zero also erases all the higher numbered slots.

So:

git add src/a.js

copies the working-tree copy of src/a.js to the index at slot zero and removes the entries in slots 1 and 2. There was no entry in slot 3 so this file is now resolved.
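To watch the slots change, here is a sketch that sets up the same sort of conflict in a scratch repository and inspects the index after git add. As before, the branch names and file contents are placeholders.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git config user.name "Example" && git config user.email "example@example.com"
mkdir src && echo 'console.log("v1");' > src/a.js
git add src/a.js && git commit -q -m "add src/a.js"
git branch branch2
git checkout -q -b branch1
echo 'console.log("v2");' > src/a.js
git commit -q -am "modify src/a.js"
git checkout -q branch2
git rm -q src/a.js && git commit -q -m "delete src/a.js"
git checkout -q branch1
git merge branch2 || echo "merge stopped with a conflict"

git add src/a.js                                        # resolve: keep our file
unmerged=$(git ls-files -u)                             # empty once nothing is conflicted
slot=$(git ls-files --stage src/a.js | awk '{print $3}')
blob=$(git ls-files --stage src/a.js | awk '{print $2}')
echo "slot: $slot, unmerged entries: '$unmerged'"       # slot is now 0
```

The blob hash in slot zero is the same one that HEAD has for src/a.js, which is another way of saying that the resolution is "our version".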

If you'd had a more-standard conflict, Git would have left all three copies in all three slots, plus its own attempt at merging, complete with conflict markers, in your working tree. In this case, you could edit the working tree copy to produce the correct result, and use git add to erase all three nonzero slots and provide a correct file in slot zero. But you didn't need to do that much work, because the working tree copy was already correct. It was just in the wrong slot!
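For that more-standard case, the sketch below shows all three slots being filled and then cleared. The :1:, :2:, and :3: prefixes are Git's syntax for naming the copy in a given slot; the file f.txt, the branch names, and the "combined line" resolution are all invented for the example.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git config user.name "Example" && git config user.email "example@example.com"
echo "original line" > f.txt
git add f.txt && git commit -q -m "base"
git branch branch2
git checkout -q -b branch1
echo "our line" > f.txt && git commit -q -am "ours"
git checkout -q branch2
echo "their line" > f.txt && git commit -q -am "theirs"
git checkout -q branch1
git merge branch2 || echo "merge stopped with a conflict"

git ls-files -u f.txt            # three lines: slots 1, 2, and 3
base=$(git show :1:f.txt)        # merge base copy
ours=$(git show :2:f.txt)        # --ours copy
theirs=$(git show :3:f.txt)      # --theirs copy
echo "base='$base' ours='$ours' theirs='$theirs'"

echo "combined line" > f.txt     # stand-in for editing the file by hand
git add f.txt                    # fills slot zero, erases slots 1 through 3
remaining=$(git ls-files -u)     # empty: nothing left to resolve
```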

Thanks torek, I basically had a choice of git add vs. git rm and I chose git add to get my change in. But thanks to your explanation I will be confident in handling this scenario in future. - Mythpills, Nov 21 '20 at 08:05