I am trying to understand in which situations a git conflict can happen after a git merge, and how such conflicts can be avoided.

I've created a git repository and added a text file to it.

  • I've added "1" to the text file and committed it to master.
  • I've created a new branch from master (branch2) and appended a new line to the text file with the content "2".
  • I've created a new branch from master (branch3) and appended a new line to the text file with the content "3".

After this, I've done the following :

  • I've merged branch2 into master. There were no conflicts, which is what I expected.
  • I've merged master into branch3. I had conflicts because the second line of the text file has different content. I've fixed the conflicts by keeping "3" instead of "2".
  • I want to merge branch3 into master.

Now my questions are:

  1. Can there be conflicts when I do that merge? If yes, why? If no, why not?
  2. If there shouldn't be conflicts, but I still have conflicts, what could be the reasons?

Adam Nierzad

2 Answers

Merge conflicts happen during merges, not after them.

The conflict part is really very simple: A conflict occurs in file F when:

  • "our" changes (the difference from the merge base commit to our HEAD commit) include a change to file F, and
  • "their" changes (the difference from the same merge base commit to their tip commit) also include a change to file F, and
  • our changes and their changes overlap (edit: or abut; see comments), but are not identical.
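A minimal sketch of the three conditions, run in a scratch repository. The file name F, the branch names ours, theirs-far and theirs-near, and the committer identity are all made up for illustration. Both "theirs" branches change F, but only the one that touches the same line as "ours" should conflict:

```shell
#!/bin/sh
# Sketch only: throwaway repo; all names here are illustrative assumptions.
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/ours        # name the initial branch "ours"
git config user.name "Example"
git config user.email "example@example.com"

# Merge base: a five-line file.
printf 'a\nb\nc\nd\ne\n' > F
git add F
git commit -q -m "base"

# "Their" change far away from ours: last line only.
git checkout -q -b theirs-far
printf 'a\nb\nc\nd\nE\n' > F
git commit -q -am "theirs: change last line"

# "Their" change overlapping ours: first line, different text.
git checkout -q -b theirs-near ours
printf 'X\nb\nc\nd\ne\n' > F
git commit -q -am "theirs: change first line"

# "Our" change: first line.
git checkout -q ours
printf 'A\nb\nc\nd\ne\n' > F
git commit -q -am "ours: change first line"

# Same file changed on both sides, but the regions are well separated.
if git merge -q --no-edit theirs-far >/dev/null 2>&1; then far=clean; else far=conflict; fi

# Same file, same line, different content.
if git merge -q --no-edit theirs-near >/dev/null 2>&1; then near=clean; else near=conflict; fi

echo "far: $far, near: $near"
```

The second merge is left in its conflicted state, so `git status` in that directory shows F as unmerged.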

To understand this, you need to:

  1. understand the output of git diff; and
  2. understand what a merge base is.

The output of git diff is pretty straightforward, really, but it requires remembering that each commit holds a snapshot of all of your files. This means we must give git diff two snapshots: an old one, and a new one. That's two "pictures" of what the files were like at two points in time. Git then plays a game of Spot the Difference: it tells you that to go from the left-side snapshot to the right-side snapshot, you must make some set of changes to some set of files. These changes may involve renaming some files; they might involve adding new files; they might involve deleting files; and they might involve deleting some particular lines from some files, and adding some lines to some files, in some particular places.

The output of git diff is not necessarily anything any person did. It's just a set of changes that, if applied to the left-side snapshot, get you the right-side snapshot. The "left side" here is the left argument to git diff and the "right side" here is the right argument, when you use:

git diff <hash1> <hash2>

where the two hashes are the hash IDs of commits. (This is what git merge does, in effect, although it does all of this internally.) The diff engine is designed to produce the smallest set of changes that has the right effect. That set usually matches what someone actually did, but not always.
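As a small illustration of the "left side" and "right side" idea, here is a sketch that diffs the same two commits in both directions. The file name and committer identity are placeholders; the point is only that the diff is computed from the two snapshots, so swapping the arguments swaps additions and deletions:

```shell
#!/bin/sh
# Sketch only: throwaway repo; file.txt and the identity are illustrative assumptions.
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

printf '1\n' > file.txt
git add file.txt
git commit -q -m "first snapshot"
old=$(git rev-parse HEAD)

printf '2\n' >> file.txt
git commit -q -am "second snapshot"
new=$(git rev-parse HEAD)

# Left = old, right = new: the line "2" shows up as an addition.
forward=$(git diff "$old" "$new")

# Left = new, right = old: the same line shows up as a deletion.
backward=$(git diff "$new" "$old")

printf '%s\n\n%s\n' "$forward" "$backward"
```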

The last, but probably trickiest, part of understanding git merge is the concept of a merge base. Technically, the merge base is the (single) commit that emerges from an algorithm that finds the Lowest Common Ancestor (LCA) of nodes chosen from a Directed Acyclic Graph (DAG). Not all DAG node pairs (or sets) have an LCA: some have none, and some have more than one. It's pretty common for your Git's commit graph to have a single LCA here, though, and git merge has some methods for dealing with multiple LCAs. (When there is no LCA, the modern git merge refuses to run by default, telling you that the two branches have unrelated histories. Old Git ran the merge anyway, and you can make modern Git do the merge anyway; in this case, Git uses a synthetic commit with no files as the merge base.)

The important part here is having a conceptual "feel" for the merge base. For some graphs, this is easy. Consider for instance the case of a Git commit graph where your two branches simply fork from a common ancestor commit whose hash ID is H:

          I--J   <-- branch1 (HEAD)
         /
...--G--H
         \
          K--L   <-- branch2

Here, when merging branch1 and branch2—which means commits J and L—the common starting point is clearly commit H. So git merge will run two git diff commands, and the merge base in each will be H:

git diff --find-renames <hash-of-H> <hash-of-J>    # what we changed on branch1
git diff --find-renames <hash-of-H> <hash-of-L>    # what they changed on branch2

Git will now combine the set of changes produced by these two git diff commands. Where they overlap, but don't make the same change, is where you will get merge conflicts.

Git will apply the combined changes to the snapshot in H. Applying your changes to this snapshot results in commit J; applying their changes results in commit L; applying the combined changes results in, well, the combination.

If there are no conflicts, Git will be able to combine the changes on its own. Having applied the combined changes, Git will commit the result on its own, as a new merge commit M:

          I--J
         /    \
...--G--H      M   <-- branch1 (HEAD)
         \    /
          K--L   <-- branch2

and this will be your merge result.
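The clean case can be sketched in a scratch repository as follows. The commit messages H, J and L mirror the drawing, while the file names and committer identity are made up. Each side adds a different file, so Git combines the changes and makes the merge commit on its own:

```shell
#!/bin/sh
# Sketch only: throwaway repo; file names and identity are illustrative assumptions.
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/branch1     # name the initial branch "branch1"
git config user.name "Example"
git config user.email "example@example.com"

printf 'base\n' > base.txt
git add base.txt
git commit -q -m "H"

git checkout -q -b branch2
printf 'theirs\n' > theirs.txt
git add theirs.txt
git commit -q -m "L"
l=$(git rev-parse HEAD)

git checkout -q branch1
printf 'ours\n' > ours.txt
git add ours.txt
git commit -q -m "J"
j=$(git rev-parse HEAD)

# No overlapping changes, so this creates merge commit M without stopping.
git merge -q --no-edit branch2

# %P prints the parent hashes of the commit, first parent first.
parents=$(git log -1 --format=%P)
echo "parents of M: $parents"
```

The new commit lists both J and L as parents, and the working tree contains both ours.txt and theirs.txt.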

If the combining fails, Git stops in the middle of the merge. Your job is now to finish the merge (combine the changes yourself), then tell Git that you're done and that it should make the merge commit (git merge --continue, or git commit). If this is too big a mess, you can tell Git to abort the merge entirely (git merge --abort); it will back out all of its attempts to combine things and leave you back on commit J, as if you'd never even run git merge at all.
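Both ways out of a stopped merge can be tried in a scratch repository. This is a sketch under assumed names (file.txt, a placeholder identity, and a resolution text chosen arbitrarily): the first attempt is abandoned with git merge --abort, the second is resolved by hand and committed.

```shell
#!/bin/sh
# Sketch only: throwaway repo; file name, identity and resolution are illustrative assumptions.
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/branch1
git config user.name "Example"
git config user.email "example@example.com"

printf 'base\n' > file.txt
git add file.txt
git commit -q -m "H"

git checkout -q -b branch2
printf 'theirs\n' > file.txt
git commit -q -am "L"

git checkout -q branch1
printf 'ours\n' > file.txt
git commit -q -am "J"
before=$(git rev-parse HEAD)

# Attempt 1: the merge stops with a conflict; back out of it entirely.
git merge --no-edit branch2 >/dev/null 2>&1 || true
git merge --abort
after_abort=$(git rev-parse HEAD)
restored=$(cat file.txt)

# Attempt 2: same conflict; this time combine the changes by hand.
git merge --no-edit branch2 >/dev/null 2>&1 || true
printf 'ours and theirs\n' > file.txt   # hand-written resolution
git add file.txt                        # mark the conflict as resolved
git commit -q --no-edit                 # make the merge commit
echo "resolved content: $(cat file.txt)"
```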

The last tricky bit is this: when you do finish a merge—automatically through Git, or manually—the resulting merge commit records two parents. That is, if you look at merge M above, you'll see that it connects back to both commits J and L. In many merges we'd draw this a little differently:

                o--o   <-- small-feature
               /    \
...--o--B--o--D--o---o--o   <-- mainline
         \
          o--o--o--o--o--o   <-- big-feature

Here the small feature got merged into the mainline, and the big feature is still in progress. The merge base of the small feature was commit D. The merge base of the big feature will be commit B. (The rest of the commits are not very interesting.) In some cases, though, we get a more-tangled graph:

                  o--o---o   <-- offshoot-feature
                 /      / \
                o--o---o---o--o   <-- medium-feature
               /    \ /
...--o--o--o--o--o---o----o   <-- mainline

This graph isn't all that complicated, but it's really hard, now, to see where the merge bases are, because of all the cross merging from the various features into mainline and each other.

Git will find the merge bases. You can find merge bases yourself using git merge-base --all. You can draw the graph, or have Git draw it with git log --graph, and try to find merge bases by eyeball. Having found merge bases, however you did it, you can run the two git diff commands that git merge would run. This will tell you where your conflicts will be. But usually, there's no point: just run git merge and find the conflicts.
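For the simple forked graph, doing this by hand might look like the sketch below. The file names and committer identity are placeholders, and --name-only is added just to keep the output short; drop it to see the full diffs.

```shell
#!/bin/sh
# Sketch only: throwaway repo; file names and identity are illustrative assumptions.
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/branch1
git config user.name "Example"
git config user.email "example@example.com"

printf 'shared\n' > shared.txt
git add shared.txt
git commit -q -m "H"
h=$(git rev-parse HEAD)

git checkout -q -b branch2
printf 'theirs\n' > theirs.txt
git add theirs.txt
git commit -q -m "L"

git checkout -q branch1
printf 'ours\n' > ours.txt
git add ours.txt
git commit -q -m "J"

# Find the merge base(s); this graph has exactly one, commit H.
base=$(git merge-base --all branch1 branch2)

# The two comparisons git merge would make, reduced to file names.
ours_changed=$(git diff --find-renames --name-only "$base" branch1)
theirs_changed=$(git diff --find-renames --name-only "$base" branch2)

echo "merge base:      $base"
echo "changed by us:   $ours_changed"
echo "changed by them: $theirs_changed"
```

Files that appear in both lists are the candidates for conflicts; here the lists are disjoint, so the merge would be clean.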

torek
  • One note -- changes will conflict if they overlap OR ARE ADJACENT. There must be at least one unchanged line between two changed chunks in order for them to not conflict. – Chris Dodd Dec 30 '21 at 22:54
  • @ChrisDodd: oops, right, I usually use the phrase "overlap or abut". – torek Dec 31 '21 at 01:49
  • There is a good document describing all the scenarios which result in a merge conflict: https://github.com/mndrix/merge-this `Changes of adjacent lines` is one of the described scenarios. – Aedvald Tseh Sep 08 '22 at 07:49

You shouldn't have any conflicts when you merge branch3 into master. You would have had them if you hadn't already merged master into branch3. The reason is simply that you've added a commit to branch3 which resolves the conflict between it and master, so now the merge into master can be fast-forwarded.
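A sketch that replays the steps from the question in a scratch repository, assuming the text file is called file.txt and using a placeholder committer identity. The final merge uses --ff-only, which makes Git refuse unless the merge is a pure fast-forward:

```shell
#!/bin/sh
# Sketch only: throwaway repo; file.txt and the identity are illustrative assumptions.
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

printf '1\n' > file.txt
git add file.txt
git commit -q -m "add 1"

git checkout -q -b branch2
printf '2\n' >> file.txt
git commit -q -am "append 2"

git checkout -q -b branch3 master
printf '3\n' >> file.txt
git commit -q -am "append 3"

# Merge branch2 into master: no conflict.
git checkout -q master
git merge -q --no-edit branch2

# Merge master into branch3: conflict on the second line; keep "3".
git checkout -q branch3
git merge --no-edit master >/dev/null 2>&1 || true
printf '1\n3\n' > file.txt
git add file.txt
git commit -q --no-edit

# Merge branch3 into master: succeeds only if it is a fast-forward.
git checkout -q master
git merge -q --ff-only branch3
echo "master is now at $(git rev-parse --short HEAD)"
```

After the last step master and branch3 point at the same commit, and file.txt contains "1" and "3".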

Adam Nierzad
  • Thanks for your answer, but how about the second question? – Sam Stewart Feb 19 '20 at 10:23
  • I don't see any reason there would be conflicts in the scenario you describe. Are you able to include the conflict message? It may help to determine why there's an issue. – Adam Nierzad Feb 19 '20 at 10:49