How to detect whether a git merge would result in an empty commit, given two arbitrary commits?

Question

If I use LibGit2Sharp's "Diff" feature to compare two branch commits, the diff has changes regardless of the order in which I compare them.

var newBranch = repository.Branches[currentBranchName];
var oldBranch = repository.Branches[name];

var patch = repository.Diff.Compare<Patch>(oldBranch.Tip.Tree, newBranch.Tip.Tree);
var patchChanges = patch.Count();

If I swap newBranch and oldBranch in the above code, an identical/opposite patch is returned. For example, if the first patch is something like +300 -200, then diffing the branches in reverse order will result in a patch with -300 +200.

On the other hand, if I merge one branch into another, there might be NO new changes introduced (i.e. an empty merge commit will result), whereas if I merge in the opposite order there WILL be changes.

I'm really just trying to detect whether one branch is missing hotfixes from another, which is the same as answering the question of whether merging one branch into another would result in an empty merge commit. I can get the ahead/behind counts, but often, when one branch is behind another by one commit, that commit is an empty merge commit, so it's not actually behind (i.e. a new merge commit would be empty), so the behind by count doesn't actually tell me whether hotfixed changes are missing. I thought maybe I could just diff the two commits, but as I discovered, that always has equal/opposite changes in both directions, whereas I need some operation that will detect that there are no changes in one specific direction (i.e. the "ahead" commit is empty). I cannot find anything in LibGit2Sharp that will tell me whether a merge commit is empty.

score 4 · Answer 1 · answered Dec 20 '18 at 06:35

An empty merge can happen, but, among other things, it requires defining what we mean by empty merge.

A merge has three inputs, not two. A diff has two inputs, not three.

When Git performs a true merge (the usual three-way kind, with two branch tips and a merge base, as opposed to an octopus merge, a fast-forward, or any of the other things that git merge can do), it has to run two git diff operations, not one, because each git diff compares just two trees, and we need to compare three.

The first of the three inputs is neither branch tip. Instead, it is the merge base commit between the two tips:

          C--D--E   <-- ourbranch (HEAD)
         /
...--A--B
         \
          F--G   <-- theirbranch

Here, each uppercase letter stands in for an actual commit hash ID. To do the merge, Git must:

locate commit B, the merge base
compare B to E: what did we change?
compare B to G: what did they change?
combine these changes!
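
As a rough illustration of those four steps, here is a toy sketch in Python. It is not how Git implements merging: snapshots are modeled as plain dictionaries from path to content, changes are combined per whole file rather than per line, there is no rename detection, and the names diff and three_way_merge are made up for this example.

```python
from typing import Dict, Optional

Snapshot = Dict[str, str]  # path -> content, a toy stand-in for a Git tree


def diff(old: Snapshot, new: Snapshot) -> Dict[str, Optional[str]]:
    """Return {path: new_content}; None means the path was deleted."""
    changes: Dict[str, Optional[str]] = {}
    for path in old.keys() | new.keys():
        if old.get(path) != new.get(path):
            changes[path] = new.get(path)
    return changes


def three_way_merge(base: Snapshot, ours: Snapshot, theirs: Snapshot) -> Snapshot:
    """Combine base->ours and base->theirs at whole-file granularity."""
    our_changes = diff(base, ours)      # what did we change?
    their_changes = diff(base, theirs)  # what did they change?
    merged = dict(base)
    for path in our_changes.keys() | their_changes.keys():
        in_ours = path in our_changes
        in_theirs = path in their_changes
        if in_ours and in_theirs and our_changes[path] != their_changes[path]:
            raise ValueError(f"conflict in {path}")
        content = our_changes[path] if in_ours else their_changes[path]
        if content is None:
            merged.pop(path, None)  # deleted on one side (or both)
        else:
            merged[path] = content
    return merged


B = {"a.txt": "1", "b.txt": "1"}                 # merge base
E = {"a.txt": "2", "b.txt": "1"}                 # we changed a.txt
G = {"a.txt": "1", "b.txt": "1", "c.txt": "1"}   # they added c.txt
H = three_way_merge(B, E, G)
print(H)
```

In this toy model H ends up with our change to a.txt and their new c.txt. Real Git works per line within each file and consults merge drivers, so treat this only as a picture of the "two diffs, then combine" idea.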

When the merge is done, assuming all goes well, Git makes a new commit that has two parents instead of just one:

          C--D--E
         /       \
...--A--B         H   <-- ourbranch (HEAD)
         \       /
          F-----G   <-- theirbranch

If, after the fact of the merge, you compare H vs E, you will see the changes that came in via theirbranch: B vs G, minus anything that was already in B vs E. If you compare H vs G, you will see the changes that came in via ourbranch: B vs E, minus anything that was duplicated in B vs G.
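
To make that concrete, here is a small sketch under simplifying assumptions: snapshots are plain Python dictionaries, the diff helper is a hypothetical whole-file comparison, and H is written out by hand as the result a file-level merge of these toy snapshots would give. Both sides make the same change to b.txt, so that change shows up in neither comparison against H.

```python
from typing import Dict, Optional

Snapshot = Dict[str, str]  # path -> content, a toy stand-in for a Git tree


def diff(old: Snapshot, new: Snapshot) -> Dict[str, Optional[str]]:
    """Return {path: new_content}; None means the path was deleted."""
    changes: Dict[str, Optional[str]] = {}
    for path in old.keys() | new.keys():
        if old.get(path) != new.get(path):
            changes[path] = new.get(path)
    return changes


B = {"a.txt": "1", "b.txt": "1"}                 # merge base
E = {"a.txt": "2", "b.txt": "2"}                 # ours: changed a.txt and b.txt
G = {"a.txt": "1", "b.txt": "2", "c.txt": "1"}   # theirs: same b.txt change, plus c.txt
H = {"a.txt": "2", "b.txt": "2", "c.txt": "1"}   # merge result in this toy model

# H vs E: what came in via theirbranch, minus what we already had.
theirs_contribution = diff(E, H)
# H vs G: what came in via ourbranch, minus what they already had.
ours_contribution = diff(G, H)

print(theirs_contribution)
print(ours_contribution)
```

Here B vs G contains both the b.txt change and the new c.txt, but H vs E shows only c.txt, because the b.txt change was already in B vs E.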

If we define an empty merge as one where H vs E produces no difference (i.e., that the snapshot in H matches that in E) and H vs G produces no difference, then E must have matched G. (The contents of B become irrelevant in this definition!)

If we define an empty merge as one where H vs E produces no difference, regardless of whether H vs G produces some difference, then there are more ways to get here. One is to use -s ours when running git merge, as that instructs Git to ignore the second diff entirely (and in fact, not bother running it): just use E as the snapshot for H, while still making the history linkage connect H backwards to both E and G. Another is to ensure that whatever is in the B vs G diff is simply a subset of whatever is in the B vs E diff, so that the process of combining the two diffs results in just taking the B-vs-E diff in the end.
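
The subset condition can be sketched as a direct check, without building H at all. This is a toy model only: snapshots are dictionaries from path to content, "subset" is judged per whole file rather than per line as Git would, and merge_adds_nothing is a hypothetical name invented for this example.

```python
from typing import Dict, Optional

Snapshot = Dict[str, str]  # path -> content, a toy stand-in for a Git tree


def diff(old: Snapshot, new: Snapshot) -> Dict[str, Optional[str]]:
    """Return {path: new_content}; None means the path was deleted."""
    changes: Dict[str, Optional[str]] = {}
    for path in old.keys() | new.keys():
        if old.get(path) != new.get(path):
            changes[path] = new.get(path)
    return changes


def merge_adds_nothing(base: Snapshot, ours: Snapshot, theirs: Snapshot) -> bool:
    """True if every base->theirs change already appears in base->ours."""
    our_changes = diff(base, ours)
    their_changes = diff(base, theirs)
    return all(
        path in our_changes and our_changes[path] == content
        for path, content in their_changes.items()
    )


base = {"app.txt": "v1"}
main = {"app.txt": "v2", "new.txt": "n"}   # has the hotfix plus new work
release = {"app.txt": "v2"}                # has only the hotfix

into_main = merge_adds_nothing(base, main, release)     # merge release into main
into_release = merge_adds_nothing(base, release, main)  # merge main into release
print(into_main, into_release)
```

The two directions give different answers, which is the asymmetry a plain two-tree diff cannot express: merging release into main brings nothing new, while merging main into release does.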

If we define an empty merge as one in which H matches B, the constraints are even stronger. The only natural way to get this is for E to also match B, and, if we did not use -s ours, for G to match B as well.
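
The three candidate definitions can be written down side by side. The following is a minimal sketch, assuming snapshots are modeled as Python dictionaries and that H has already been obtained somehow; classify_merge and its result keys are names made up for this example.

```python
from typing import Dict

Snapshot = Dict[str, str]  # path -> content, a toy stand-in for a Git tree


def classify_merge(base: Snapshot, ours: Snapshot, theirs: Snapshot,
                   merged: Snapshot) -> Dict[str, bool]:
    """Evaluate each 'empty merge' definition against the merge result."""
    return {
        "matches_both_tips": merged == ours and merged == theirs,
        "matches_ours": merged == ours,
        "matches_base": merged == base,
    }


B = {"app.txt": "v1"}

# Case 1: both branches made the identical change, so E == G == H.
same_fix = classify_merge(B, {"app.txt": "v2"}, {"app.txt": "v2"},
                          {"app.txt": "v2"})

# Case 2: modeling -s ours, where H simply takes ourbranch's snapshot E
# even though theirbranch has changes of its own.
E = {"app.txt": "v2"}
G = {"app.txt": "v1", "extra.txt": "x"}
ours_strategy = classify_merge(B, E, G, dict(E))

print(same_fix)
print(ours_strategy)
```

For the "is this branch missing hotfixes?" question, the middle definition (H matches E) is the one that seems to fit: it says merging theirbranch into ourbranch would change nothing in ourbranch.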

(Note that we can run git merge --no-commit to stop Git from automatically committing the merge result. In this case, we can subsequently mess with the contents of the index, the thing from which Git will make the next commit, so that we can construct the tree for H to look any way we like, regardless of what is in B, E, and/or G. But I exclude this from any normal setup.)

Hence, if you want to pre-compute what a merge will do, your job is this:

Find the merge base commit B. (Consider also what happens if there is no common commit, or if there is more than one best common commit, between the two histories derived from walking backwards from the two branch tips.)
Having located B, or constructed a virtual merge base by merging the candidate bases the way git merge -s recursive does for the multiple-merge-base case, run the two diff operations that Git would. Remember to enable rename detection.
Combine the two diffs.
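
For the first step, the three possible outcomes (no merge base, exactly one, more than one) can be seen in a toy commit graph. This sketch assumes the history is given as a dictionary from commit name to its list of parents; the function names are hypothetical, and the result is meant to correspond to what git merge-base --all reports, not to reproduce Git's actual algorithm.

```python
from typing import Dict, List, Set

Parents = Dict[str, List[str]]  # commit -> list of parent commits


def ancestors(parents: Parents, commit: str) -> Set[str]:
    """All commits reachable from `commit`, including the commit itself."""
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def merge_bases(parents: Parents, a: str, b: str) -> Set[str]:
    """Best common ancestors: common ones with no common descendant."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c != other and c in ancestors(parents, other)
                   for other in common)
    }


# The graph drawn earlier: A--B--C--D--E and B--F--G.
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"],
           "F": ["B"], "G": ["F"]}
single = merge_bases(history, "E", "G")

# A criss-cross history: two merges with swapped parents.
criss_cross = {"R": [], "P1": ["R"], "P2": ["R"],
               "M1": ["P1", "P2"], "M2": ["P2", "P1"]}
multiple = merge_bases(criss_cross, "M1", "M2")

# Two unrelated roots share no commit at all.
unrelated = merge_bases({"U": [], "V": []}, "U", "V")

print(single, multiple, unrelated)
```

Only the single-base case maps directly onto the two-diff picture; the other two need the extra handling mentioned in the first step.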

(Not included here: when both branch tips modify the same file with respect to the merge base, Git will use any merge drivers selected in .gitattributes files. If you want to emulate merge, you should do this, too.)

It's generally much simpler to just do the merge and see what happens. If you don't want to update the current branch, detach HEAD before doing the merge, or do the merge with --no-commit and then use git merge --abort to stop the merge and reset.
