Mercurial merge strategy vs Git merge strategy

Asked Mar 29 '18 at 16:13

Active Mar 29 '18 at 16:38


I've used Git for years, and recently switched to Mercurial for a project. I've learned how to use Mercurial quite well via the command line over the past 6 months.

This could be my imagination, but it seems to me that Mercurial is much worse at merging, and results in far more conflicted files. I often merge the default branch into my feature branches, and it will sometimes do really odd things and fail to automatically merge files that look like they should merge cleanly, i.e. no changes on the same lines, etc.

I've done quite a bit of research to see what may differ in the merge algorithms, with very little luck. Most articles are people's opinions and information about how git and mercurial work under the hood, without much focus on the merge algorithms themselves and upsides/downsides with plain-language examples of the differences.

I always use good merge strategies and merge up through the tree, and never down into the default(hg)/master(git) branch without first merging up to the remote to ensure there are no conflicts.

What I've found so far in my research is:

1) Mercurial cannot merge or has issues merging with multiple parents. I'm not sure how someone ends up in this situation, but maybe it is common?

Is this true? Would this cause merge conflicts more often in every-day development?

2) Mercurial doesn't support octopus merging, where git does.

For octopus merging, I say "who cares!", this isn't a necessity.

Other than that, it seems that the merge algorithms are created equally? Is it possible to change the merging algorithms? Are there any great articles on this?

If you post info on merge tools like KDiff3, p4merge, and meld, you're missing the point - I want info on the auto-merge strategies before conflict resolution.

Thanks for any helpful references and/or information!

asked Mar 29 '18 at 16:13

TheJeff


How does your #1 differ from #2? A merge with multiple parents is an octopus merge. – max630 Mar 29 '18 at 16:41
Could you provide some examples? – max630 Mar 29 '18 at 16:42

@max630: I'm pretty sure he means "multiple merge bases", not "multiple parents", in item #1 (and that's what I addressed in my answer). – torek Mar 29 '18 at 18:20
Nice question on an interesting topic. – ryanwc Oct 13 '18 at 01:57
I've found that paying attention to line endings, auto formatting and whitespace is important. Everything from your IDE to your operating system can affect how Git interprets the literal data. I'd recommend different steps for Mac programs vs Windows programs vs multi-OS software projects. Git's core.autocrlf = true setting doesn't work so well on projects shared between Mac and Windows, for example. I have NOT switched back to Mercurial to do an analysis of similar issues or preference options since this post. – TheJeff Dec 06 '20 at 01:39

1 Answer

1) Mercurial cannot merge or has issues merging with multiple parents. I'm not sure how someone ends up in this situation, but maybe it is common?

Is this true? Would this cause merge conflicts more often in every-day development?

No, it's not true, at least not as stated; the claim as worded is a bit meaningless.

The underlying issue in commit-graph-based merging is finding the Lowest Common Ancestor, or LCA. In a tree, there is always a single LCA, so it's the obvious input to a three-way merge: it serves as the base commit, alongside the left-side / local / --ours commit and the right-side / remote / --theirs commit.

In a commit DAG, however, there may be more than one LCA node. Mercurial's default solution to this is to pick one more or less arbitrarily. Git's default solution is to pick all of them and merge them, using the -s recursive strategy. This "inner" merge results in a single final commit, which Git then uses as the merge base. You can override this to do the same thing Mercurial does using -s resolve: pick one more or less arbitrarily, and use that as a base.
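To make the "more than one LCA" case concrete, here is a minimal sketch of finding merge bases in a toy commit graph. The commit names and the `PARENTS` table are hypothetical, and this brute-force search only illustrates the idea; it is not how either tool implements it.

```python
# Toy commit graph: each commit maps to the list of its parents.
# Names are hypothetical. E and F form a criss-cross: each one merges B and C.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "E": ["B", "C"],  # C merged into B's line
    "F": ["C", "B"],  # B merged into C's line
}


def ancestors(commit, parents):
    """Return every ancestor of `commit`, including `commit` itself."""
    seen = set()
    stack = [commit]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def merge_bases(left, right, parents):
    """Return the lowest common ancestors of `left` and `right`."""
    common = ancestors(left, parents) & ancestors(right, parents)
    # Keep a common ancestor only if it is not a proper ancestor
    # of some other common ancestor.
    return {
        c
        for c in common
        if not any(c != other and c in ancestors(other, parents) for other in common)
    }


print(merge_bases("B", "C", PARENTS))  # one base: A
print(merge_bases("E", "F", PARENTS))  # two bases: B and C
```

Merging E with F is the case where the two defaults differ: one tool picks B or C as the base, the other first merges B and C and uses that result as the base.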

Mercurial has several experimental alternative merge strategies (see, e.g., BidMerge) but none are included "out of the box", unlike Git's four -s strategies.

Multiple merge bases occur primarily when someone does a "criss-cross merge". See How do criss-cross merges arise in Git? In some workflows this should never happen, and in practice it's not all that common.

2) Mercurial doesn't support octopus merging, where git does.

For octopus merging, I say "who cares!", this isn't a necessity.

That's correct. Any octopus merge can be simulated with a series of pairwise merges. They are particularly good for showing off, though. :-)

Other than that, it seems that the merge algorithms are created equally?

No, because Mercurial and Git use different algorithms for tracking file names. The issue here is this: once you have the three inputs to the three-way merge, who says that file path/to/f in the base is the same file as path2/f2 in the left side and/or path3/f3 in the right side? Which file(s) shall we pair up, or identify, as I like to call it?

Mercurial's answer to this is to track file identity through the manifest and recorded directory operations (recorded renames or copies), while Git's is to determine file identity dynamically via content-matching. However, full dynamic determination is too expensive computationally, so Git cheats: if two files have the same path in base-vs-left or base-vs-right, those two files are identified as "the same" file. This leaves only path names with no pairing to identify dynamically.
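A rough sketch of the Git-style approach, under loud assumptions: `pair_files` is a hypothetical helper, snapshots are plain dicts of path to content, and the similarity score comes from Python's `difflib` rather than Git's own rename detection. The 0.5 threshold is just an illustrative value.

```python
from difflib import SequenceMatcher


def pair_files(base, side, threshold=0.5):
    """Pair each path in `base` with a path in `side`.

    `base` and `side` map path -> file content (str).
    Returns {base_path: side_path}. Hypothetical helper for illustration.
    """
    pairs = {}
    # The "cheat": an identical path counts as the same file, no content check.
    for path in base:
        if path in side:
            pairs[path] = path
    # Only the leftover paths are matched dynamically, by content similarity.
    unpaired_base = [p for p in base if p not in pairs]
    unpaired_side = [p for p in side if p not in base]
    for old in unpaired_base:
        best_path, best_score = None, threshold
        for new in unpaired_side:
            score = SequenceMatcher(None, base[old], side[new]).ratio()
            if score >= best_score:
                best_path, best_score = new, score
        if best_path is not None:
            pairs[old] = best_path
            unpaired_side.remove(best_path)
    return pairs


base = {"a.txt": "hello\nworld\n", "b.txt": "one\ntwo\nthree\n"}
left = {"a.txt": "totally different", "dir/c.txt": "one\ntwo\nthree\nfour\n"}
print(pair_files(base, left))
```

Here a.txt is paired with a.txt purely by name even though its content was rewritten, while b.txt is paired with dir/c.txt by content. A recorded-rename approach would instead look the pairing up in stored metadata.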

One must also deal with which path-name to use in the final result. Here Mercurial makes you choose while the merge command runs, while Git simply stuffs all the names into its index, allowing deferred name-choosing afterward.

Once the files are appropriately identified and named, though, the merge process itself is the same: find out which side(s) changed which file(s). If only one side changed a file, use that side's version. Otherwise, do a file-level merge (Git calls this a low-level merge internally) on the three inputs. This requires either computing diffs or following and combining individual changesets, and Git and Mercurial both choose the straight "diff base against tip" method. (Since Git always stores snapshots, it's kind of forced this way. Mercurial sometimes stores snapshots, so it too is kind of forced.) Their internal diff engines are not identical, though, so this too can produce somewhat different results.
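The per-file decision described above can be sketched as follows. This is a simplified illustration, not either tool's code: `merge_trees` is a hypothetical name, snapshots are dicts of path to content, files are assumed to be already identified under the same path on all three sides, and added or deleted files are ignored.

```python
def merge_trees(base, left, right, merge_file):
    """Merge three snapshots that map path -> content.

    `merge_file(base, left, right)` is a caller-supplied file-level merge;
    it is only invoked when both sides changed a file in different ways.
    """
    result = {}
    for path in base:
        b, l, r = base[path], left[path], right[path]
        if l == r:
            # Neither side changed the file, or both made the same change.
            result[path] = l
        elif l == b:
            # Only the right side changed it: take the right version.
            result[path] = r
        elif r == b:
            # Only the left side changed it: take the left version.
            result[path] = l
        else:
            # Both sides changed it differently: real three-way merge needed.
            result[path] = merge_file(b, l, r)
    return result


def placeholder_merge(b, l, r):
    # Stand-in for a real diff-based merge; just marks the conflict.
    return "<<< " + l + " ||| " + r + " >>>"


base = {"f": "1", "g": "1", "h": "1"}
left = {"f": "2", "g": "1", "h": "2"}
right = {"f": "1", "g": "3", "h": "3"}
print(merge_trees(base, left, right, placeholder_merge))
```

Only "h" reaches the file-level merge, and that step is where differences between the two diff engines would show up.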

Is it possible to change the merging algorithms? Are there any great articles on this?

Yes: Git has -s arguments, and Mercurial is all pluggable internally.

As for great articles: none, as far as I know. I have been working on a book, though not actively these days since I have a different job, and it is not aimed specifically at these topics; still, the theory chapters (which are at least somewhat close to done) give the appropriate background.

edited Mar 29 '18 at 16:38

answered Mar 29 '18 at 16:33

torek
