Someone asked why a seemingly possible merge in Git produces conflicts, using code similar to mine, but (1) my case is a little different and (2) I am seeking a solution to the problem, not just an explanation.

edit: this is something I need to do for many files, very often. I also need to train non-technical people to be able to do it. Manually merging each conflict would be a deal-breaker for my use case.

A file on the master branch starts off as:

my name is dennis
i am a dolphin
i fix teeth

I make a branch named analysis_1 and commit these changes:

my name is dennis
analysis of line 1
i am a dolphin
analysis of line 2
i fix teeth
analysis of line 3

Realizing my earlier mistake, I check out master and commit a fix:

my name is dennis
i am a dentist
i fix teeth

Is there a way (a strategy, algorithm, or GIT_EXTERNAL_DIFF tool) that can merge master into analysis_1 without a conflict so that I get to keep the correction and the analysis like the below?

my name is dennis
analysis of line 1
i am a dentist
analysis of line 2
i fix teeth
analysis of line 3
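
To make the desired behavior concrete, here is a minimal Python sketch of the kind of merge being asked for. It is not a Git feature: `merge_line_fixes` is a hypothetical helper, and it assumes the fix on master only replaces lines one-for-one while the branch keeps the original lines intact and only adds new ones around them. Anything outside that assumption is reported as an error rather than guessed at.

```python
from difflib import SequenceMatcher

BASE = ["my name is dennis", "i am a dolphin", "i fix teeth"]
OURS = [  # analysis_1
    "my name is dennis", "analysis of line 1",
    "i am a dolphin", "analysis of line 2",
    "i fix teeth", "analysis of line 3",
]
THEIRS = ["my name is dennis", "i am a dentist", "i fix teeth"]  # master


def merge_line_fixes(base, ours, theirs):
    """Carry one-for-one line replacements from `theirs` onto `ours`.

    Hypothetical helper (an assumption for illustration, not Git's algorithm).
    """
    # Record where each base line that survives unchanged ended up in `ours`.
    base_to_ours = {}
    matcher = SequenceMatcher(None, base, ours, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            base_to_ours[block.a + offset] = block.b + offset

    merged = list(ours)
    fixes = SequenceMatcher(None, base, theirs, autojunk=False)
    for tag, i1, i2, j1, j2 in fixes.get_opcodes():
        if tag == "equal":
            continue
        # Only same-length replacements are supported in this sketch.
        if tag != "replace" or (i2 - i1) != (j2 - j1):
            raise ValueError("unsupported change in theirs: " + tag)
        for k in range(i1, i2):
            if k not in base_to_ours:
                raise ValueError("conflict: base line %d changed on both sides" % k)
            merged[base_to_ours[k]] = theirs[j1 + (k - i1)]
    return merged


if __name__ == "__main__":
    print("\n".join(merge_line_fixes(BASE, OURS, THEIRS)))
```

On the three versions above this yields the six-line result shown in the question. Hooking something like it into Git would presumably go through a custom merge driver selected in `.gitattributes`, which is not shown here.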

Thanks!

edit 2: Another question pointed out wdiff and led me to try the diff option --word-diff, which gives me the desired diff result. The next step is to get it used for a merge.
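
As a rough illustration of what a word-level diff reports for the changed line, here is a small Python sketch. It only imitates the `[-removed-]{+added+}` marker style that `git diff --word-diff` prints. The function name `word_diff` and the whitespace-only tokenization are assumptions made for this example, not Git's implementation.

```python
from difflib import SequenceMatcher


def word_diff(old_line, new_line):
    """Render a word-level diff using [-removed-]{+added+} markers."""
    old_words = old_line.split()
    new_words = new_line.split()
    pieces = []
    matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.append(" ".join(old_words[i1:i2]))
            continue
        change = ""
        if i2 > i1:  # words present only in the old line
            change += "[-" + " ".join(old_words[i1:i2]) + "-]"
        if j2 > j1:  # words present only in the new line
            change += "{+" + " ".join(new_words[j1:j2]) + "+}"
        pieces.append(change)
    return " ".join(pieces)


if __name__ == "__main__":
    print(word_diff("i am a dolphin", "i am a dentist"))
```

The point is that at word granularity the fix touches a single token, which is why a word-based comparison looks promising for this case.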

edit 3: Seems like word-based merging has some questions on SO that might bear fruit. I'll update this question if I find a solution in one of those that works for this.

Aaron Surrain
  • You're going to need a custom merge strategy for this, because every standard merge algorithm (not just Git's, but every one I'm aware of) is going to conflict here. – bk2204 Apr 20 '20 at 00:47
  • Why not just resolve the conflict? Conflicts are not a problem, they are merely git asking what to do. A merge with a conflict is not an impossible merge, it’s just a merge you have to help with. – matt Apr 20 '20 at 02:58
  • @bk2204: thanks. I'll look into custom merge strategies. – Aaron Surrain Apr 20 '20 at 13:35
  • @matt: this is just an example of something I need to do at scale without manual intervention. – Aaron Surrain Apr 20 '20 at 13:35
  • The problem in your example is that by the time you discover the problem, a line like "i am a dentist" in master does not correspond in any obvious way to any line in `analysis_1`: no merge strategy can use semantics to say "oh, this is the line _about what I am_", computers don't think that way. – matt Apr 20 '20 at 13:54
  • So if you are not willing to fix this manually, then Don't Do That - make the fix in `analysis_1` and merge back into master, not the other way round. If you do _that_, there is a merge strategy that works (in effect, theirs). – matt Apr 20 '20 at 13:55
  • @matt, "i am a dentist" is just one word different from "i am a dolphin", which I would argue is an obvious way they correspond to each other. What it seems to be choking on is that the small change is being grouped with the insertion. A computer doesn't need to grasp the semantics to see a big difference between lines it is comparing and just hop to the next line to see if it's a closer match. I just thought there would be an existing available merge strategy that did that, but the inserted line next to an altered line appears to be a less common use case than I thought. – Aaron Surrain Apr 20 '20 at 14:35
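
The grouping discussed in these comments can be sketched with a simplified model of a line-based three-way merge. This is an illustration under stated assumptions, not Git's actual implementation: the hypothetical `conflict_regions` function treats only lines left unchanged on both sides as anchors, and flags any stretch between two anchors that both sides modified differently.

```python
from difflib import SequenceMatcher


def unchanged_lines(base, other):
    """Map base index -> other index for lines a line diff sees as unchanged."""
    mapping = {}
    matcher = SequenceMatcher(None, base, other, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            mapping[block.a + offset] = block.b + offset
    return mapping


def conflict_regions(base, ours, theirs):
    """Return (base, ours, theirs) chunks that both sides changed differently."""
    in_ours = unchanged_lines(base, ours)
    in_theirs = unchanged_lines(base, theirs)
    # Anchors are base lines untouched on both sides, plus an end sentinel.
    anchors = [
        (i, in_ours[i], in_theirs[i])
        for i in range(len(base))
        if i in in_ours and i in in_theirs
    ]
    anchors.append((len(base), len(ours), len(theirs)))

    regions = []
    prev_b, prev_o, prev_t = -1, -1, -1
    for b, o, t in anchors:
        chunk_base = base[prev_b + 1:b]
        chunk_ours = ours[prev_o + 1:o]
        chunk_theirs = theirs[prev_t + 1:t]
        both_changed = chunk_ours != chunk_base and chunk_theirs != chunk_base
        if both_changed and chunk_ours != chunk_theirs:
            regions.append((chunk_base, chunk_ours, chunk_theirs))
        prev_b, prev_o, prev_t = b, o, t
    return regions


if __name__ == "__main__":
    base = ["my name is dennis", "i am a dolphin", "i fix teeth"]
    ours = ["my name is dennis", "analysis of line 1", "i am a dolphin",
            "analysis of line 2", "i fix teeth", "analysis of line 3"]
    theirs = ["my name is dennis", "i am a dentist", "i fix teeth"]
    for region in conflict_regions(base, ours, theirs):
        print(region)
```

In this model the only anchors are "my name is dennis" and "i fix teeth", so the two inserted analysis lines and the corrected line all fall into the same region between them. The trailing "analysis of line 3" is not flagged because only one side changed there.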

1 Answer

The problem is that you're trying to edit master by adding a commit whose job is to change the past, after analysis_1 has already branched off. The horse has left the barn and there's nothing you can do about it.

When you have made changes in analysis_1 and then you discover that one line brought over from master is wrong, the place to fix that line is analysis_1, not master.

If you are making such changes after the fact in master, or teaching others to do so, that is a misuse of git and you need to stop doing that.

matt
  • I disagree. The master branch has the source material that analysis branches reference. The analysis branches are for the analysis work. If an error is noticed in the source material, the source material's branch (master) needs the updates so the other analysis branches can benefit. The analysis branch may never be merged back into master. This is analogous to making a production bug fix to master (or a bugfix branch you merge quickly back into master) so you can leave the prerogative to merge in the correction to feature branches. – Aaron Surrain Apr 20 '20 at 14:25
  • OK but you are disagreeing with _git_, not with me. You have a picture in your head of how a diff can work. That picture does not correspond to any known reality. I'm just describing reality. – matt Apr 20 '20 at 15:35
  • I'm not making a generalization, I'm talking about the data you showed in your question. You _couldn't_ make and apply a patch for this change; a patch is a diff, and this change in `master` is not diffable in a way that can be applied to `analysis_1`. You would get the same problem you're getting now — a merge conflict. That is the whole point. – matt Apr 20 '20 at 17:53
  • thanks for clarifying your point and for sharing your thoughts on my question. I appreciate it. – Aaron Surrain Apr 20 '20 at 23:18