1

The problem

See the image. I had a master branch. At commit A, I branched dev from it. At point B I synced dev with master, creating M1. At point C our team branched release from master. In the future, I will need to merge dev back into release.

Unfortunately, I accidentally merged commit D from master to my dev branch, creating M2. Now I cannot merge dev to release since it contains commits C..D, which belong to master and should not go to release.

My development is not over, and I'm not going to merge dev to release right now. However, I want to stay synced and merge release to my dev branch. I expect such merges to happen several more times in future, before I finish development and merge dev to release.

Thus, at some point I would need to revert M2 from dev. I want to do it as early as possible, since commits from release may conflict with changes in M2. Remember that some of these changes do not exist in release.

Since I want to revert M2 as early as possible, I would like to do it before I merge from release to dev. That's where the problem actually begins. Thank you for reading to this point :)

Problem

I can revert M2 in dev; that part is not a problem. However, when I then merge release to dev, git computes the merge base as C, while I want the merge base to be B, as if merge M2 never happened. Because of this incorrect base, changes B..C are automatically discarded by the merge. Git thinks I removed them manually, since that is what it sees in the revert commit of M2!

Let me clarify this. Imagine someone created file foo.txt in C. This file will be added to dev with M2. And it will be removed from dev once I revert M2. When I merge release to dev, git sees foo.txt in release, it sees foo.txt in base commit C, but it does not see foo.txt in dev. Thus git thinks I removed foo.txt. But I didn't do it.

There would be no problem if I specified B as the merge base. Is there a way to do it in git?
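The situation above can be reproduced in a throwaway repository. This is only a minimal sketch: the file names, commit subjects, and the `commit` helper are hypothetical stand-ins for the commits in the image, and release gets one commit of its own so that the merge is not a no-op.

```shell
set -e
# Throwaway repository; every name and file below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master      # same branch name on any Git version
git config user.name "Example"
git config user.email "example@example.com"

# Helper: write a file and commit it with the given subject.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit base.txt a "A"
git branch dev                               # dev branches at A
commit base.txt b "B"
git checkout -q dev
commit dev.txt 1 "dev work"
git merge -q -m "M1" master                  # sync dev with master at B
commit dev.txt 2 "E"

git checkout -q master
commit foo.txt hello "C"                     # foo.txt is created in C
git branch release                           # release branches at C
commit other.txt x "D"

git checkout -q release
commit rel.txt r "release work"

git checkout -q dev
git merge -q -m "M2" master                  # the accidental merge of D
git revert -m 1 --no-edit HEAD > /dev/null   # G: revert M2, removing foo.txt from dev

base=$(git merge-base dev release)
git log -1 --format=%s "$base"               # prints "C", not "B"

git merge -q -m "merge release" release
ls                                           # lists base.txt, dev.txt, rel.txt; foo.txt is gone
```

With C as the base, foo.txt exists in the base and in release but not in dev, so the merge treats it as deleted on the dev side.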

My solution

Since I found no way to override merge base, I did a little hack. I post it here because I had another related question.

See the second image. I started a new local branch tmp from E, the commit before M2. I cherry-picked all changes from dev onto this branch except for M2. I merged release to tmp and committed the result M3 with only one parent, from tmp (note the dashed line on the image).

I reverted M2 in dev, creating commit G. I created a "fake" merge from release to dev. This commit contained no changes, but had two parents: one from dev and one from release. I then cherry-picked M3 to dev and squashed it with my empty fake merge. I thus created M4, with correct changes and with correct parents.


The question is: did I actually need this "fake" merge? Maybe there is a way to cherry-pick M3 to dev and make it have two parents?

Questions

I'll summarize questions here:

  • Is there a way to manually set base during merge?
  • Is there a way to manually specify parents for a commit?

Thank you if you were able to read through this!

Update

As I realized after the discussion, my solution (or any other solution to the stated problem) has a crucial defect. As I've said, at some point I will merge dev to release. What I have not said is that at some point after that we will merge release to master. And at that point we will face exactly the same problem. The base for this merge will be resolved to D, not to C. This will lead to a similar problem, but at a much greater scale.

Thus, the best solution to this problem is to continue development in tmp, and make it the new dev, effectively excluding bad merge M2 from the history.

Mikhail

3 Answers

3

The short answer is "no, you can't pick your own merge base" and your temp branch approach was correct. So now let's take the second question:

Is there a way to manually specify parents for a commit?

The answer here is a resounding yes, but you must use Git's "plumbing tools" to do it, specifically git commit-tree.

When git commit creates a new commit, it does the following (after, of course, collecting up your commit message and so on):

  1. Write the index to a tree: git write-tree. This prints out the new tree's hash ID.
  2. Use the resulting tree to create a new commit: git commit-tree args (see below). This prints out the new commit's hash ID.
  3. Update the current branch to point to the new commit: git update-ref -m "commit: subject" HEAD commit-hash.
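The three steps above can be sketched by hand in a scratch repository. This is an illustration only, not what git commit literally runs internally; the repository, file name, and commit subjects are hypothetical.

```shell
set -e
# Scratch repository with one ordinary commit (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"

# Stage a change in the index, then commit it using plumbing only.
echo two > file.txt
git add file.txt

tree=$(git write-tree)                                     # step 1: index -> tree
commit=$(echo "second" | git commit-tree -p HEAD "$tree")  # step 2: tree + parent -> commit
git update-ref -m "commit: second" HEAD "$commit"          # step 3: move the current branch

git log --format=%s                                        # prints "second" then "first"
```

The message is read from standard input here; the parent is whatever follows -p, which is the part that a normal git commit never lets you choose.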

Step 2 is where we may choose the parents for the commit. The commit-tree command takes one required argument, which is the tree ID (as produced by step 1), and one -p argument per parent ID to set. Each -p takes a commit parent specifier. This can be a raw hash, or anything acceptable to gitrevisions. In fact, the tree ID can be written this way as well, which is handy if you just want to create a new commit using the same source tree as some existing commit.

Hence, suppose you have a current source tree, under branch X, pointing to commit CX. You now want to merge X into branch dest, but you want the merge to behave as though the parent commit for CX were commit P, so that git will use git merge-base --all P dest to find the merge base. (If you already know the desired merge base B, you can simply choose P = B.)

What we need is a commit whose tree is the same as CX but whose (single) parent is P. This commit does not even have to be on any branch at all; it just needs to exist in the repository. If you want it to be on a branch, uncomment the git branch line in the recipe:

X=dev                                        # name of desired branch
P=<hash ID for commit P>
git show $P                                  # just to check
cat > /tmp/msg << END
temp commit of $X's tree from $(git rev-parse $X)

This is a special commit made with parent $P
just for doing a merge.
END
newcommit=$(git commit-tree -p "$P" "$X^{tree}" < /tmp/msg)   # message is read from stdin
# git branch tmp $newcommit                  # save newcommit as new branch
git checkout dest
git merge $newcommit                         # or git merge tmp

(the above is all untested).

Using $X^{tree} here, there is no need to do a lot of complicated cherry-picking. Of course if you need to leave out particular commits to build a new (different) tree, you really do need a temp branch and to do some cherry-picking.
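The same tool also covers the "fake merge" from the question. As a hedged sketch (not tested against the asker's real history), a commit with the tree of M3 and two parents can be built directly with two -p arguments. The branch layout, file names, and the tmp commit standing in for M3 are all hypothetical.

```shell
set -e
# Scratch repository; branches and files are hypothetical stand-ins.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo a > base.txt; git add base.txt; git commit -q -m "A"
git branch release
git checkout -q -b dev
echo d > dev.txt; git add dev.txt; git commit -q -m "dev work"
git checkout -q release
echo r > rel.txt; git add rel.txt; git commit -q -m "release work"

# tmp stands in for the hand-made result M3: one parent, merged content.
git checkout -q -b tmp dev
echo r > rel.txt; git add rel.txt; git commit -q -m "M3"

dev_tip=$(git rev-parse dev)
release_tip=$(git rev-parse release)

# New commit: tree of M3, parents dev (first) and release (second).
m4=$(echo "M4: merge release into dev" |
     git commit-tree -p "$dev_tip" -p "$release_tip" "tmp^{tree}")

git checkout -q dev
git merge -q --ff-only "$m4"                 # dev now points at M4
git log -1 --format=%p                       # prints two abbreviated parent hashes
```

Because the first parent of the new commit is the old tip of dev, moving dev onto it is a plain fast-forward, so no empty merge commit and no squashing are needed.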

torek
  • Wow, thank you for explanation! I feel like my original approach was simpler though :) – Mikhail Nov 30 '16 at 16:03
  • Yes, all of the above is only really useful for special-case one-offs. For ongoing development you need to avoid having to remember which commits to include and which to omit, and in the end, that means constructing new replacement branches (e.g., turning tmp into the real development line, and/or swapping the names around). – torek Nov 30 '16 at 16:08
1

I'm pretty sure the short answer for both your bottom line questions is no:

  • You definitely can't manually set a merge base, and in any case, that doesn't make any sense. How would Git know how to relate existing commits in your 'us' branch to those in the other base? (Suppose, for example, that the base is on another branch that forks off after C, just to make it hard.)
  • This is kind of possible using git reset, though it's a workaround. You git reset <wanted base>. All the changes between your current commit and the new base are uncommitted in the work area, so you can make a new commit that you want (though you lose some tree information you may have wanted).
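A minimal sketch of the git reset workaround from the second point, assuming a linear history and a hypothetical tag named base marking the wanted parent. It keeps the final content but collapses the intermediate commits into one.

```shell
set -e
# Scratch repository; the tag name "base" and all files are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

echo a > a.txt; git add a.txt; git commit -q -m "base"
git tag base
echo b > b.txt; git add b.txt; git commit -q -m "one"
echo c >> a.txt; git commit -q -a -m "two"

tree_before=$(git rev-parse 'HEAD^{tree}')   # remember the content we want to keep

git reset -q base        # default (mixed) reset: branch and index move, files stay
git add -A               # restage everything, including now-untracked files
git commit -q -m "same content, new parent"

git log --format=%s      # prints "same content, new parent" then "base"
```

The new commit has the same tree as the old tip, but its only parent is the chosen base, which is the loss of history the answer warns about.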

In any case you can stitch your dev branch around M1 to get rid of it:

git checkout dev
git rebase -i <commit prior to M1 or even to A>

The -i flag lets you remove the merge commit.

EDIT

Your workaround's final commit in dev may look like what you want, but remember that in git, if a line in the graph connects a commit to you from below, that commit is part of your history as well. What you did was an elaborate rebase that deleted M2, so commits C..D are no longer reachable through any parent.

If you could tell git to just merge dev to release based at B, then I think the following is a good estimate of what would happen:

  1. Git sees that B..C are shared and just skips them.
  2. Additions between M1 and M2 are merged to dev.
  3. M2 is still here, so C..D are merged to dev. It doesn't matter at all that you started from B; you will get here.
  4. The rest continues as usual.

There is no getting around C..D if M2 exists in the hierarchy. But, you can skip it by reverting or rebasing, or doing that crazy branch bypass you did.

kabanus
  • 1. Ok, I understand this is not possible in general, but it seems perfectly legal to set base to `B`, doesn't it? 2. Ok, need to think about it for a little 3. Yes, this is kind of what I've done in `tmp` branch with cherry-picking, right? – Mikhail Nov 30 '16 at 13:28
  • 1.There is a problem, because C..D must be taken into account (M2), in case you merge master to release one day. You have to consider GIT doesn't 'know' what happened, and has to keep track of the entire tree. If you could select B as your base, git merge will just recognize B..C are shared in both branches and skip them anyway. 2. I think you got it. This isn't really a hack, just a fix to a problem. I think rebase may be slightly easier, but still you're essentially remaking the branch. Also, how do you break lines in comments? – kabanus Nov 30 '16 at 13:47
  • 1. I still don't get it. Because with my solution that's what I effectively did. I did all the merge by myself and then just provided git with the content of the merge I did. How is it different from when git does what I need in the first place? If I merge `master` to `release` one day I will have some troubles anyway, since the merge `M2` effectively didn't go anywhere. // I insert it with Shift+Return, but they seem to disappear once I post the comment. Do you see them?.. – Mikhail Nov 30 '16 at 15:04
  • Then @Mikhail got really lucky with the list in his comments :) I'll try and edit to post a clarification in my answer. – kabanus Nov 30 '16 at 15:16
  • What do you mean by "But, you can skip it by reverting"? Please note, that I assume that I do revert of `M2` before merging `dev` to `release` in any case. – Mikhail Nov 30 '16 at 15:43
  • Actually, I just realized, that there will be another really disastrous consequence of my actions: once I decide to merge `release` to `master` (and that's what we usually do), they will have two common ancestors: `D` and `C`, and as far as I can tell from the docs, `D` will be considered the _best common ancestor_. So the story will repeat. With _much_ more complications. Thank you very much for this! Now I think we will just rebase and continue with `tmp` as the new branch for `dev`. – Mikhail Nov 30 '16 at 15:45
  • Just `reset --hard` your dev to there (you may have meant this :)). Good luck – kabanus Nov 30 '16 at 15:48
  • I'm accepting this answer, because you helped me to realize the proper thing to do :) – Mikhail Dec 01 '16 at 08:33
0

I think you can do interactive rebase in branch dev and remove D and M2 from it.

$ git checkout dev
$ git rebase -i <commit-sha-of-E>

An editor window will open. Now just delete the commit lines for D and M2, then save and exit.

$ git rebase --continue           # only needed if the rebase stops on a conflict
Sajib Khan
  • Yes, but interactive rebase will create new set of commits, and that's what I've done in my `tmp` branch. The question is, how to fix things in `dev`. – Mikhail Nov 30 '16 at 13:30