If I cherry-pick a commit from a branch and then merge the whole branch later what happens to the git history?

Question

So I've got a situation where I only want a particular commit from branchA because other commits in that branch aren't ready to be merged. So if I cherry-pick commitX from branchA into master, and then later merge branchA into master (with some commits in between let's say), what happens to commitX as far as the git history is concerned? Does it get ignored since it's already present in master, or does some sort of duplication occur?

torek · Accepted Answer · 2017-06-12T23:42:48.310

The git merge operation does not look at the individual commits along the "merge path", as it were. Instead, it looks only at the start and end snapshots. Note, however, that there are two end snapshots (and one start):

...--o--B--o--...--o--L
         \
          o--...--o--R

Here, B is the merge base commit and L and R are the left (or local or --ours) and right (or other, remote, or --theirs) commits respectively. Meanwhile each round o represents some commit that Git does not even look at.¹ Git simply does, in effect:

git diff --find-renames B L    # figure out what we did
git diff --find-renames B R    # figure out what they did

What this means for your git cherry-pick is that "what we did" and "what they did" are likely to include some of the same changes to the same files—but that's no problem, because the step that git merge does after obtaining these two en-masse diffs is to combine the changes, taking exactly one copy of any change that appears in both the left and right "sides" of the merge.

Suppose, however, that you cherry-pick a commit and then revert it again along one path, e.g.:

...--o--B--X'--!X'--L
         \
          X--o--R

(The reason X' has the tick mark is that it's a copy of commit X: when you cherry-pick a commit, the new copy gets a different hash ID.)

Here, you wrote commit X on the branch (the lower line in the graph), cherry-picked it into the main line (the upper line), realized it was not ready or was broken, reverted it (adding !X'), and then made the final commit L in the main line. Now when you merge, the fact that a copy of X went in, then went out again, is invisible: Git compares B vs L and sees no sign at all of the X'--!X' sequence. It therefore still takes one copy of the changes introduced by commit X.

If commit X has become ready on the branch ending in R, this is the correct action.

If commit X is still broken, this is the wrong action—but the proper cure is probably to revert X on the branch before merging.
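The revert case can be reproduced the same way. This is a sketch under the same assumptions as before (a throwaway repository with made-up file names and messages); it builds the X'--!X' sequence on master and then merges the branch.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every file and branch name here is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# B: the merge base.
printf 'one\ntwo\nthree\n' > file.txt
git add file.txt
git commit -q -m "B"

# Lower line of the graph: X, then one more commit ending at R.
git checkout -q -b branchA
printf 'one\ntwo (fixed)\nthree\n' > file.txt
git commit -q -am "X"
x=$(git rev-parse HEAD)
echo 'other work' > other.txt
git add other.txt
git commit -q -m "R"

# Upper line of the graph: X' (the copy), !X' (its revert), then L.
git checkout -q master
git cherry-pick "$x" > /dev/null
git revert --no-edit HEAD > /dev/null
echo 'notes' > notes.txt
git add notes.txt
git commit -q -m "L"

# Comparing B with L shows no trace of X' followed by !X' in file.txt.
base=$(git merge-base master branchA)
if git diff --quiet "$base" master -- file.txt; then
    echo "B vs L: file.txt unchanged on our side"
fi

# So the merge takes the change from X on the branch side.
git merge -q -m "merge branchA" branchA
fixed_lines=$(grep -c 'fixed' file.txt)
echo "lines containing the fix after the merge: $fixed_lines"
```

After the merge, the change from X is present in master again, even though master reverted its own copy earlier.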

¹Except for finding B, that is: Git must start at both L and R and work backwards through the graph to find the merge base. This means Git must traverse some of the otherwise-uninteresting commit nodes.

Might want to also mention that the original commit is an entirely different commit than that of the cherry-picked one — Chris Rasys, Jun 12 '17 at 22:42
@ChrisRasys this was what I was wondering. If the commit IDs are different then the git history will indeed consist of two commits that are the same changes. One being the cherry-picked commit and then the original commit that gets merged at a later date? — , Jun 12 '17 at 22:44
@ChrisRasys: done. @sreya: yes; but this would also happen if you copied the commit by some other means, such as making the same change again. A viewer that shows the commits along their graph lines (as `gitk` does, and most GUIs do, and of course I did above) will show you that the two commits were in different lines of development that were later merged. — torek, Jun 12 '17 at 23:23

AnoE · Answer 2 · 2017-06-13T10:38:49.230

Nothing in particular happens, but let's break it down.

         M
-o---o---o master
  \
   \----o--o--o branchA
           X

Now you do your cherry-pick. Then the situation is:

         M  MX
-o---o---o--o master
  \
   \----o--o--o branchA
           X

So far so good. Then you do a few more commits...

         M  MX
-o---o---o---o---o---o---o master
  \
   \----o---o---o---o---o---o branchA
            X

Now the merge...

         M  MX
-o---o---o---o---o---o---o-----o master
  \                           /
   \----o---o---o---o---o---o branchA
            X

All of this is just business as usual. Git does not store the fact that commit MX is the result of a cherry-pick, and it does not need to. The cherry-pick operation differs from a merge in that the picked commit X and the new commit MX have no relationship with each other whatsoever. They cannot have one, either, because (with a merge) the "parent-child" relationship means that, in the end, master contains all of the history of branchA, not only the change introduced by a single commit.

The actual changes, on a file level, just work as if you had edited them in manually. I.e., if the changes introduced by the cherry-pick stick around in master, git will notice that (by not noticing any difference during the merge on the relevant lines, in easy cases) and things will just be merged.
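A quick way to see that X and MX are unrelated commits carrying the same change is to compare their hashes, their parents, and their patch IDs. The sketch below assumes a throwaway repository with made-up file names; `git patch-id` is used only as a convenient way to show that the two diffs are the same.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every file and branch name here is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

printf 'one\ntwo\nthree\n' > file.txt
git add file.txt
git commit -q -m "base"

# X on branchA.
git checkout -q -b branchA
printf 'one\ntwo (fixed)\nthree\n' > file.txt
git commit -q -am "X"
x=$(git rev-parse HEAD)

# M on master, then the cherry-pick that creates MX.
git checkout -q master
echo 'notes' > notes.txt
git add notes.txt
git commit -q -m "M"
m=$(git rev-parse HEAD)
git cherry-pick "$x" > /dev/null
mx=$(git rev-parse HEAD)

# Different commits: different hashes, and the parent of MX is M, not X.
mx_parent=$(git rev-parse "$mx^")
if git merge-base --is-ancestor "$x" "$mx"; then related=yes; else related=no; fi

# Same change: the patch IDs of the two diffs match.
x_patch=$(git diff-tree -p "$x" | git patch-id | cut -d' ' -f1)
mx_patch=$(git diff-tree -p "$mx" | git patch-id | cut -d' ' -f1)

echo "X  = $x (patch-id $x_patch)"
echo "MX = $mx (patch-id $mx_patch)"
echo "X is an ancestor of MX: $related"
```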

EDIT: Regarding your question in the comments...

Let's say my commitX had a message "Foobarbaz". Given the scenario, would I have two commits now in my master branch with the commit message "Foobarbaz", or just one?

Each commit has a message at the commit level and some content at the file level. A cherry-pick works only at the content level; that is, it takes the file changes from one commit and applies them to whatever is in your working directory right now. What may be confusing is that the command git cherry-pick does indeed, after applying that change, create a new commit for you (MX in this example). This new commit is just a plain old commit, though: it is in no way related to the original commit X, except that git cherry-pick copies the old commit message (which you can edit) as a convenience.

As a clarification, you could do git cherry-pick -n and avoid git doing the commit for you - this would give you the chance to edit whatever the cherry-pick did, before committing it yourself.
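As a rough illustration of the -n variant, the sketch below applies X without committing, adjusts the result by hand, and then commits. The repository, file names, and the replacement commit message are all made up for the example.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every file and branch name here is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

printf 'one\ntwo\nthree\n' > file.txt
git add file.txt
git commit -q -m "base"

# X on branchA.
git checkout -q -b branchA
printf 'one\ntwo (fixed)\nthree\n' > file.txt
git commit -q -am "X"
x=$(git rev-parse HEAD)

# Back on master: apply the change from X, but do not commit it.
git checkout -q master
head_before=$(git rev-parse HEAD)
git cherry-pick -n "$x"
head_after_pick=$(git rev-parse HEAD)

# The change is staged and HEAD has not moved.
staged=$(git diff --cached --name-only)
echo "staged after cherry-pick -n: $staged"

# Adjust the result, then commit it yourself with your own message.
printf 'one\ntwo (fixed, then adjusted)\nthree\n' > file.txt
git commit -q -am "Apply the fix from X with adjustments"
git --no-pager log --oneline
```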

So the cherry-pick is simply a convenience that works as if you had made the changes yourself and committed them yourself. Whether the new commit message is similar or identical to the old one does not matter to git at all when you merge later.
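To see how the history looks after the later merge, here is a small sketch, again in a throwaway repository with made-up file names. It counts how many commits reachable from master carry the message "Foobarbaz".

```shell
#!/bin/sh
set -eu

# Throwaway repository; every file and branch name here is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

printf 'one\ntwo\nthree\n' > file.txt
git add file.txt
git commit -q -m "base"

# X on branchA, with the message from the question.
git checkout -q -b branchA
printf 'one\ntwo (fixed)\nthree\n' > file.txt
git commit -q -am "Foobarbaz"
x=$(git rev-parse HEAD)

# M on master, then the cherry-pick (MX), then the merge of branchA.
git checkout -q master
echo 'notes' > notes.txt
git add notes.txt
git commit -q -m "M"
git cherry-pick "$x" > /dev/null
git merge -q -m "merge branchA" branchA

# Both X and MX are now reachable from master, each with the same message.
same_message=$(git rev-list --count --grep='^Foobarbaz$' master)
echo "commits named Foobarbaz reachable from master: $same_message"
git --no-pager log --graph --oneline master
```

The graph output shows the two commits on separate lines of development that meet at the merge commit.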

Agreed on the file-level changes, that much makes sense. But let's say my commitX had a message "Foobarbaz". Given the scenario, would I have two commits now in my master branch with the commit message "Foobarbaz", or just one? I'm aware that, at least functionally, what I'm proposing won't cause any harm, but I'm still curious how the history will look. From your diagram it seems like only one commit would exist in master — , Jun 12 '17 at 23:03
@sreya: The commits "in" a branch are those *reachable from* the branch name. To see if we can reach a commit, we must follow *all* parents of each merge commit. So from `master`, we look back at both commits on the top line (because of the connection back from the merge in that direction) and on the bottom line (because of the second connection there). We thus find both MX and X are contained within `master`. This makes Git radically different from many other VCSes, where a commit is only on the branch you make it on originally—in Git, the set of branches that contain a commit *changes!* — torek, Jun 12 '17 at 23:26
