Say I have the following situation in Git.

Basically, someone has branched off master and then merged master into their branch (one or more times).

*   <- 'master'
| * <- 'topic'
| *
|/|
* |
| * <- 'topic' created
|/
*

Can I use Git revision ranges to obtain the commits on 'topic' since the last common ancestor of 'master' and 'topic'? In this case, that would be 'topic' and 'topic~'.

Background: I'm trying to create one new (non-merge) commit on 'master' with all the changes from 'topic', without having to resolve conflicts that have already been resolved on 'topic'. I'm thinking along the lines of "format-patch" followed by "am", or maybe "cherry-pick" with "--no-commit". Rebasing is difficult because of the merge commits.

Thanks

jgreen81

1 Answer

There are some cases where the answer is just "you can't do that". This isn't one of them, but spelling it out with gitrevisions syntax is tricky.

It probably also won't help your ultimate goal:

I'm trying to create one new (non-merge) commit on 'master' with all the changes from 'topic' ...

You probably want:

git checkout master
git merge --squash topic

after which the only sensible thing to do with the name topic is to delete it entirely.
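As a hedged illustration of the squash-merge suggestion, here is a minimal sketch that builds a throwaway repository shaped roughly like the one in the question and then squashes topic into master. The temporary directory, the file names, the commit messages and the `demo` identity are all assumptions made up for this example. Note that `git merge --squash` only stages the combined changes; it is the following `git commit` that creates the single non-merge commit.

```shell
#!/bin/sh
set -e
# Hypothetical identity so the sketch runs without any global Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from the question

echo base > base.txt;   git add base.txt;  git commit -q -m A
git checkout -q -b topic
echo topic > topic.txt; git add topic.txt; git commit -q -m B
git checkout -q master
echo main > main.txt;   git add main.txt;  git commit -q -m C
git checkout -q topic
git merge -q --no-edit master             # merge commit on topic (D)
echo more >> topic.txt; git commit -q -am E
git checkout -q master
echo extra > extra.txt; git add extra.txt; git commit -q -m F

# The squash itself: stage everything topic changed, then commit once.
git merge --squash topic >/dev/null
git commit -q -m "Squash topic into master"

# Count the words in "<commit> <parent>...": 2 means a single parent.
parents=$(git rev-list --parents -n 1 master | wc -w)
merges_on_master=$(git rev-list --count --merges master)
echo "words in commit+parents line: $parents, merge commits on master: $merges_on_master"
```

Because conflicts between master and topic were already resolved in the merge commit on topic, the squash reuses that resolved state instead of replaying each commit.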

Onwards, to gitrevisions

Let's assign letter-names to each of the commits in your drawing. Here is your graph drawing again—note that I'm assuming this is trimmed or hand-edited git log --graph --oneline output:

F   <- 'master'
| E <- 'topic'
| D
|/|
C |
| B <- 'topic' created
|/
A

I simply replaced each * with a letter, so that we can talk about each commit.

Commit F is reachable only from master. It has a single parent, C. Commit E is reachable only from topic and has a single parent D. Commit D is reachable only from topic but is a merge commit with parents C and B, with C being the first parent.[1] Commit C itself is an ordinary single-parent commit with parent A; commit B is an ordinary single-parent commit with parent A; and commit A is a root commit with no parent at all.

The diagram is perhaps misleading because the name topic was most likely created when the name master identified commit A, and at that time, topic also named commit A. So "topic created" should point to A, even though A is clearly a straight shot down from the tip of master: we start at commit F, go to commit C, and go to commit A.

Selecting commits E, D, and B is easy: that's just topic ^master, which you can spell as master..topic. The set of commits selected by the name master is {F, C, A}. The set of commits selected by the name topic is {E, D, C, B, A}. Subtracting (or excluding) the first set from the second produces the set {E, D, B}.
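The set subtraction can be checked on a scratch repository. The following is a sketch under stated assumptions: it recreates the lettered graph with `git commit-tree` on an empty tree, using each letter as the commit message, and the helper names `commit` and `subjects`, the temporary directory and the `demo` identity are hypothetical. It uses `git log --format=%s`, which accepts the same revision arguments as `git rev-list`, only so that letters are printed instead of hash IDs.

```shell
#!/bin/sh
set -e
# Hypothetical identity so the sketch runs without any global Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q

# Every commit shares the empty tree; only the parent links matter here.
tree=$(git mktree </dev/null)
commit() { msg=$1; shift; git commit-tree -m "$msg" "$@" "$tree"; }

A=$(commit A)
B=$(commit B -p "$A")
C=$(commit C -p "$A")
D=$(commit D -p "$C" -p "$B")   # merge: first parent C, second parent B
E=$(commit E -p "$D")
F=$(commit F -p "$C")
git update-ref refs/heads/master "$F"
git update-ref refs/heads/topic "$E"

# Print the commit letters selected by the given revision arguments, sorted.
subjects() { git log --format=%s "$@" | sort | paste -s -d ' ' -; }

on_master=$(subjects master)            # A C F
on_topic=$(subjects topic)              # A B C D E
only_topic=$(subjects master..topic)    # B D E
same_thing=$(subjects topic '^master')  # the long spelling of master..topic

echo "master: $on_master"
echo "topic: $on_topic"
echo "master..topic: $only_topic"
```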

To select commits E and D, consider, e.g.:

git rev-list topic ^topic^^@

That is, we want the revisions reachable from topic, which are {E, D, C, B, A} as we already saw, but excluding all the parents of merge commit D (and, as with any exclusion, everything reachable from those parents). The syntax rev^@ means exactly that, all parents of the given revision, so ^rev^@ means exclude all parents of the given revision. In this case the rev part is topic^, so we end up with ^topic^^@. Each of these hat or caret ^ symbols is doing something different:

  • the leading ^ means not;
  • the ^ just after topic, in topic^, means the first parent; and
  • the ^@ sequence right at the end means all the parents of.

Note that you can also spell this ^topic~1^@ or ^topic~^@.

To select commits E and B but exclude D entirely, you can use the fact that when git rev-list walks a graph, it lets you exclude commits from its output based on certain criteria. Since we know that master..topic walks E, D, and B, we can use that syntax but add --no-merges. Not every command supports this syntax, but git rev-list does and you can use git rev-list itself to generate a list of raw hash IDs, which you can then supply to most Git commands.
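Both narrower selections can be tried on the same kind of scratch graph. This is a sketch, not a definitive recipe: the graph is rebuilt with `git commit-tree` so the block stands alone, and the helper names `commit` and `subjects` and the `demo` identity are assumptions for illustration. The caret expressions are quoted because some shells treat ^ specially.

```shell
#!/bin/sh
set -e
# Hypothetical identity so the sketch runs without any global Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
tree=$(git mktree </dev/null)
commit() { msg=$1; shift; git commit-tree -m "$msg" "$@" "$tree"; }

A=$(commit A)
B=$(commit B -p "$A")
C=$(commit C -p "$A")
D=$(commit D -p "$C" -p "$B")   # merge: first parent C, second parent B
E=$(commit E -p "$D")
F=$(commit F -p "$C")
git update-ref refs/heads/master "$F"
git update-ref refs/heads/topic "$E"

# Print the commit letters selected by the given revision arguments, sorted.
subjects() { git log --format=%s "$@" | sort | paste -s -d ' ' -; }

# Everything on topic except the parents of the merge (and their ancestors).
e_and_d=$(subjects topic '^topic^^@')
e_and_d_alt=$(subjects topic '^topic~^@')

# The master..topic walk with merge commits filtered out of the output.
e_and_b=$(subjects --no-merges master..topic)

# Raw hash IDs, oldest first, ready to hand to another Git command.
hashes=$(git rev-list --reverse --no-merges master..topic)

echo "topic ^topic^^@: $e_and_d"
echo "--no-merges master..topic: $e_and_b"
```

The `hashes` list is what you would pass on to a command that does not understand the filtering options itself.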

Last, consider simply listing out each commit you want, by name or relative specifier, and adding --no-walk:

git rev-list --no-walk topic topic~2

for instance means commits E and C. topic itself means commit E. The tilde and number suffix counts backwards some number of first-parent steps. The first and only parent of E is D, and D has two parents. The first parent of D is C and the second parent of D is B, so topic^^ or topic~2 means commit C.

To select B by relative name, we want the second parent of the first parent of E, so topic^^2 also does the trick. That is, topic^—or if we like, topic^1 or topic~ or topic~1—initially selects commit D, but adding the suffix ^2 to that immediately moves on to the second parent of D which is B.

(If the parents were in the normal order—so that D's first parent were B and its second were C—we'd use different relative names.)

In all of these cases, the --no-walk means that each of our selectors should not select that commit and its ancestors, but rather, just that one commit.
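The relative names and the effect of --no-walk can be confirmed the same way. As before, this is only a sketch on a scratch graph: the `commit` and `subjects` helpers, the temporary directory and the `demo` identity are hypothetical, and the first-parent order of D (C first, B second) is the one assumed in the lettered drawing.

```shell
#!/bin/sh
set -e
# Hypothetical identity so the sketch runs without any global Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
tree=$(git mktree </dev/null)
commit() { msg=$1; shift; git commit-tree -m "$msg" "$@" "$tree"; }

A=$(commit A)
B=$(commit B -p "$A")
C=$(commit C -p "$A")
D=$(commit D -p "$C" -p "$B")   # merge: first parent C, second parent B
E=$(commit E -p "$D")
F=$(commit F -p "$C")
git update-ref refs/heads/master "$F"
git update-ref refs/heads/topic "$E"

# Print the commit letters selected by the given revision arguments, sorted.
subjects() { git log --format=%s "$@" | sort | paste -s -d ' ' -; }

# Without --no-walk each name also drags in all of its ancestors.
walked=$(subjects topic 'topic~2')
# With --no-walk only the named commits themselves are listed.
picked=$(subjects --no-walk topic 'topic~2')

# Relative names resolved to hash IDs.
first_parent=$(git rev-parse 'topic^')          # D
first_of_first=$(git rev-parse 'topic^^')       # C, same as topic~2
second_of_first=$(git rev-parse 'topic^^2')     # B

echo "walked: $walked"
echo "picked: $picked"
```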


[1] This is unusual: normally C would be the second parent. git log --graph --oneline would draw this second-parent-ness as:

F   <- 'master'
| E <- 'topic'
| D
| |\
| |/
|/|
C |
| B <- 'topic' created
|/
A

That is, git log --graph always places the first parent to the left and other parents to the right, even if that means it must immediately re-divert to the left in a subsequent line.

torek
  • Thanks a lot, torek. I'll try the "merge --squash". I've worked with Git on and off for a couple of years now and I've totally missed that option. Regarding the rest of your thorough answer, you guessed correctly about the graph. I'll never change parent ordering again. I didn't know the rule about how Git draws first parents to the left, etc. In fact, I'll just copy-paste next time, I guess :) Is your expression for getting commits E and D valid input for format-patch? I don't think so since the doc says it has to be a , right? – jgreen81 May 14 '20 at 05:52
  • 1
    `git format-patch` is kind of a special case because it's one of the oldest Git commands. It also cannot format merges (!) so those are out anyway. One trick is to repeatedly invoke it on one commit at a time with X^..X each time. As for the first vs non-first commit at a merge boundary: a lot of time it doesn't matter, but sometimes it does, and tricky range expressions are one of the times it does :-) – torek May 14 '20 at 06:52