Rebase onto Rebase/Squash

Question

This is a common situation I come across, and I'm looking for a clean solution.

1. Do some work on a git branch (my-first-work)
2. Push up to GitHub
3. Start a new branch (some-further-work), based on top of the work my-first-work

Unspecified amount of time passes

4. Approver rebases or squashes my-first-work onto master
5. This creates new commits that git doesn't treat as equivalent to the stuff I'm based on, even though the end result is identical (i.e. the head of master is identical to the head of my-first-work)
6. I run git rebase master to move some-further-work onto master
7. The commits from my-first-work that I'm based on now all conflict with what's already been squash/merged

Currently, I get around this by using git rebase -i master and then removing all the commits up to the head of my-first-work. This replays the additional commits cleanly on top of master.

Is there a cleaner solution? And is there a way to make git automatically recognise when a rebase/squash (as in 4) has occurred?

score 3 · Accepted Answer · answered Mar 08 '19 at 16:37

Cleaner is in the eye (or hands?) of the beholder / worker, but you can use git rebase --onto to separate out which commits to copy from where to put the copies.

Remember that git rebase means:¹ I have a linear chain of commits as shown in drawing 1, that I would like to copy to a new linear chain of commits as shown in drawing 2. Once the copies are done I'd like my branch name to point to the last copied commit. The original A-B-C chain is no longer useful even if it's still there.

[drawing 1]
...--o--*--o   <-- upstream/master
         \
          A--B--C   <-- topic

[drawing 2]
...--o--*--o   <-- upstream/master
            \
             A'-B'-C'  <-- topic

The differences between the original commits and the copies are that the originals are based on commit *, which was the tip of upstream/master before. The copies are based on the (new) tip of upstream/master. The phrase based on here has literally two meanings: the parent of commit A is commit * but the parent of commit A' is the later commit, and the snapshot in commit A is *-plus-some-changes while the snapshot in commit A' adds the same changes to the later commit.

Since we use our shiny new commit A' (which has a different hash) in favor of the old dull A, we then need to copy B to B', and C to C', and once we are done we need our name topic to point not to C but to the last copied commit, C'.

Plain old git rebase does exactly this. We say:

git checkout topic; git rebase upstream/master

which tells Git:

Enumerate all the commits starting from C and working backwards. That's C then B then A then * then everything before *.
Enumerate all the commits starting from upstream/master and working backwards. That's the second o, then * and then everything before *.
Knock everything in the second list out of the first list. So that knocks out * and all the earlier commits. The second o isn't in the first list, but that's OK: we'd knock it out if it were in, but it's not so we do nothing. Our list now goes C, B, and A.
Reverse the list to put it into the right order, then, one at a time, copy each commit to the new place. The new place starts at the commit to which upstream/master points. So this copies A to A', B to B', and C to C'.
Peel the current branch name topic off its previous location, and stick it on the new chain of commits, at the end as usual. So this makes topic point to C' instead of to C.
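
The enumerate-and-knock-out steps above can be tried out in a throwaway repository. This is only a sketch: the `commit` helper, the file names, and the use of a local branch literally named upstream/master (standing in for the real remote-tracking name) are my assumptions, not part of your setup. The two-dot range upstream/master..topic selects the commits reachable from topic but not from upstream/master, which is the list described above.

```shell
#!/bin/sh
# Sketch: list the commits that a plain `git rebase upstream/master` would copy.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
# Assumption: a local branch named upstream/master stands in for the remote-tracking name.
git symbolic-ref HEAD refs/heads/upstream/master

commit() {  # hypothetical helper: each commit adds one new file
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit base                      # the commit drawn as *
git checkout -q -b topic
commit A; commit B; commit C     # the A-B-C chain
git checkout -q upstream/master
commit later                     # the second o

# Reachable from topic, minus reachable from upstream/master, oldest first.
to_copy=$(git log --reverse --format=%s upstream/master..topic | paste -s -d ' ' -)
echo "rebase would copy: $to_copy"
```

Neither * (base) nor the second o (later) shows up in the list, which matches the knock-out step.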

In your new case, though, you had:

...--o--*   <-- upstream/master
         \
          A--B--C   <-- feature1
                 \
                  D--E--F--G   <-- feature2

They, in their upstream, didn't take your A-B-C chain. Instead, they made their own different ABC squash commit. You grabbed it from the upstream repository, so you now have:

...--o--*--ABC   <-- upstream/master
         \
          A--B--C   <-- feature1
                 \
                  D--E--F--G   <-- feature2

If you just run git checkout feature2; git rebase upstream/master, your Git will enumerate commits G-F-E-D-C-B-A-*-..., enumerate ABC-*-..., subtract the second from the first, and be left with instructions to copy the whole A-B-C-D-E-F-G chain. Copying A, B, and C on top of ABC, which already contains their changes, is where your conflicts come from.
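
To see that list for yourself, here is a sketch that rebuilds the squash situation in a scratch repository. The `commit` helper, the file names, and the local branch named upstream/master are assumptions for illustration, and `git merge --squash` is just one way to imitate what the approver did.

```shell
#!/bin/sh
# Sketch: after an upstream squash, a plain rebase still wants to copy A, B and C.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
# Assumption: a local branch named upstream/master stands in for the remote-tracking name.
git symbolic-ref HEAD refs/heads/upstream/master

commit() {  # hypothetical helper: each commit adds one new file
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit base                                  # the commit drawn as *
git checkout -q -b feature1
commit A; commit B; commit C
git checkout -q -b feature2
commit D; commit E; commit F; commit G

# Imitate the approver: squash feature1 into a single ABC commit upstream.
git checkout -q upstream/master
git merge -q --squash feature1 >/dev/null 2>&1
git commit -q -m ABC

# What a plain `git rebase upstream/master` on feature2 would try to copy.
plain=$(git log --reverse --format=%s upstream/master..feature2 | paste -s -d ' ' -)
echo "plain rebase would copy: $plain"
```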

The fancier rebase command is:

git checkout feature2
git rebase --onto upstream/master feature1

What this does is separate the target argument—the place where Git will start the copying—from the limit argument. The target is now upstream/master (the Git documentation calls this the onto argument). The limiting argument is now feature1. You can use the raw hash ID of commit C if you prefer. Git just needs to know: Where do I start my knock-out-these-commits enumeration? (Confusingly, the Git documentation calls this the upstream argument.)

As you can see, this now knocks out the C-B-A-* commits rather than just the * commit, so that after the copy, you have:

               D'-E'-F'-G'  [in progress]
              /
...--o--*--ABC   <-- upstream/master
         \
          A--B--C   <-- feature1
                 \
                  D--E--F--G   <-- feature2

and now Git can peel the label feature2 off G and stick it on G' instead.
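
Here is a sketch of the whole --onto sequence in a scratch repository, so you can check the result before trying it on real work. As before, the `commit` helper, the file names, and the local branch named upstream/master are assumptions for illustration. Every commit touches its own file, so nothing conflicts here; real branches may still need conflict resolution in D through G.

```shell
#!/bin/sh
# Sketch: rebase only D..G onto the squashed upstream with `git rebase --onto`.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
# Assumption: a local branch named upstream/master stands in for the remote-tracking name.
git symbolic-ref HEAD refs/heads/upstream/master

commit() {  # hypothetical helper: each commit adds one new file
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit base
git checkout -q -b feature1
commit A; commit B; commit C
git checkout -q -b feature2
commit D; commit E; commit F; commit G
git checkout -q upstream/master
git merge -q --squash feature1 >/dev/null 2>&1
git commit -q -m ABC                         # the upstream squash commit

# The limit argument feature1 knocks out C, B, A and everything before them.
would_copy=$(git log --reverse --format=%s feature1..feature2 | paste -s -d ' ' -)

git checkout -q feature2
git rebase -q --onto upstream/master feature1

# feature2 now holds D'-E'-F'-G' sitting directly on top of ABC.
copied=$(git log --reverse --format=%s upstream/master..feature2 | paste -s -d ' ' -)
new_parent=$(git log -1 --format=%s feature2~4)
echo "would copy: $would_copy / copied: $copied / new base: $new_parent"
```

The feature1 label itself is untouched by this; only feature2 moves.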

¹Technically, there's quite a lot more to git rebase, especially now with the fancy new --rebase-merges option. This "I have a linear chain of commits" case is still its main use, though.

As a nice bonus, rebase can usually tell if they've taken your A-B-C chain and copied it to their own A'-B'-C' chain. But that's just usually. Rebase can never tell that they've taken your A-B-C and squashed it into their own ABC, so for that case you're stuck with --onto.
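
One way to check which of the two situations you are in is git cherry, which compares commits by patch ID and marks each one with - (an equivalent change is already upstream) or + (not found upstream). Using git cherry here is my suggestion rather than something rebase requires, and as far as I know it relies on the same kind of patch-ID comparison. The sketch below uses made-up branch names (copied, squashed) to stand in for the two possible upstreams.

```shell
#!/bin/sh
# Sketch: patch-ID matching sees one-by-one copies but not a squash.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/base

commit() {  # hypothetical helper: each commit adds one new file
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit base
git checkout -q -b feature1
commit A; commit B; commit C
git checkout -q -b feature2
commit D

# Hypothetical upstream 1: they copied A, B and C one by one (A'-B'-C').
git checkout -q -b copied base
commit later
git cherry-pick feature1~3..feature1 >/dev/null

# Hypothetical upstream 2: they squashed A-B-C into a single ABC commit.
git checkout -q -b squashed base
commit later
git merge -q --squash feature1 >/dev/null 2>&1
git commit -q -m ABC

# One mark per commit on feature2 (A, B, C, D), oldest first.
marks_copied=$(git cherry copied feature2 | cut -c1 | tr -d '\n')
marks_squashed=$(git cherry squashed feature2 | cut -c1 | tr -d '\n')
echo "copied upstream: $marks_copied / squashed upstream: $marks_squashed"
```

If A, B, and C all show up as + against the real upstream, that is the squash case, and --onto is the way to go.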

Fantastic. Really clear answer and a great explanation. – deworde, Mar 08 '19 at 16:49
