How do I squash commits when I have merged with dev in between?

Question

I have worked on a feature branch for a few days and it's now ready to merge into dev. While working on this feature, I have merged with dev to receive a patch. My history looks like this:

* E (feature1)
* D:merge with dev
|  \ 
* C * B:patch (dev)
  \ | 
    * A

I'd like to squash the whole branch into one commit, merge with dev and then fast-forward dev. The problem is, E can't be squashed with C as the merge comes between them. The only option seems to be to squash E, B and C (call the new commit F), in which case the squashed commit will also include changes that were part of an irrelevant patch. Once merged into dev, there will be two commits that apply the patch: F (which applies the patch and adds the feature) and B (which only applies the patch). Besides, F will now be making two unrelated changes.

Is there a way out of this that keeps my history nice and clean? Do I need to change my workflow?

Recommended workflow change: never merge dev. Instead rebase onto it. You should be able to fix this by simply doing a rebase onto `dev` now. — o11c, Oct 12 '17 at 02:38
@o11c - Rebasing changes your branch history. While I happen to be a fan, it can be a bit dangerous at times, especially if you don't know what you're doing. Also, rebasing gets really annoying when there are conflicts, because you have to re-resolve conflicts every single time you rebase, while with a merge you can address the conflicts once and then move forward. — JDB, Oct 20 '17 at 04:23

JDB · Answer 1 · 2017-10-20T18:09:21.327

Git doesn't track changes in commits. Each commit contains a full copy of the files in the repo. "Changes" are determined by diff-ing two commits.

So, in short, there's absolutely no problem with squashing E, B and C into commit F and then merging that onto the dev branch. The B commit will still exist on the dev branch. When git compares F to B, only the changes introduced by C and E will be attributed to F.
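The snapshot model described above can be inspected directly. The following is a minimal sketch in a throwaway repository; the file name, identity and commit messages are assumptions made up for the illustration, not taken from the question.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file name and commit messages are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "hello!" > file.txt
git add file.txt
git commit -q -m "A"

echo "patch!" >> file.txt
git commit -q -am "B: patch"

# The raw commit object records a tree (the snapshot), a parent,
# author/committer lines and the message. It contains no diff.
commit_object=$(git cat-file -p HEAD)
echo "$commit_object"

# The tree lists whole files as they exist in this commit.
git ls-tree HEAD

# The "change" made by B is computed on demand by comparing B with its parent.
git --no-pager diff HEAD^ HEAD
```

The diff in the last step is not stored anywhere; git derives it from the two snapshots each time it is asked.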

You can, of course, get around this perceived problem by using git rebase, but that comes with its own set of headaches. For example, since git rebase changes your branch history (moving your commits on top of the latest commit from dev), if there are conflicts, you will have to re-resolve those conflicts every time you rebase. That gets old fast if your issue takes a while to resolve, requiring several rebases to keep current with dev.

Just to demonstrate that this is all true and that you have nothing to worry about, I've set up an example GitHub repo: https://github.com/cyborgx37/sandbox

To start with, we have the dev branch, which has the B commit.

B:patch
|
A

Then I created the feature1 branch, which has commits C, D and E. (Note that, because D was a merge and thus has two parents, B also shows up in the commit history.)

E
|
D:merge with dev
|\
C \
|  B
A

Finally, there's the dev-with-feature1 branch.

F
|
B:patch
|
A

I created this branch off of dev, then used

git merge --squash feature1
git commit -m "F"

to squash all of feature1's commits into a single commit, F.

If you examine the diff for the F commit, you'll see that it doesn't "apply the patch". Since B already contains those changes, and F just repeats them, git doesn't associate them with F.

From another perspective, here's the blame:

initial     hello!
B:patch     patch!
F           new feature!!!

Git doesn't track changes in commits. It stores your complete project state. "Changes" are determined by comparing a commit to its parent (or predecessor) commit. Because B has the patch, git associates that change with B. So, it's expected that the patch would be in F too. The only way it wouldn't be in F is if you deleted the patch.
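For anyone who prefers to check this locally rather than on GitHub, here is a rough reproduction of the same history in a temporary repository. It is a sketch under assumptions: the patch and the feature live in separate, made-up files (`patch.txt`, `feature.txt`) so that the merge D is guaranteed to be conflict-free, which may not match the layout of the sandbox repo.

```shell
#!/bin/sh
set -eu

# Throwaway repository; all file names and messages are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# A: the common starting point, on dev
echo "hello!" > base.txt
git add base.txt
git commit -q -m "A"
git branch -M dev

# C: first feature commit on feature1
git checkout -q -b feature1
echo "new feature, part 1" > feature.txt
git add feature.txt
git commit -q -m "C"

# B: the patch lands on dev
git checkout -q dev
echo "patch!" > patch.txt
git add patch.txt
git commit -q -m "B: patch"

# D: merge dev into feature1, then E: finish the feature
git checkout -q feature1
git merge -q -m "D: merge with dev" dev
echo "new feature, part 2" >> feature.txt
git commit -q -am "E"

# F: squash all of feature1 into one commit on a branch created from dev
git checkout -q -b dev-with-feature1 dev
git merge -q --squash feature1
git commit -q -m "F"

# Files that F changes relative to its parent B: only the feature file
changed=$(git diff --name-only HEAD^ HEAD)
echo "$changed"

# Commits that touched the patch file: only B
patch_history=$(git log --format=%s -- patch.txt)
echo "$patch_history"
```

In this setup `changed` is `feature.txt` and `patch_history` is `B: patch`, which is the same attribution the diff and blame above show.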

Think about it this way: the commit message for F says it implements feature1. But in reality, when you move from F^ to F in dev, you're getting the patch as well. Basically the commit message is lying. — lfk, Oct 18 '17 at 00:24
Maybe I'm not understanding correctly. Are you planning to remove `B` from the dev branch? If not, then there's no problem. `F` *should* include `B`, because `F` comes after `B`. `F` will contain a full copy of all file states, so when you try to determine what changes were introduced by `F`, git will compare `F` to `B` and give you the diff. Since the patch was already in `B`, git will not consider it as "belonging" to `F`. — JDB, Oct 18 '17 at 00:49
@Farshid - Think of git like a folder full of zip files. Each zip file has a date stamp, so you can track the order. Each zip file contains the full source code as it appeared on that date. If you want to know what was introduced on any particular day, you compare that day's zip to the previous zip. That's basically how git works... each commit is a full copy of all file states, **not** a record of changes. — JDB, Oct 18 '17 at 00:52
@Farshid - If you are worried about it, though... clone your branches and experiment. Checkout `feature1` then run `git checkout -b feature1-clone` then checkout dev and run `git checkout -b dev-clone` then try it out and see if it's the result you were looking for. — JDB, Oct 18 '17 at 00:54
Let's say I squashed E, B and C in feature1. Then we will have something like this: A--B (dev) and also A--F (feature1) (couldn't draw a proper tree in a comment) . While chronologically F comes after B, since it includes changes introduced in B, it should have B as an ancestor at some point. In our scenario that is not the case. — lfk, Oct 19 '17 at 22:42
@Farshid - I've updated my answer with an actual example on GitHub. You keep thinking that commits "apply" "changes", but they don't. A commit is just a snapshot of your entire codebase. Git figures out "what changed" by comparing a commit to the previous commit. You can see that at work in my example. — JDB, Oct 20 '17 at 04:08

LeGEC · Answer 2 · 2017-10-12T07:42:11.750

If I understand correctly, you want to have the whole content of commit E applied as a single commit on top of B.

If this is what you want to achieve, you can use git checkout E . (don't forget the ".") :

# go to your `dev` branch :
git checkout dev

# get the *content* of E (here comes the ".") :
git checkout E .

# all the content of E should appear as modifications staged for commit :
git status -sb

# you can double check that the content is to your liking :
git diff --cached  # --cached means 'compare the index (staged changes) with HEAD'
gitk --cached      #  as opposed to 'compare what is on the disk with the index'

# commit
git commit
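The steps above can be tried end to end in a scratch repository. This is only a sketch: the `commit_file` helper and the file names are hypothetical, the branch name `feature1` stands in for commit E, and `--` is added before the `.` to make the pathspec unambiguous.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file names and the helper below are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: append one line to a file and commit it.
# usage: commit_file <file> <content> <message>
commit_file() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$3"
}

# Rebuild the history from the question: A, C, B, D (merge), E
commit_file base.txt "hello!" "A"
git branch -M dev
git checkout -q -b feature1
commit_file feature.txt "new feature, part 1" "C"
git checkout -q dev
commit_file patch.txt "patch!" "B: patch"
git checkout -q feature1
git merge -q -m "D: merge with dev" dev
commit_file feature.txt "new feature, part 2" "E"

# Apply the content of E (the tip of feature1) on top of dev
git checkout -q dev
git checkout feature1 -- .

# Only the feature file is staged; patch.txt already matches dev
staged=$(git diff --cached --name-only)
echo "$staged"

git commit -q -m "F: feature1 as a single commit"
```

One caveat worth checking in a real repository: in its default overlay mode, `git checkout <tree-ish> -- <paths>` does not remove files, so a file deleted on the feature branch would still be present on dev after this step.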

edited Oct 12 '17 at 07:42

answered Oct 12 '17 at 07:36


That's a very interesting method. I can see at least one problem though: you'll know whether the merged result works only after it's in dev, as opposed to merging (or rebasing onto) dev, testing it and then fast forwarding. Of course you don't have to commit it in dev if it doesn't work. But another problem is you can't create a merge commit, which helps keep a good history. – lfk Oct 18 '17 at 00:22

score 0 · Answer 3 · answered Oct 18 '17 at 00:32

Here's what I did: I squashed E, B and C into a new commit F, with the message for F saying it implements feature1 (note so far it also adds the patch). I then rebased onto dev. Now the commit message and history are correct: F includes the patch, but it didn't add it--it was added by a previous commit.

Finally, I merged with dev (with --no-ff to force a merge commit).

In the future, I'm going to avoid merging with dev and rebase instead.
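One way the steps in this answer might look as commands is sketched below. The answer does not say how the squash was done, so the `git reset --soft` step is an assumption, as are the file names, the `fork_point` variable and the commit messages.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file names and messages are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# A: common starting point, on dev
echo "hello!" > base.txt
git add base.txt
git commit -q -m "A"
git branch -M dev
fork_point=$(git rev-parse HEAD)

# C on feature1, B on dev, then D (merge) and E on feature1
git checkout -q -b feature1
echo "new feature, part 1" > feature.txt
git add feature.txt
git commit -q -m "C"
git checkout -q dev
echo "patch!" > patch.txt
git add patch.txt
git commit -q -m "B: patch"
git checkout -q feature1
git merge -q -m "D: merge with dev" dev
echo "new feature, part 2" >> feature.txt
git commit -q -am "E"

# Squash E, B and C into F: keep E's content, move the branch back to A
git reset -q --soft "$fork_point"
git commit -q -m "F: implement feature1"

# At this point F still adds the patch relative to its parent A
before=$(git diff --name-only HEAD^ HEAD)

# Replay F on top of dev; the patch is already in B
git rebase -q dev
after=$(git diff --name-only HEAD^ HEAD)
echo "$after"

# Merge into dev with an explicit merge commit
git checkout -q dev
git merge -q --no-ff -m "Merge feature1 into dev" feature1
```

Before the rebase, `before` lists both `feature.txt` and `patch.txt`; after it, `after` lists only `feature.txt`, which matches the description above that F includes the patch but no longer adds it.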
