85

When I was taught to use Git with a central repo (a project on Gitorious), I was told to always use rebase instead of merge because we wanted a linear history. So I have always tried to work that way.

Now that I come to think about it, is it really so beneficial? Rebasing a branch with many commits is much more time-consuming than a simple merge.

There are 2 advantages that come to my mind right now:

  1. git bisect
  2. The possibility of submitting the work, with its history, to another version control system like SVN.

Are there any other benefits?

vvv444
Jarosław Jaryszew
  • 3
    See: http://stackoverflow.com/q/457927/1256452 http://blogs.atlassian.com/2013/10/git-team-workflows-merge-or-rebase/ http://blog.sourcetreeapp.com/2012/08/21/merge-or-rebase/ http://www.derekgourlay.com/archives/428 – torek Dec 03 '13 at 11:11
  • 1
    Those are the only two reasons I know of. I prefer commit bubbles, so I can see separated-out, targeted build efforts on their original branches. – Gary Fixler Dec 03 '13 at 11:48

6 Answers

108

As far as I am concerned, keeping a linear history is mostly for aesthetics. It's a very good way for a very skilled, disciplined developer to tidy up the history and make it nice to skim through. In a perfect world, we should have a perfectly linear history that makes everything crystal clear.

However, on a large team in an enterprise environment I often do not recommend artificially keeping a linear history. Keeping history linear on a team of sharp, experienced, and disciplined developers is a nice idea, but I often see this prescribed as a 'best practice' or some kind of 'must do'. I disagree with such notions, as the world is not perfect, and there are a lot of costs to keeping a linear history that people do not do a good job of disclosing. Here's the bullet list overview:

  • Rewriting history can include erasing history
  • Not everybody can even rebase
  • The benefits are often overstated

Now, let's dig into that. Warning: Long, anecdotal, mostly ranting

Rewriting history can include erasing history. Here's the problem with rebasing all of your commits to keep everything nice and linear: rebasing is not generally lossless. There is information (real, actual things that were done by the developer) that may be compressed out of your history when you rebase. Sometimes, that's a good thing. If a developer catches their own mistake, it's nice for them to do an interactive rebase to tidy that up. Caught-and-fixed mistakes have already been handled: we don't need a history of them.

But some people work with that one individual who always seems to screw up merge conflicts. I don't personally know any developers named Neal, so let's say it's a guy named Neal. Neal is working on some really tricky accounts receivable code on a feature branch. The code Neal wrote on his branch is 100% correct and works exactly the way we want it to. Neal gets his code all ready to get merged into master, only to find there are merge conflicts now. If Neal merges master into his feature branch, we have a history of what his code originally was, plus what his code looked like after resolving the merge conflicts. If Neal does a rebase, we only have his code after the rebase. If Neal makes a mistake when resolving merge conflicts, it will be a lot easier to troubleshoot the former than it will be the latter. But worse, if Neal screws up his rebase in a sufficiently unfortunate way (maybe he did a git checkout --ours, but he forgot he had important changes in that file), we could altogether lose portions of his code forever.

I get it, I get it. His unit tests should have caught the mistake. The code reviewer should have caught the mistake. QA should have caught the mistake. He shouldn't have messed up resolving the merge conflicts in the first place. Blah, blah, don't care. Neal is retired, the CFO is pissed because our ledger is all screwed up, and telling the CFO 'according to our development philosophy this shouldn't have happened' is going to get me punched in the face.
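The difference between Neal merging and Neal rebasing can be tried out in a throwaway repository. This is only a rough sketch: the branch names feature-merged and feature-rebased, the file names, and the commit messages are made up for illustration, and there is no conflict involved. It records the original feature commit, then checks whether that commit is still part of the branch history after a merge and after a rebase.

```shell
#!/bin/sh
# Sketch: compare what stays reachable after merging vs. rebasing.
# All branch, file, and identity names here are hypothetical.
set -e
repo=$(mktemp -d)   # throwaway repository, left in place for inspection
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# The feature commit as originally written
git checkout -q -b feature
echo work > feature.txt
git add feature.txt
git commit -q -m "feature work"
original=$(git rev-parse HEAD)

# Meanwhile master moves on
git checkout -q master
echo other > master.txt
git add master.txt
git commit -q -m "master work"

# Option 1: merge master into a copy of the feature branch
git checkout -q -b feature-merged feature
git merge -q --no-edit master
if git merge-base --is-ancestor "$original" HEAD; then merged_keeps=yes; else merged_keeps=no; fi

# Option 2: rebase another copy of the feature branch onto master
git checkout -q -b feature-rebased feature
git rebase -q master
if git merge-base --is-ancestor "$original" HEAD; then rebased_keeps=yes; else rebased_keeps=no; fi

echo "merge keeps the original commit in history:  $merged_keeps"
echo "rebase keeps the original commit in history: $rebased_keeps"
```

After the merge, the original commit is still an ancestor of the branch tip. After the rebase, the branch contains a rewritten copy, and the original commit is no longer part of that branch's history.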

Not everybody can even rebase, bro. Yes, I've heard: You work at some space age startup, and your IoT coffee table uses only the coolest and most modern, reactive, block-chain based recurrent neural network, and the tech stack is sick! Your lead developer was literally there when Go was invented, and everybody who works there has been a Linux kernel contributor since they were 11. I'd love to hear more, but I just don't have time with how often I'm being asked 'How do I exit git diff???'. Every time someone tries to rebase to resolve their conflicts with master, I get asked 'why does it say my file is their file', or 'WHY DO I ONLY SEE PART OF MY CHANGE', and yet most developers can handle merging master into their branch without incident. Maybe that shouldn't be the case, but it is. When your team includes junior devs and interns, busy people, and people who didn't find out what source control is until they had already been programmers for 35 years, it takes a lot of work to keep the history pristine.

The benefits are often overstated. We've all been on that one project where you do git log --graph --pretty and suddenly your terminal has been taken over by rainbow spaghetti. But history is not hard to read because it's non-linear...It's hard to read because it's sloppy. A sloppy linear history where every commit message is "." is not going to be easier to read than a relatively clean non-linear history with thoughtful commit messages. Having a non-linear history does not mean you have to have branches being merged back and forth with each other several times before reaching master. It does not mean that your branches have to live for 6 months. An occasional branch on your history graph is not the end of the world.

I also don't think doing a git bisect is that much more difficult with non-linear history. A skilled developer should be able to think of plenty of ways to get the job done. Here's one article I like with a decent example of one way to do it. https://blog.smart.ly/2015/02/03/git-bisect-debugging-with-feature-branches/
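As one possible illustration (not necessarily the approach taken in the linked article), recent Git versions let bisect walk only the first-parent chain, so each merged feature branch is tested as a single unit. The sketch below assumes Git 2.29 or later for the --first-parent option of git bisect, and uses a made-up repository where a file containing the word "bug" stands in for a failing test.

```shell
#!/bin/sh
# Sketch: bisecting a non-linear history one merge at a time.
# Assumes Git >= 2.29; repository contents are hypothetical.
set -e
repo=$(mktemp -d)   # throwaway repository, left in place for inspection
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo ok > app.txt
git add app.txt
git commit -q -m "known good"
good=$(git rev-parse HEAD)

# A topic branch whose second commit introduces the bug
git checkout -q -b topic
echo "step one" > notes.txt
git add notes.txt
git commit -q -m "topic: step one"
echo bug >> app.txt
git commit -q -am "topic: introduces the bug"

# Master gets unrelated work, then the topic branch is merged
git checkout -q master
echo readme > README
git add README
git commit -q -m "unrelated master work"
git merge -q --no-ff --no-edit topic
merge=$(git rev-parse HEAD)
echo more >> README
git commit -q -am "later master work"

# Bisect along first parents only; the check exits 0 (good) when "bug" is absent
git bisect start --first-parent >/dev/null
git bisect bad HEAD >/dev/null
git bisect good "$good" >/dev/null
git bisect run sh -c '! grep -q bug app.txt' >/dev/null
first_bad=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1

echo "first bad commit: $(git log -1 --format=%s "$first_bad")"
```

Here bisect lands on the merge commit that brought the topic branch in. From there, a second bisect restricted to the commits of that branch can narrow it down further.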

tl;dr: I'm not saying rebases and linear history aren't great. I'm just saying you need to understand what you're signing up for and make an informed decision about whether or not it's right for your team. A perfectly linear history is not a necessity, and it certainly isn't free. It can definitely make life great in the right circumstances, but it will not make sense for everyone.

Dogs
  • 9
    I've always thought the same but have never been able to articulate it as well as you did. Thank you. Will share this with whoever tells me to `git rebase` like it's the best thing to do – ericn May 31 '19 at 05:35
  • 6
    To add to your answer, if you have enabled a checkpoint such as no merge into the base branch without a minimum set number of approvals, each time you have new changes on master and you rebase your topic branch on `master`, your VCS will forget about the approval, since the history has been rewritten. Eventually, if there is a delay between `Approve` and `Merge` the developer will have to bug the approver once again, unnecessarily. – Anuj Kumar Jul 01 '19 at 06:15
  • 5
    This answer is long and includes anecdotal/straw-man arguments. Not saying it's wrong, but it would be nice if we could slim it down and present the actual problems solved. For example, merge commits can make it difficult to split a branch into 2 if you find out you only want to release half of the work. – Vargr Mar 03 '20 at 13:16
  • 2
    Pfft.. I was there 30 years before Go was invented... To that developer, I say "Neal Before Me" :) – David V. Corbin Jun 09 '21 at 14:13
  • @Alper on some teams, yes, that is the reality we have to live with. If your team can handle it, great. Do linear history. But git does a really good job of handling non-linear history and you're not doing anything wrong if you use it that way. – Dogs Nov 04 '21 at 12:28
  • damn my dude. you just totally destructured and destroyed my mindset and workflow I worked on really hard to put in place in multiple teams of ~ 5 developers for the last few years. Although we had almost no problems even with juniors thanks to strong guidelines, your arguments are definitely heavy, thanks a lot for this great answer ! _also had a good laugh reading through it :D_ – Pierre C. Dec 01 '21 at 17:08
  • If you say that linear history is mostly for aesthetics, then I say that non-linear history is done by dilettantes and amateurs. Non-linear history is a mess. Time is linear, so history should be linear. Amen. – Kornel Szymkiewicz Dec 08 '21 at 22:19
  • I would add 1 more point to this excellent list: if you're looking at code history, then things have gone wrong, and the information loss of linear history is much more likely to be a problem than the difficulty following spaghetti merges. In the best case, we aren't looking at code history. – Paul Keister May 09 '23 at 17:10
36

A linear Git history (preferably consisting of logical steps) has many advantages. Apart from the two things already mentioned, there is also value in:

  1. Documentation for posterity. A linear history is typically easier to follow. This is similar to how you want your code to be well structured and documented: whenever someone needs to deal with it later (code or history), it is very valuable to be able to quickly understand what is going on.
  2. Improving code review efficiency and effectiveness. If a topic branch is divided into linear, logical steps, it is much easier to review it compared to reviewing a convoluted history or a squashed change-monolith (which can be overwhelming).
  3. When you need to modify the history at a later time. For instance when reverting or cherry-picking a feature in whole or in part.
  4. Scalability. Unless you strive to keep your history linear when your team grows larger (e.g. hundreds of contributors), your history can become very bloated with cross branch merges, and it can be hard for all the contributors to keep track of what is going on.

In general, I think that the less linear your history is, the less valuable it is.

m-bitsnbites
  • 17
    Except that in many practical applications, a linear code history is almost impossible to manage. As soon as many different people work together and you need to have more than just one or two branches things get more complicated and confusing if you try to force a linear history. IMHO, git is especially strong because of its branching/merging features. – mattmilten Oct 10 '16 at 09:07
  • 5
    I think that the advantages mentioned above still hold. Whether they outweigh the disadvantages is another question. A linear history often means (slightly) more work, but on the other hand, so do unit tests, linting, code comments and documentation. It's really a question of what quality requirements you need for your project. – m-bitsnbites Nov 02 '16 at 07:19
  • Linear history does not detract from the merits of simply creating branches. Working branches can and should be changed before merging with the main branch, and Git makes this easy to do. – ilyar Nov 29 '18 at 10:35
  • Besides, the main repository hosting providers now have tools to easily maintain a linear or semi-linear history. Bitbucket: "When to do a fast-forward merge". GitHub: "Allow rebase merging". GitLab: "Merge commit with semi-linear history" and "Fast-forward merge". – ilyar Nov 29 '18 at 10:35
  • 3
    I agree with the listed advantages. I'll especially note point 5: it is very easy to use **git revert** and **git cherry-pick** in a linear history. – ilyar Nov 29 '18 at 10:49
  • At least 4,5 and maybe 3 are true of non-linear history. – D. Ben Knoble Aug 13 '19 at 12:36
  • 2
    When you have a team of 10+ teams of 2-4 devs each working on the same project (feature-based/trunk-develop style), none of these things are a benefit. Linear histories are (in my limited experience) a waste of time. You rarely/never need to go back and use the "benefits" of it. "Extraneous commits" -> people are unaware you can rename "merge" commit's message to something more useful, right? – Martin Marconcini May 18 '21 at 10:25
9

If you're rebasing your work often and nobody else is working in that part of your code, it should usually be a non-event.

These are the commands more or less (from):

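git checkout -b my-new-feature
git push -u origin my-new-feature

# Changes and commits

# Replay the branch on top of the latest master
git fetch origin
git rebase origin/master
git push origin my-new-feature --force-with-lease

# Switch to master and merge with an explicit merge commit
git checkout master
git merge --no-ff my-new-feature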

Also, people here seem to be confusing merges with merge commits. I'm in favour of a linear history with merge commits, like this. That way you can see the individual commits if you need to, but can also jump from merge to merge.

[Image: commit graph of a linear history with merge commits]
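To make "jump from merge to merge" concrete, here is a rough sketch in a throwaway repository. The branch and file names are invented for illustration. It rebases a feature branch onto master, merges it with --no-ff, and then compares the full history with the first-parent view, which skips over the individual feature commits.

```shell
#!/bin/sh
# Sketch: rebase, then merge with --no-ff, then read history via first parents.
# Branch and file names are hypothetical.
set -e
repo=$(mktemp -d)   # throwaway repository, left in place for inspection
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Two commits on a feature branch
git checkout -q -b my-new-feature
echo one > one.txt
git add one.txt
git commit -q -m "feature: part one"
echo two > two.txt
git add two.txt
git commit -q -m "feature: part two"

# Master moves on in the meantime
git checkout -q master
echo other > other.txt
git add other.txt
git commit -q -m "master work"

# Rebase the feature on top of master, then merge with a merge commit
git checkout -q my-new-feature
git rebase -q master
git checkout -q master
git merge -q --no-ff --no-edit my-new-feature

all=$(git rev-list --count HEAD)
first_parent=$(git rev-list --count --first-parent HEAD)
# The merge is "semi-linear" if its first parent is an ancestor of the merged tip
if git merge-base --is-ancestor HEAD^1 HEAD^2; then semi_linear=yes; else semi_linear=no; fi

echo "all commits: $all, first-parent commits: $first_parent, semi-linear: $semi_linear"
git log --first-parent --format=%s
```

The first-parent log shows only the merge and the commits made directly on master, while a plain git log still shows every feature commit.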

Alper
5

Here is a pro/con list for rebasing or squashing commits when merging.

Pro:

  • Easy to read chronological history
  • No extraneous commits in history
  • Splitting branches/PRs into smaller chunks, for easier review and testing, is easier without merge commits
  • Being able to rebase feature branches easily (e.g. to change merge order)
  • It encourages meaningful commit messages (but does not enforce it)
  • When doing a release, one can more easily look at the history and use it to write a change-log, for use by testers, consumers etc.
  • The commit-by-commit rebase encourages PRs with fewer/smaller commits

Con:

  • It's more complicated, and fewer people have experience with it
  • Can lead to problems if used on remote branches without care, for example lost commits or inconsistent history across local environments
  • It's an extra thing to maintain (Does x project really need this?)
  • More time consuming if you rebase a branch with many conflicting commits

Personal note

I mentioned squashing before merge as well because it has a similar effect in terms of history.

Also, GitHub and most other services have options to rebase-merge and squash-merge right there in the GUI, making it practically a free action for PRs in some cases.

Vargr
  • 3
    "if you rebase a branch with many conflicting commits" - if you know/see there are too many conflicts, abort the rebase, squash the commits, and then rebase the single resulting commit. This way the total number of conflicts equals the number of conflicts a plain merge would produce – Vlad Bokov Jun 08 '21 at 22:12
  • The idea is the conflict resolution is compacted into one commit instead of multiple, it is not omitted. :) – Vargr Aug 13 '21 at 07:45
4

With linear history, you can easily track history for a single file across renames with git log --follow. Quoting the documentation on the log.follow config option:

If true, git log will act as if the --follow option was used when a single <path> is given. This has the same limitations as --follow, i.e. it cannot be used to follow multiple files and does not work well on non-linear history.
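A minimal sketch of what --follow does on a linear history, using a throwaway repository with made-up file names. It renames a file and counts how many commits git log reports for the new path with and without the option.

```shell
#!/bin/sh
# Sketch: git log with and without --follow across a rename.
# File names and contents are hypothetical.
set -e
repo=$(mktemp -d)   # throwaway repository, left in place for inspection
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

printf 'line one\nline two\nline three\n' > old.txt
git add old.txt
git commit -q -m "add old.txt"

git mv old.txt new.txt
git commit -q -m "rename old.txt to new.txt"

# Without --follow, history for the path stops at the rename
without_follow=$(git log --format=%h -- new.txt | wc -l | tr -d ' ')
# With --follow, git log continues through the rename to the original file
with_follow=$(git log --follow --format=%h -- new.txt | wc -l | tr -d ' ')

echo "commits without --follow: $without_follow"
echo "commits with --follow:    $with_follow"
```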

Eugene Yarmash
  • Since renamed files are pretty much an interface thing and not really core to git (that is, git doesn't track the underlying file in any way, really), relying on linear history for rename-tracking is a waste of the power of git. But it is one advantage, I suppose. – D. Ben Knoble Aug 13 '19 at 12:24
1

Two more advantages of linear/semi-linear history that weren't mentioned in the other answers:

  1. No code changes are hidden inside merge commits as conflict resolution. Each diff belongs to a single commit, with its message explaining the reason behind the change.
  2. Conflicts are conceptually easier to resolve during a rebase than during a merge, as the resolution takes place in the "context" of a single commit (as opposed to the entire branch). This reduces potential conflict resolution mistakes.

Let me explain this by example. Assume the following non-linear history:

*   f17ba26 (HEAD -> master) Merge branch 'topic/feature'
|\
| * 1234bbb (topic/feature) adds feature foo
* | 1234aaa blah
|/
* 03f4f8d previous commit

Consider the evolution of a line of code that was changed by both commits 1234aaa and 1234bbb, with the resulting conflict resolved in the merge commit:

Commit   Line contents
03f4f8d  print("Nothing is supported")
1234aaa  print("We now support blah!")
1234bbb  print("We now support foo!")
f17ba26  print("We support plenty of features now!")

In such a scenario, the logic of why this line assumes its final state is hidden inside the conflict resolution in the merge commit. The merge commit might make it very difficult to follow why the final version of the code was chosen. Also, the number of conflicts in the merge commit might be large, making it even more difficult to grasp and review.

If the same code had evolved through a rebase, the history would be semi-linear, like this:

*   f17ba26 (HEAD -> master) Merge branch 'topic/feature'
|\
| * 1234bbb (topic/feature) adds feature foo
|/
* 1234aaa blah
* 03f4f8d previous commit

The conflict would have been resolved as part of the rebase, in the context of applying the 1234bbb commit. The developer resolving the conflict would have had the opportunity to revise the specific change in light of the already existing change from 1234aaa, and to document the reasoning behind the decision about the final code in the commit message.
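The semi-linear case can be reproduced in a throwaway repository. This is a sketch under assumptions: the file name app.py is invented, the commit hashes will differ from the ones shown above, and the conflict is resolved by simply overwriting the file. It checks that the resolved line ends up in the rebased "adds feature foo" commit and that the merge commit itself carries no changes of its own.

```shell
#!/bin/sh
# Sketch: conflict resolved during rebase, followed by a --no-ff merge.
# The file name app.py is hypothetical; hashes will differ from the example.
set -e
repo=$(mktemp -d)   # throwaway repository, left in place for inspection
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo 'print("Nothing is supported")' > app.py
git add app.py
git commit -q -m "previous commit"

git checkout -q -b topic/feature
echo 'print("We now support foo!")' > app.py
git commit -q -am "adds feature foo"

git checkout -q master
echo 'print("We now support blah!")' > app.py
git commit -q -am "blah"

# Rebase stops on the conflict; resolve it in the context of the single commit
git checkout -q topic/feature
git rebase master >/dev/null 2>&1 || true
echo 'print("We support plenty of features now!")' > app.py
git add app.py
GIT_EDITOR=true git rebase --continue >/dev/null

git checkout -q master
git merge -q --no-ff --no-edit topic/feature

# The merge commit's tree is identical to the rebased branch tip
if git diff --quiet HEAD^2 HEAD; then merge_diff=clean; else merge_diff=dirty; fi
resolved_in=$(git log -1 --format=%s HEAD^2)
final_line=$(git show HEAD:app.py)

echo "merge commit adds changes of its own: $merge_diff"
echo "resolution lives in commit: $resolved_in"
git log --graph --oneline
```

In a real workflow the developer would typically also edit the commit message during the rebase to explain the resolution; the sketch keeps the original message to stay short.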

vvv444