1

As much as I've tried to avoid it, I have to maintain three master-equivalent branches with minor changes in each one. I've been reading about git and using it for a couple of years, so I'm familiar with the following conventional wisdom:

  1. Don't merge unless it is meaningful. Use rebase instead.
  2. If you just need to pull in one commit, cherry-pick is your friend.
  3. Always pull --rebase on the remote before pushing.
  4. Keep features in line with rebase.

What they don't tell you is that this conventional wisdom creates a bunch of commits that are nearly identical, except for minor details that change their SHA1s. Why does this matter? It defeats the purpose of comparing branches with `git log --left-right --graph --cherry-pick --oneline branch1..branch2` and `git show-branch`, which is especially annoying. It feels like an abuse of cheap branching, and it makes it nearly impossible to see which patches are missing from each branch.

So, how do you keep multiple branches in line with identical SHA1s so these tools can be used? Are patches the best way to do this?

jrhorn424

3 Answers

2

You cannot keep commits with identical SHAs in multiple places because the "place" of a commit (its location in history) is identified by the SHA.

A Git commit contains certain data, among them a snapshot of the current repo state, the current time and date, the committer name and email, and the SHA of the parent commit(s).

That means that when you put a commit somewhere else (rebase, cherry-pick, patch+apply) or change its time (commit --amend), the SHA will be different.

This is a significant feature of Git (Git history is a so-called Merkle tree).
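To make this concrete, here is a minimal sketch in a throwaway repository. The branch names (`master`, `master-foo`), the file names, and the `demo` identity are assumptions made up for the illustration. It copies one commit with `cherry-pick` and then compares both the commit SHAs and the patch IDs of the original and the copy.

```shell
#!/bin/sh
# Sketch: a cherry-picked commit carries the same change but gets a new SHA.
set -e

# Hypothetical identity so the script does not depend on global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used below

echo base > file.txt
git add file.txt
git commit -q -m "base"

# master-foo carries one branch-specific commit.
git checkout -q -b master-foo
echo foo > foo.txt
git add foo.txt
git commit -q -m "foo-only change"

# A fix lands on master ...
git checkout -q master
echo fix > fix.txt
git add fix.txt
git commit -q -m "shared fix"
orig_sha=$(git rev-parse HEAD)

# ... and is copied onto master-foo, where it has a different parent.
git checkout -q master-foo
git cherry-pick "$orig_sha" >/dev/null
copy_sha=$(git rev-parse HEAD)

# The patch ID is computed from the diff only, not from parents or dates.
orig_patch=$(git show "$orig_sha" | git patch-id | cut -d' ' -f1)
copy_patch=$(git show "$copy_sha" | git patch-id | cut -d' ' -f1)

echo "commit SHAs: $orig_sha vs $copy_sha"
echo "patch IDs:   $orig_patch vs $copy_patch"
```

The two commit SHAs should differ because the parent differs, while the two patch IDs should match because the diff is the same.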

As you have noticed, you can maintain (content-)equivalent commits by using the various history-rewrite commands I mentioned above.


Depending on your situation, you might be able to use the commits which are in master without having to copy them. If those changes that make your "master-equivalent" branches different are purely additive (i.e. they occur in commits which can always be appended to the latest commit in master), then you can just always rebase your other branches onto master and be happy, because then they simply contain all of master's commits exactly as they are.
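A rough sketch of that additive case, again in a throwaway repository with made-up branch and file names (`master`, `master-foo`, `foo.txt`, `fix.txt`) and a hypothetical `demo` identity. After the rebase, `master-foo` should contain master's commits unchanged, with only its own commit on top.

```shell
#!/bin/sh
# Sketch: keep master-foo as "master plus extra commits" by rebasing it.
set -e

# Hypothetical identity so the script does not depend on global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used below

echo base > file.txt
git add file.txt
git commit -q -m "base"

# The branch-specific change lives only on master-foo.
git checkout -q -b master-foo
echo foo > foo.txt
git add foo.txt
git commit -q -m "foo-only change"

# General work continues on master.
git checkout -q master
echo fix > fix.txt
git add fix.txt
git commit -q -m "shared fix"
master_sha=$(git rev-parse master)

# Replay master-foo's own commits on top of the new master tip.
git checkout -q master-foo
git rebase -q master

# master's tip is now an ancestor of master-foo, with its SHA untouched.
if git merge-base --is-ancestor master master-foo; then
  contained=yes
else
  contained=no
fi

# Commits that exist on only one side of the comparison.
only_in_foo=$(git rev-list --count master..master-foo)
missing_from_foo=$(git rev-list --count master-foo..master)

echo "master contained in master-foo: $contained"
echo "only in master-foo: $only_in_foo, missing from master-foo: $missing_from_foo"
```

Note that the rebase still rewrites master-foo's own commits, so this only suits branches whose differentiating commits you are willing to republish.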

If, on the other hand, your changes are of a kind that must be inserted earlier in history (i.e. new commits to master must be appended after your differentiating changes), then you will not be able to keep a "nice" history like that.

I can't spontaneously think of an example of the second situation, but it's theoretically possible (be it actually required by the content, or required by external factors like project specifications).

Nevik Rehnel
  • Thanks, this is actually the way I should have been keeping master-equivalents in line the whole time. History is clean, comparisons work as expected, DAG looks nice. For those finding this in the future, I'll just note that for me, `master` was general and `master-foo` was specific. So after testing changes in `master`, commit and then switch to `master-foo` and rebase. That plops `master-foo`'s commits on top. It's helpful also to push by doing `git rebase -i @{u} && git push origin HEAD` to avoid overwriting remote history. – jrhorn424 Oct 07 '13 at 22:41
  • Actually, I'm not sure `git rebase -i @{u} && git push origin HEAD` is the way to go. Something in my workflow is reordering my commits, and that might be it. Anyway, @nevik's advice is solid. – jrhorn424 Oct 07 '13 at 22:56
  • If you want to rebase all of the commits in `master-foo`, you don't need an interactive rebase (you can use it to check which commits will be rebased, though); and if rebasing onto `@{u}` works, `git push` will be sufficient because your branch is set up to track its upstream branch (**unless** your `push.default` setting tells Git to push *all* tracking branches) – Nevik Rehnel Oct 08 '13 at 06:30
1

I don't think it is conventional wisdom to not merge unless it is meaningful. If you want to keep history with identical SHA1 hashes, merge, and avoid cherry-picking.

tom
0

I think some of the premises of the question are wrong.

Git uses a content-addressable database because (among other things) it wants to protect the integrity of the version history. And every little piece of commit metadata counts towards the SHA1, hence why it changes every time.

On the other hand… Git is distributed and lets you work with distributed workflows. Like sending patches via email. [1] And patches aren’t commits (snapshots). So you end up with a two-level system:

  1. Published history
  2. In-flight changes (email patches, that patch stack that you’ve been maintaining for two years because the upstream hasn’t accepted it yet, …)

And you deal with that by (respectively):

  1. You compare branches, tags, etc. by looking at their commits; you should have the same commits, not almost-identical ones
    • A corollary is that you merge things rather than cherry-pick or rebase when working within the published history
  2. You use git cherry and other commands in order to compare commits using their patch id (basically a diff normalization; compare the changes that commits introduce)

So these are two ways of working with history (stable and in-flight, respectively). Using merges and, overall, treating changes as immutable is far more ergonomic for stable, published history. On the other hand, treating changes as “patches” is simply necessary for in-flight history in certain workflows. But, in my opinion, using in-flight history tools on stable history will just make your life harder for very little benefit. You might save yourself from having to do a few extra merges, but in the end you can live with having those merges in your history.
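For the in-flight case, a small sketch of a patch-id comparison with `git cherry`. The repository, branch names (`master`, `master-foo`), file names, and `demo` identity are all invented for the example. One of master's two fixes is cherry-picked onto `master-foo`, and `git cherry` is then asked which of master's commits still have no equivalent there.

```shell
#!/bin/sh
# Sketch: compare branches by patch ID instead of by commit SHA.
set -e

# Hypothetical identity so the script does not depend on global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used below

echo base > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b master-foo
echo foo > foo.txt
git add foo.txt
git commit -q -m "foo-only change"

# Two fixes land on master; only the first is copied to master-foo.
git checkout -q master
echo fix > fix.txt
git add fix.txt
git commit -q -m "shared fix"
shared_sha=$(git rev-parse HEAD)
echo other > other.txt
git add other.txt
git commit -q -m "another fix"

git checkout -q master-foo
git cherry-pick "$shared_sha" >/dev/null

# git cherry <upstream> <head>: commits in <head>, marked "-" when
# <upstream> already has an equivalent patch and "+" when it does not.
git cherry -v master-foo master
already_applied=$(git cherry master-foo master | grep -c '^-' || true)
still_missing=$(git cherry master-foo master | grep -c '^+' || true)
echo "already applied: $already_applied, still missing: $still_missing"

# The log filter from the question, using the symmetric difference (three
# dots), which is the form that --cherry-pick filtering applies to.
git log --left-right --cherry-pick --oneline master...master-foo
```

Here "shared fix" should be marked `-` and "another fix" `+`, even though no SHA on `master-foo` matches anything on `master` past the base commit.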

Notes

  1. We could also go into rebase workflows but it’s easier to pick an example which necessitates using things like git cherry since you can’t do a git merge on patch emails
Guildenstern