
Suppose I want to create a brand-new branch that's an exact copy of an existing one. (If it matters, later I'll explain my particular motivation.)

The first time I needed to do this, I did it "by hand":

git checkout oldbranch
git checkout -b newbranch

Later I discovered the "official" way:

git branch -c oldbranch newbranch

But I just discovered a pretty serious problem: when I use branch -c and oldbranch is a remote-tracking branch, newbranch is not really a brand-new branch, in that it inherits oldbranch's upstream. This almost caused me some serious problems just now, and I spent a fair amount of time trying to untangle it.

So what's the right way to do this? Should I be using my original, "by hand" method? Or should I remember, any time I use branch -c, to then use git branch --unset-upstream (I think that's right) to remove the upstream tracking?

The context this comes up in is that I had to rebase a branch, but I didn't really want to rebase the branch; instead I wanted to rebase a copy of the branch. I wanted to keep the old, un-rebased branch around, partly on general (i.e. packratty) principles, partly because the un-rebased branch already had an upstream, which I obviously didn't want to upset.

When I went to push the rebased copy to my upstream, I expected git to complain that there was no upstream branch, and to remind me to do the --set-upstream thing. But instead it complained that things were out of sync, and that's when I discovered that the new copy retained the original branch's upstream.

Steve Summit

3 Answers


TL;DR

I suspect you may want to set branch.autoSetupMerge to false, i.e., git config --global branch.autoSetupMerge false. (The connection here is obvious, no?) At the very least, though, you want to stop using --copy aka -c.

Long

Suppose I want to create a brand-new branch that's an exact copy of an existing one. (If it matters, later I'll explain my particular motivation.)

The motivation does matter somewhat, but perhaps not as much as you think:

using git branch -c, [when] oldbranch [is] a remote-tracking branch, newbranch is not really a brand-new branch, in that it inherits oldbranch's upstream ...

This isn't quite right. Instead, all the branch creation methods can "set an upstream" on the new branch. Whether and when they do set an upstream depends on numerous options, which makes describing this tricky. In this particular case, when oldbranch is a remote-tracking name (my term for it: see below), the default is that oldbranch becomes the upstream. That is, we don't find oldbranch's upstream—it does not have one; only branches have upstreams—but instead it is the upstream.
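As a quick check of that default, here is a minimal sketch in a throwaway sandbox. The temporary paths, the branch names from-remote and from-local, and the upstream_of helper are all made up for illustration; the sketch assumes branch.autoSetupMerge has not been changed from its default.

```shell
# Throwaway sandbox: a bare "remote" and a working repository (all names are illustrative).
tmp=$(mktemp -d)
git init -q --bare "$tmp/origin.git"
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/main     # pin the branch name regardless of init.defaultBranch
git remote add origin "$tmp/origin.git"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "initial"
git push -q origin main                   # creates the remote-tracking name origin/main

# Hypothetical helper: print a branch's upstream, or nothing if it has none.
upstream_of() { git for-each-ref --format='%(upstream:short)' "refs/heads/$1"; }

git branch -q from-remote origin/main     # start point is a remote-tracking name
git branch -q from-local main             # start point is a local branch name

echo "from-remote upstream: '$(upstream_of from-remote)'"
echo "from-local  upstream: '$(upstream_of from-local)'"
```

With default settings, only from-remote should report an upstream, and that upstream is the remote-tracking name itself.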

To properly explain this, let me start with why I hate the official Git name for names like origin/main. Git calls these remote-tracking branch names. This poor word branch now has about 6 different meanings (before breakfast?), one of them being "remote-tracking name". But we don't need it to mean that at all. If we just use the phrase remote-tracking name, omitting the word branch entirely, we have a noun phrase for names like origin/main that's not ambiguous and doesn't call up false associations.

"What false associations?" you might wonder, and here's where we get into the essential difference between a branch name and a remote-tracking name. Both locate one specific commit. Both are useful for finding multiple commits "as seen on some branch" (somewhere, in some Git repository, perhaps using the word branch in yet another way), although a raw commit hash is just as good for that in most senses. (The one sense in which it isn't "just as good" is ... well, try typing in raw commit hashes all day. They're just hard for humans to get right. I cut and paste.) But:

  • You can "get on a branch". With git switch or git checkout, giving them a branch name will—if the operation is successful—put you "on the branch". In this attached-HEAD mode, making a new commit will stuff the new commit's hash ID into the branch name.

  • A branch can have an upstream set. The upstream is actually a two-part entity, consisting of a remote name like origin, and a branch name as seen on that remote, such as refs/heads/main. Fortunately git branch --set-upstream-to=origin/branch branch and git rev-parse branch@{upstream} let us ignore this two-part business, which largely dates back to the time when "remotes" were first being invented.

  • A branch can have "rebase mode" set for git pull. That is, when on this particular branch, git pull means git pull --rebase. (This is separate from the global setting.)

  • A branch name lives in refs/heads/: that is, the full name of main is refs/heads/main. A remote-tracking name lives in the refs/remotes/ namespace.

All of these do show up at various times, with varying frequency. In particular git switch requires --detach when used with a remote-tracking name; git checkout implies --detach when used with a remote-tracking name; in both cases this puts us in "detached HEAD" mode, so that we're on no branch at all.
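The namespace and detached-HEAD points can be checked with a short sketch like the one below. It assumes a Git new enough to have git switch; the sandbox paths and the variable names are invented for the example.

```shell
# Throwaway sandbox: a bare "remote" and a working repository (all names are illustrative).
tmp=$(mktemp -d)
git init -q --bare "$tmp/origin.git"
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/main     # pin the branch name regardless of init.defaultBranch
git remote add origin "$tmp/origin.git"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "initial"
git push -q origin main                   # creates the remote-tracking name origin/main

# Full names: branch names and remote-tracking names live in different namespaces.
full_local=$(git rev-parse --symbolic-full-name main)
full_remote=$(git rev-parse --symbolic-full-name origin/main)
echo "$full_local"
echo "$full_remote"

# git switch refuses a remote-tracking name unless --detach is given.
if git switch origin/main 2>/dev/null; then plain_switch=accepted; else plain_switch=refused; fi
echo "git switch origin/main: $plain_switch"

# git checkout accepts it, but leaves HEAD detached (on no branch at all).
git checkout -q origin/main
if git symbolic-ref -q HEAD >/dev/null; then head_state=attached; else head_state=detached; fi
echo "after git checkout origin/main: HEAD is $head_state"

git switch -q main                        # get back on a branch
```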

Branch creation options

Creating a new branch, in Git, really consists of two steps (assuming we've already determined that the name is valid and not in use), but there are third and fourth optional steps:

  1. First, we must locate some commit. We need its raw hash ID. Any valid, existing hash ID will do if it's a commit hash ID: tree, tag, and blob hash IDs are forbidden.

  2. Then we just need to create a new ref whose spelling is refs/heads/name.

  3. Optional: We may request that Git set the upstream of this new branch to some name. That name can be a branch name or a remote-tracking name.

  4. Optional: We can even copy more items.
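To make steps 1 through 3 concrete, here is a rough sketch that performs them by hand with lower-level commands. It only illustrates what the steps amount to; it is not a claim about how git branch is implemented internally. The branch name manual, the sandbox paths, and the upstream_of helper are made up for the example.

```shell
# Throwaway sandbox: a bare "remote" and a working repository (all names are illustrative).
tmp=$(mktemp -d)
git init -q --bare "$tmp/origin.git"
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/main     # pin the branch name regardless of init.defaultBranch
git remote add origin "$tmp/origin.git"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "initial"
git push -q origin main                   # creates the remote-tracking name origin/main

# Hypothetical helper: print a branch's upstream, or nothing if it has none.
upstream_of() { git for-each-ref --format='%(upstream:short)' "refs/heads/$1"; }

# Step 1: locate a commit and get its raw hash ID.
hash=$(git rev-parse --verify "origin/main^{commit}")

# Step 2: create a new ref spelled refs/heads/<name>.
git update-ref refs/heads/manual "$hash"

# Step 3 (optional): record the two-part upstream, a remote plus a branch name on that remote.
git config branch.manual.remote origin
git config branch.manual.merge refs/heads/main

echo "manual points at $(git rev-parse manual), upstream '$(upstream_of manual)'"
```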

The --track or -t option, given to git branch, git checkout -b, or git switch -c, tells Git that it should definitely do step 3. This requires that we also supply a starting point (though it's not an argument passed to the -t option); the starting point provides the hash ID for step 1 and the name for step 3.

(Alas, this gets more complicated starting with Git version 2.35. Since I'm working through history, let's start with the much older history before we add the new thing.)

The --no-track option, given to any of these commands, tells Git that it should definitely not do step 3. We can now provide a starting-point, safe in the knowledge that step 3 won't happen.

If we use neither --track nor --no-track, the default is that Git will do step 3 if and only if (a) we provide a starting point and (b) the starting point we provide is a remote-tracking name.

Using git config, however, we can alter two Git settings: branch.autoSetupMerge and/or branch.autoSetupRebase. With branch.autoSetupMerge set to always, step 3 will happen even if we use a local branch name. That is, step 3 is avoided only if we use a raw hash ID or something else unsuitable (or, of course, use an explicit --no-track). Or, we can set it to false: then step 3 never happens. The default (which we can also set) is true, which selects the "if it's a remote-tracking name" mode.
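The three branch.autoSetupMerge modes can be compared side by side with a sketch like this one. It uses git -c to apply each setting to a single command instead of changing any configuration file; the branch names, sandbox paths, and the upstream_of helper are invented for illustration.

```shell
# Throwaway sandbox: a bare "remote" and a working repository (all names are illustrative).
tmp=$(mktemp -d)
git init -q --bare "$tmp/origin.git"
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/main     # pin the branch name regardless of init.defaultBranch
git remote add origin "$tmp/origin.git"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "initial"
git push -q origin main                   # creates the remote-tracking name origin/main

# Hypothetical helper: print a branch's upstream, or nothing if it has none.
upstream_of() { git for-each-ref --format='%(upstream:short)' "refs/heads/$1"; }

# true (the default): an upstream is set only for a remote-tracking start point.
git -c branch.autoSetupMerge=true   branch -q true-remote   origin/main
git -c branch.autoSetupMerge=true   branch -q true-local    main
# always: a local branch name as the start point becomes the upstream too.
git -c branch.autoSetupMerge=always branch -q always-local  main
# false: no upstream is ever set automatically.
git -c branch.autoSetupMerge=false  branch -q false-remote  origin/main
# --no-track: a per-command override, whatever the setting is.
git branch -q --no-track no-track-remote origin/main

for b in true-remote true-local always-local false-remote no-track-remote; do
    echo "$b: '$(upstream_of "$b")'"
done
```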

Once we've set branch.autoSetupMerge as desired, we can set branch.autoSetupRebase. This sets whether git pull should mean git pull --rebase, and as before, it has multiple modes: never, local, remote, and always; see the git config documentation for further details. (The more interesting thing for me is how this interacts with the new pull.ff setting, if it's set to something other than the default never.)

Once you've digested all of this, it's worth mentioning that git switch -t has another function. Suppose you have a remote, such as origin, and it has produced a slew of remote-tracking names in your repository. The git switch and git checkout commands have a --guess option (default = on, including whenever your Git is old enough to lack this as a separate option). With this option enabled, git checkout name or git switch name will, by default, first check to see whether name exists, and if so attempt to switch to it. But if not, before complaining that there is no such branch name, the command will search through your remote-tracking names. If there's exactly one "obvious match"—for instance, if you asked to switch to the nonexistent branch dev and there's one origin/dev—then --guess means create dev from origin/dev. The usual "tracking" (set or don't set an upstream) rules apply, per branch.autoSetupMerge.

But if you have two remotes—say, gh1 and gh2 for two different but related GitHub repositories—you might have a gh1/dev and a gh2/dev both. Then git switch --guess dev doesn't know which one to use. Using git switch -t gh1/dev will create your dev from your gh1/dev (your Git's memory of gh1's dev). Of course, the upstream-setup is forced on here; git switch --no-track gh1/dev will pull the same trick but force upstream-setting off.

Before we go on, let's make a few last observations:

  • The extra argument to git branch or git checkout -b or git switch -c, e.g., git branch newbr startpoint, provides the initial hash ID to put in the new branch name. That is, startpoint is parsed, as if by git rev-parse, for its hash ID. But it's also parsed to see if it's a branch or remote-tracking name for the branch.autoSetupMerge purposes.

    If we give Git the string startpoint^{} or startpoint^{commit}, the resulting hash ID is that of the same commit we'd get by default, but the string no longer matches a branch or remote-tracking name, because of the suffix. So this automatically defeats the autoSetupMerge setting. It can be used as a one-off.

  • Besides the upstream setting, a branch name can have the rebase setting, so there are actually four steps to creating a new branch, with two of them optional (optionally set an upstream, and optionally set the rebase flag).

  • Besides the upstream setting, a branch name has a reflog. The reflog contains a history of hash IDs that were stored in the branch name. (Use git reflog main or git reflog master to dump the reflog for your main or master branch, to see these.) The "zeroth" entry is the current value.

    Reflogs can be disabled (though you still have an automatic @{0}), but are on by default in non-bare repositories. So you probably have reflogs for all your branch names. Reflogs also exist for HEAD itself, and you can have a reflog for every reference. The core.logAllRefUpdates setting is what controls whether new reflogs are created as needed; see the git config documentation.

    Besides the upstream and reflog, every branch can have arbitrary additional settings. There aren't any in Git now but there could be in the future. For instance, you can run git config branch.main.abc def to set branch.main.abc = def: it doesn't mean anything, but you can set it.

    The -c option to git branch is the copy flag. It also tells git branch that you're creating a new branch, of course, as it makes no sense to copy things otherwise. But "create new branch" is the default action for git branch, if some other action isn't set. Adding -c or --copy means copy the reflog and all other settings (even the ones Git doesn't know about!). This will copy the upstream setting when "copying" from a local branch, since it's, well, a setting.
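Two of the observations above, the ^{commit} one-off and what -c copies, can be tried out with a sketch along these lines. The branch names one-off, tracked, and duplicate, the sandbox paths, and the upstream_of helper are made up; the abc = def setting is the same meaningless example used above.

```shell
# Throwaway sandbox: a bare "remote" and a working repository (all names are illustrative).
tmp=$(mktemp -d)
git init -q --bare "$tmp/origin.git"
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/main     # pin the branch name regardless of init.defaultBranch
git remote add origin "$tmp/origin.git"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "initial"
git push -q origin main                   # creates the remote-tracking name origin/main

# Hypothetical helper: print a branch's upstream, or nothing if it has none.
upstream_of() { git for-each-ref --format='%(upstream:short)' "refs/heads/$1"; }

# The suffix yields the same commit, but the string is no longer a plain ref name,
# so the automatic upstream setup does not kick in.
git branch -q one-off 'origin/main^{commit}'

# A branch with an upstream plus an arbitrary extra setting...
git branch -q tracked origin/main
git config branch.tracked.abc def

# ...and a copy of it made with -c, which carries the settings along.
git branch -c tracked duplicate

echo "one-off upstream:   '$(upstream_of one-off)'"
echo "duplicate upstream: '$(upstream_of duplicate)'"
echo "duplicate abc:      '$(git config --get branch.duplicate.abc)'"
```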

Now we can also describe the new --track flags in Git 2.35: --track=direct and --track=inherit. The -t option means --track=direct. When branch.autoSetupMerge has its default value, we only get an upstream set by default when we create a new branch using a remote-tracking name. The remote-tracking name itself is the new branch's upstream. But if we set branch.autoSetupMerge to always, we'll get an upstream set with git branch newbr foo as well as with git branch newbr origin/foo. Some people disliked the fact that the upstream for newbr is now the (local) branch foo. They wanted git branch to read foo's upstream, and set newbr's upstream to foo's upstream.

This is what git branch --track=inherit does. You must spell out --track=inherit exactly this way. Note that this is also what git branch --copy (aka git branch -c) does; it's just that -c does a bunch more stuff along the way (copying reflogs plus all settings).
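The difference between the two --track modes can be sketched as follows. The last git branch command needs Git 2.35 or later and simply fails on older versions; the branch names topic, direct-child, and inherited-child, the sandbox paths, and the upstream_of helper are invented for the example.

```shell
# Throwaway sandbox: a bare "remote" and a working repository (all names are illustrative).
tmp=$(mktemp -d)
git init -q --bare "$tmp/origin.git"
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/main     # pin the branch name regardless of init.defaultBranch
git remote add origin "$tmp/origin.git"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "initial"
git push -q origin main                   # creates the remote-tracking name origin/main

# Hypothetical helper: print a branch's upstream, or nothing if it has none.
upstream_of() { git for-each-ref --format='%(upstream:short)' "refs/heads/$1"; }

git branch -q topic origin/main           # topic's upstream is origin/main

# --track (the same as --track=direct in 2.35+): the start point itself becomes the upstream.
git branch -q --track direct-child topic

# --track=inherit (Git 2.35+ only): the start point's upstream is reused.
git branch -q --track=inherit inherited-child topic

echo "direct-child upstream:    '$(upstream_of direct-child)'"
echo "inherited-child upstream: '$(upstream_of inherited-child)'"
```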

Rebase-and-keep

The context this comes up in is that I had to rebase a branch, but I didn't really want to rebase the branch; instead I wanted to rebase a copy of the branch. I wanted to keep the old, un-rebased branch around, partly on general (i.e. packratty) principles, partly because the un-rebased branch already had an upstream, which I obviously didn't want to upset.

I do this a lot myself. In general, though, I keep only the current version upstream (or no version upstreamed), with all the old versions just in my own repository:

git switch somebranch        # get on it before rebasing
git branch -m somebranch.0   # rename it to somebranch.0
git switch -c somebranch     # make the new one using HEAD, no upstream
git rebase ...

Since I always use the local name (and git checkout -b or git switch -c) I never wind up with an upstream set, even with the default settings. The next time I rebase, I rename somebranch to somebranch.1, and so on.
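Here is a small sandbox sketch of that rename-then-recreate sequence, stopping just before the rebase. It assumes somebranch starts out with an upstream, as in the question; the sandbox paths and the upstream_of helper are made up for illustration.

```shell
# Throwaway sandbox: a bare "remote" and a working repository (all names are illustrative).
tmp=$(mktemp -d)
git init -q --bare "$tmp/origin.git"
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/main     # pin the branch name regardless of init.defaultBranch
git remote add origin "$tmp/origin.git"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "initial"
git push -q origin main                   # creates the remote-tracking name origin/main

# Hypothetical helper: print a branch's upstream, or nothing if it has none.
upstream_of() { git for-each-ref --format='%(upstream:short)' "refs/heads/$1"; }

git branch -q somebranch origin/main      # the existing branch, with an upstream

git switch -q somebranch                  # get on it before rebasing
git branch -m somebranch.0                # rename it; its settings move with the name
git switch -q -c somebranch               # new branch from HEAD, no upstream

echo "somebranch.0 upstream: '$(upstream_of somebranch.0)'"
echo "somebranch upstream:   '$(upstream_of somebranch)'"
```

Both names point at the same commit at this point; only the renamed branch keeps the upstream.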

When I went to push the rebased copy to my upstream, I expected git to complain that there was no upstream branch ...

As a nice side effect, when I rename branches like this, any existing upstream setting sticks to the old (but now renamed to .0, .1, etc) branch, which for me means I can't git push it because I have push.default set to simple: the name no longer matches on both sides. Since I create the new branch from the existing branch, it has no upstream set and I can't git push it either.

I could just rely on the reflogs: if I did not rename anything at all, the reflog for somebranch would have in it the values that wind up in somebranch.0, somebranch.1, and so on. But reflog entries reflect something automatic, rather than some deliberate decision I made. If I'm making substantive changes, I may choose a new name for the branch in the first place.

torek
  • Wow. A lot to digest there. Thanks. But once again I am stunned at how a seemingly-simple question can have such a complicated answer. And we may have had a miscommunication: When I said "if `oldbranch` was a remote-tracking branch", I guess I meant "if `oldbranch` *has* a remote-tracking branch". So when you said "when `oldbranch` is a *remote-tracking name*, the *default* is that `oldbranch` becomes the upstream", were you referring to what I said, or what I meant? Because I thought I meant that `oldbranch` was a true (local) branch – Steve Summit Jul 09 '22 at 11:40
  • Aha! I never use `git branch -c` (aka `--copy`) and the documentation fails to mention that it *also copies the other settings* (not just the reflog). This is definitely a documentation bug. – torek Jul 09 '22 at 16:49
  • ... and, the bug was fixed (or at least addressed) recently, and I missed the update: the documentation now says *Copy a branch, together with its **config and** reflog* (emphasis mine). – torek Jul 09 '22 at 17:10

So what's the right way to do this?

The right way is

git checkout -b newbranch oldbranch

or

git switch -c newbranch oldbranch

The -c option to git branch is not the official way to create a new branch that starts with the same "content state" as an existing branch; its intent is to copy the branch's metadata (its config and reflog) as well.

Instead of git checkout -b or git switch -c you can also use git branch, but without the -c option:

git branch newbranch oldbranch

The main reason to use checkout or switch instead is that they land you on the new branch, which is typically what you want.

(off-topic maybe, I'm curious: did you mean to use git switch -c the whole time, when you mentioned the official way?)

Tao

Suppose I want to create a brand-new branch that's an exact copy of an existing one.

But you don't!

If you fix this misunderstanding first, everything becomes much clearer.

What you're trying to do isn't called copying in the first place. It's called branching. A branch created from another branch (or any arbitrary commit) is just a pointer to that commit. The new branch starts out with exactly the same history as the old one, which is all you seem to want, but they can then diverge independently.

A branch is: a ref name, the ref to a particular commit, optionally an upstream, and maybe some other branch-specific config. If you don't want to copy all of that, then copying a branch is not what you're trying to do.

Just to be super-clear, the manual says:

-c, --copy
    Copy a branch, together with its config and reflog.

where config means everything related to that branch in .git/config, such as:

[branch "oldbranch"]
        remote = origin
        merge = refs/heads/oldbranch

So what do you actually want? Obviously not a literal copy including an exact copy of the upstream.

Just run

git checkout -b newbranch oldbranch
Useless