git merge multiple commits into one in an orphan branch each commit in a prefix subdirectory

Question

I need to merge more than one commit, each from a branch or a remote repo, into a single commit in another branch.

input branch#1: o--o- - -o   (C1)
                          \
input branch#2: o--o- - -o | (C2)
    :                     \|
input branch#N: o--o- - -o | (Cn)
                          \|
 output branch:   o--o- - -o (Cm)

I need to do it in a special way, where the source tree of each input commit becomes a prefix (subdirectory) in the source tree of the output merge commit:

<C1>       <C2>       ...  <Cn>
 |          |               |
 +- c1.txt  +- c2.txt       +- cn.txt

<Cm>
 |
 +- C1/c1.txt
 |
 +- C2/c2.txt
 |
 :     :
 |
 +- Cn/cn.txt

Additionally, I need to change some parameters of the merge commit, such as the author date and author email, and generate the commit message from the commit messages of all input branches, while leaving the parents of the merge commit as is (including the list of parent commit hashes in the merge commit).

Searching the internet, I have already found what seems to be the most universal solution with a minimal set of commands:

git merge --allow-unrelated-histories --no-edit --no-commit -s ours <input-branches-and-commits>
git read-tree --prefix=C1/ <C1-branch>
git read-tree --prefix=C2/ <C2-branch>
:
git read-tree --prefix=Cn/ <Cn-branch>
cat ... | git commit --no-edit --allow-empty --author="..." --date="..." -F -

But it works differently when the output branch is an orphan branch. In that case the content of an input branch is additionally merged into the root of the source tree of the output branch commit:

<Cm>
 |
 +- C1/c1.txt
 |
 +- c1.txt

So far it happens when there is only one input branch (I haven't tested multiple input branches with an orphan output branch because I haven't had that case yet, but I don't rule it out).

I have found the reason why that happens. Because HEAD does not exist yet (and cannot, since the output branch has no commits), the merge command creates it on the call and leaves the output branch pointing at the input branch commit, so the input commit effectively becomes the parent of the output commit. This silently brings the content of the input branch's source tree into the root of the source tree of the output branch commit.
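
The behavior described above can be reproduced in a throwaway repository. The following is only a sketch under assumptions: the branch names `in1` and `out`, the file `c1.txt` and the identity are made up for illustration, and the input commit is built with plumbing commands purely so that HEAD stays unborn before the merge.

```shell
#!/bin/sh
# Sketch: reproduce the unborn-branch behavior in a throwaway repository.
# All names (in1, out, c1.txt, the identity) are made up for illustration.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# Build the input commit with plumbing so that HEAD stays unborn.
blob=$(echo "content 1" | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644 "$blob" c1.txt
in_tree=$(git write-tree)
c1=$(echo "add c1.txt" | git commit-tree "$in_tree")
git update-ref refs/heads/in1 "$c1"

# Point HEAD at an output branch that has no commits yet, with an empty index.
git read-tree --empty
git symbolic-ref HEAD refs/heads/out

# The recipe from the question, with a single input branch.
git merge --allow-unrelated-histories --no-edit --no-commit -s ours in1
git read-tree --prefix=C1/ in1
git commit -q --allow-empty -m "merge in1 under C1/"

# Show the new commit with its parents, then the files it contains.
git rev-list --parents -n 1 HEAD
git ls-tree -r --name-only HEAD
```

If the description holds, the listing contains both `C1/c1.txt` and the unwanted root-level `c1.txt`, and the input commit is the only parent of the new commit.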

I know at least one way to avoid that behavior: create an empty commit in the output branch before the merge, which makes the branch no longer an orphan and initializes HEAD together with the reference to the output branch.

But I don't want to do that, because I would have to remove that commit later somehow, which is really just a workaround for git.

Is there a well-known way to deal with the git internals so that everything works and merges together as expected?

torek · Accepted Answer · 2020-01-15T21:23:54.670

If you're going to use git read-tree to fill the index for the commit you're building—and yes, this is the easy way to add a prefix to each, just as you are doing—you are already deep in the innards of Git, so you might as well use git commit-tree to build the commit object.

In other words, don't start with git merge at all. Just empty out the index with git read-tree --empty. Then read each commit C_i, 1 ≤ i ≤ n. Your index now contains the files you intend to put into this merge commit C_m.

Then, instead of git commit, use git write-tree to turn the index into a tree object, followed by git commit-tree to embed the tree object in a new commit. Since git commit-tree allows you to specify each parent, you can make your N-way octopus merge directly:

git read-tree --empty
git read-tree --prefix prefix1 C1
git read-tree --prefix prefix2 C2
...
git read-tree --prefix prefixn Cn

tree=$(git write-tree) || die ...
commit=$(cat ... | git commit-tree "$tree" -p C1 -p C2 -p C3 ... -p Cn) || die ...

Last, attach a new branch name to the resulting commit:

git branch the-final-result $commit

and you have your commit C_m on this new branch.
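
To make the steps above concrete, here is a minimal end-to-end sketch in a throwaway repository. Several things in it are assumptions rather than part of the original recipe: the helper `make_input`, the branch names `in1` and `in2`, the prefixes `C1/` and `C2/`, and the author values are all hypothetical. The author fields are set through the standard `GIT_AUTHOR_*` environment variables, and the message is assembled from the subject lines of the input commits as one possible way to fill in the elided `cat ...`.

```shell
#!/bin/sh
# Sketch: build an N-way merge commit with plumbing only (here N = 2).
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: create an unrelated one-file root commit on branch $1.
make_input() {
    blob=$(echo "content of $2" | git hash-object -w --stdin)
    git read-tree --empty
    git update-index --add --cacheinfo 100644 "$blob" "$2"
    in_tree=$(git write-tree)
    in_commit=$(echo "add $2" | git commit-tree "$in_tree")
    git update-ref "refs/heads/$1" "$in_commit"
}
make_input in1 c1.txt
make_input in2 c2.txt
c1=$(git rev-parse in1)
c2=$(git rev-parse in2)

# Fill an empty index with each input tree under its own prefix.
git read-tree --empty
git read-tree --prefix=C1/ "$c1"
git read-tree --prefix=C2/ "$c2"
tree=$(git write-tree)

# Message: one subject line per input commit (illustrative choice).
msg=$(for c in "$c1" "$c2"; do git log -1 --format='%s' "$c"; done)

# Author metadata comes from the environment; the values are examples.
commit=$(printf '%s\n' "$msg" |
    GIT_AUTHOR_NAME="Merge Bot" \
    GIT_AUTHOR_EMAIL="bot@example.com" \
    GIT_AUTHOR_DATE="2020-01-15 12:00:00 +0000" \
    git commit-tree "$tree" -p "$c1" -p "$c2")

git branch the-final-result "$commit"
git ls-tree -r --name-only the-final-result
```

Because nothing here touches HEAD or the working tree, it makes no difference whether the current branch has any commits yet.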

Edit: apparently I misread the question a bit, and you already have one existing branch name B whose tip commit is currently commit C_B (written as Cm in the commands below). If you want to preserve its files, read this commit's tree first instead of using git read-tree --empty, then use that commit as one of the parents in the final git commit-tree, and simply fast-forward the existing branch name B to the new commit. So:

git read-tree Cm
git read-tree --prefix prefix1 C1
  .
  .
  .
git read-tree --prefix prefixn Cn

tree=$(git write-tree) || die ...
commit=$(cat ... | git commit-tree "$tree" -p Cm -p C1 -p C2 ... -p Cn) || die ...
git push . $commit:refs/heads/B  # or git branch -f B $commit

Adjust per actual desired result.
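
As a rough illustration of the existing-branch variant, the sketch below starts from the current tree of a branch `B`, adds two inputs under prefixes, and moves `B` forward only after checking that the move is a fast-forward. The helper `make_input`, the branch and file names, and the commit message are hypothetical. `B` is deliberately not the checked-out branch here, so `git branch -f` is allowed to move it.

```shell
#!/bin/sh
# Sketch: extend an existing branch B with prefixed input trees.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: create an unrelated one-file root commit on branch $1.
make_input() {
    blob=$(echo "content of $2" | git hash-object -w --stdin)
    git read-tree --empty
    git update-index --add --cacheinfo 100644 "$blob" "$2"
    in_tree=$(git write-tree)
    in_commit=$(echo "add $2" | git commit-tree "$in_tree")
    git update-ref "refs/heads/$1" "$in_commit"
}
make_input B b.txt      # existing output branch with its own content
make_input in1 c1.txt
make_input in2 c2.txt

old=$(git rev-parse B)  # current tip of B, the first parent of the new commit
c1=$(git rev-parse in1)
c2=$(git rev-parse in2)

# Start from B's tree instead of an empty index, then add the inputs.
git read-tree "$old"
git read-tree --prefix=C1/ "$c1"
git read-tree --prefix=C2/ "$c2"
tree=$(git write-tree)

commit=$(echo "merge in1 and in2 into B" |
    git commit-tree "$tree" -p "$old" -p "$c1" -p "$c2")

# Only move B if the old tip is an ancestor of the new commit (fast-forward).
git merge-base --is-ancestor "$old" "$commit"
git branch -f B "$commit"
git ls-tree -r --name-only B
```

Listing the old tip as the first parent is what keeps the update a fast-forward, so the existing history of `B` stays reachable from the new commit.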

This approach has at least one problem: `commit-tree` should include `-p Cm-1` (the previous commit hash), otherwise there will be no relation to the `Cm` branch and the push would be rejected. This means that `read-tree Cm-1` must be applied too if the `Cm` branch has commits other than the merged ones. - Andry, Jan 15 '20 at 18:33
Ah: I let your question title override your question body. You asked for the result to be an orphan branch, and the only sensible meaning I could get out of that is that the new commit is pointed-to by a new branch name, not related to any of the input commit branch names. But what you actually wanted was for the new commit to be the new tip of one branch that *is* one of the C-sub-i's. - torek, Jan 15 '20 at 21:16
(1) I wanted a merge of `C1..Cn` that does not work as expected *in the case* of an orphan branch, but that does not mean the `Cm` branch *is orphan* at every merge. (2) The `Cm` branch is not necessarily merged *completely* from `C1..Cn`; it can still contain commits not from `C1..Cn`. I just didn't mention this because I had used the `merge` command, which takes that into account. - Andry, Jan 16 '20 at 06:33
One more important clarification: `git read-tree` cannot merge two or more source trees into a single directory or subdirectory, so the source trees of `C1..Cn` and `Cm` should not overlap, otherwise only the last `git read-tree` would be applied. - Andry, Jan 16 '20 at 06:52
@Andry: right, that's why independent `--prefix=` settings are required here. You *can* merge trees that overlap, but you need to use different `read-tree` arguments for this case, and cannot use `--prefix`. - torek, Jan 16 '20 at 06:57
