
Given a commit with notes attached, can I take the message in the note and merge it into the commit's message when I do a rebase?

The background of this question is that I have a large repository, imported via git-tfs, with a huge number of notes linked to individual commits. While cleaning up the repository, I would like to resolve the notes and merge them into the corresponding commits' messages.

lanoxx

1 Answer


Not currently, no. There's no fundamental reason it could not be done, but the support for notes-on-commits is still kind of primitive. Rebase can now copy a note when it copies a commit, but it does not offer the ability to squash notes when using squash.

If you want to implement this yourself, you should start with how git notes actually work, which I'll describe here rather briefly. Git stores an auxiliary commit under the ref name refs/notes/commits. This commit contains a series of files with weird-looking names, such as:

5d/01/301f2b865aa8dba1654d3f447ce9d21db0b5

for example. If you take the slashes out of the file name, the result is a hash ID. The contents of this file are the note for commit 5d01301f2b865aa8dba1654d3f447ce9d21db0b5.
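The mapping between a note's file name and the commit it annotates can be sketched in a few lines of Python. The function names here are hypothetical helpers, not part of Git, and `fanout` simply means the number of two-character directory levels in the path:

```python
HEX_DIGITS = set("0123456789abcdef")

def note_path_to_hash(path):
    """Strip the slashes from a notes-tree path to recover the commit hash ID."""
    sha = path.replace("/", "")
    # 40 hex digits for SHA-1 repositories, 64 for SHA-256 ones.
    if len(sha) not in (40, 64) or not set(sha) <= HEX_DIGITS:
        raise ValueError(f"not a notes path: {path!r}")
    return sha

def fanout_of(path):
    """Number of two-character directory levels used by this path."""
    return path.count("/")

def hash_to_note_path(sha, fanout):
    """Split the first `fanout` byte pairs of `sha` off into directories."""
    parts = [sha[2 * i:2 * i + 2] for i in range(fanout)]
    parts.append(sha[2 * fanout:])
    return "/".join(parts)

path = "5d/01/301f2b865aa8dba1654d3f447ce9d21db0b5"
sha = note_path_to_hash(path)
print(sha)
print(hash_to_note_path(sha, fanout_of(path)) == path)
```

With a fan-out of 0 the path is just the bare hash ID, which is the layout a small notes tree typically uses.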

To update notes, or add new notes, Git will:

  • extract the notes commit to a temporary area;
  • create or update the file corresponding to the desired commit;
  • use the resulting set of files to make a new commit, whose parent is the current refs/notes/commits commit; and
  • update refs/notes/commits to store the hash ID of the new commit.

In effect, this is a lot like checking out the refs/notes/commits commit as a branch, except that you can't do that since refs/notes/commits is not a branch name. (Also, Git uses a bunch of shortcuts here to avoid populating a full working tree.)
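Here is a rough sketch of that update cycle, driving Git's plumbing commands from Python with a temporary index instead of a working tree. The `git` and `set_note` helpers are my own names, not anything Git provides. The sketch reuses the existing file name when the commit already has a note and otherwise falls back to a flat (no fan-out) name, which assumes, as I understand the notes code, that Git will read a tree with mixed layouts. Treat it as an illustration of the steps, not as what `git notes` does internally:

```python
import os
import subprocess
import tempfile

NOTES_REF = "refs/notes/commits"

def git(repo, *args, stdin=None, env=None):
    """Run one git command inside `repo` and return its stripped stdout."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        input=stdin, env=env, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()

def set_note(repo, commit, text):
    """Create or replace the note for `commit` (hypothetical helper)."""
    sha = git(repo, "rev-parse", "--verify", commit + "^{commit}")
    try:
        parent = git(repo, "rev-parse", "--verify", "--quiet", NOTES_REF)
    except subprocess.CalledProcessError:
        parent = None  # no notes commit exists yet

    path = sha  # assumption: flat layout for a brand-new note
    with tempfile.TemporaryDirectory() as tmp:
        env = dict(os.environ, GIT_INDEX_FILE=os.path.join(tmp, "index"))
        if parent:
            # "Extract" the notes commit into the temporary index only.
            git(repo, "read-tree", parent, env=env)
            for existing in git(repo, "ls-tree", "-r", "--name-only", parent).splitlines():
                if existing.replace("/", "") == sha:
                    path = existing  # keep whatever fan-out is already in use
        # Create or update the file corresponding to the commit.
        blob = git(repo, "hash-object", "-w", "--stdin", stdin=text)
        git(repo, "update-index", "--add", "--cacheinfo", f"100644,{blob},{path}", env=env)
        # Make a new notes commit whose parent is the current one.
        tree = git(repo, "write-tree", env=env)
        args = ["commit-tree", tree, "-m", "Notes updated by sketch"]
        if parent:
            args += ["-p", parent]
        new_commit = git(repo, *args)
    # Point refs/notes/commits at the new commit.
    git(repo, "update-ref", NOTES_REF, new_commit)
    return new_commit
```

Calling `set_note(".", "HEAD", "some text\n")` in a repository should leave a note that `git notes show HEAD` prints back. In practice `git notes add -f` does the same job and takes care of the fan-out for you.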

Hence, to merge the notes of commits C1 and C2 that have been squashed into new commit C3, you would:

  • memorize the three hash IDs;
  • check out or otherwise inspect the desired notes commit (this might be the previous one at this point, depending on where you do all this work) to find out how many layers of xx/xx/xx/xx to use. This varies: the notes code adds another layer whenever the tree gets "too crowded" at any given level. The actual level is implicit in the file names, so if you never split or join the names you don't have to worry about it;
  • locate the files corresponding to C1 and C2, and combine them with whatever algorithm you like (probably union-merge);
  • read the notes for C3 if needed (if you were working on C3's parent above) into a temporary index, or make sure you have C3's notes in this temporary index in any case;
  • compute the number of layers to use for a C3 note, and create or update the appropriate file in the temporary index; and
  • use the temporary index to make a new commit whose parent is the refs/notes/commits commit, then update refs/notes/commits to store the new commit's hash ID.
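As a minimal sketch of the combining step, here is a simple line-based union merge in Python, plus a shortcut that writes the result with the `git notes` porcelain (`git notes show` and `git notes add -f -F -`) rather than building the tree by hand. That shortcut is a swapped-in technique: it lets Git pick the fan-out and make the notes commit. `union_merge`, `read_note` and `squash_notes` are hypothetical names, and the de-duplication rule (drop a non-blank line that has already appeared) is just one possible choice of algorithm:

```python
import subprocess

def union_merge(*notes):
    """Join notes in order, dropping non-blank lines that were already seen."""
    seen, merged = set(), []
    for note in notes:
        for line in note.splitlines():
            if line and line in seen:
                continue
            seen.add(line)
            merged.append(line)
    return "\n".join(merged) + "\n" if merged else ""

def read_note(repo, commit):
    """Return the note text for `commit`, or '' if it has no note."""
    result = subprocess.run(
        ["git", "-C", repo, "notes", "show", commit],
        capture_output=True, text=True,
    )
    return result.stdout if result.returncode == 0 else ""

def squash_notes(repo, c1, c2, c3):
    """Fold the notes of c1 and c2 (and any existing note on c3) into c3."""
    merged = union_merge(read_note(repo, c1), read_note(repo, c2), read_note(repo, c3))
    if merged:
        # -f overwrites an existing note; "-F -" reads the text from stdin.
        subprocess.run(
            ["git", "-C", repo, "notes", "add", "-f", "-F", "-", c3],
            input=merged, check=True, capture_output=True, text=True,
        )
    return merged
```

Note that C1 and C2 must still be addressable by hash ID after the rebase, which is why you record the three hash IDs first.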

You've now updated the notes for commit C3, so that when git log goes to show commit C3, if it's also showing notes, it will read the for-C3 file in your new commit.

torek
  • Thank you for this extensive explanation. Do you think it could be possible to automate these steps during a `git filter-repo` run to identify each commit that has a note attached and then fold that note into the respective commit? – lanoxx Mar 14 '22 at 12:45
  • @lanoxx: Given that filter-repo is written in Python and you therefore have the source code, it's just a [Small Matter Of Programming](https://en.wikipedia.org/wiki/Small_matter_of_programming). Seriously, how hard that would be to do, I have no idea. It doesn't sound *trivial* but Python would give you nice data structures (dictionaries and lists). – torek Mar 15 '22 at 08:18