I am trying to construct a git log command that retrieves a file's history (following copies and renames) with the following properties:
1. I want the log to be "closed under" annotation-following. That is, if I git blame -elfwM the file at any commit in the history (with whatever name the file had at that time), I want every commit that appears in the annotations to also be included in the history.
2. I want to know the original name of the file for every entry in the history. For a commit that did not rename the file this will be the same as the filename; for a commit that did, I want to know the original filename in each parent of that commit.
3. For any given annotation of the file (at any point in its history), I want the corresponding history entry for that commit (which we know exists from property 1) to have its author, date, and filename match the author, date, and filename of the annotation.
4. Satisfying (1), I want as few additional commits as possible. I especially want to exclude commits that did not affect the file in any way.
So far, the best I have been able to do is:
    git log --raw --follow -m --pretty="%H%n%P%n%aL%n%cs%n%s" -- FILENAME
Here's my thinking behind this line:
* --follow should do the job of following renames & copies (but does not give me the filenames).
* --pretty=... should give me commit, parents, author, commit date, and subject. I am guessing that original author + commit date are what git blame uses, but if that's wrong please correct me.
* --raw should give me original filename and new filename for a given commit.
* -m should split out entries for merge commits so that I can get the original name for individual parents.
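To make the intended output shape concrete, here is a minimal Python sketch of how I would read the result of that command into records. It assumes the five --pretty fields arrive one per line in the order given, that the parents line is present but empty for a root commit, and that each raw line separates the status and paths with tabs (falling back to spaces, as in pasted output). The names LogEntry, RawChange, parse_raw_line, and parse_log are my own illustration, not anything git provides, and paths containing whitespace would need -z handling that this sketch does not attempt.

```python
from dataclasses import dataclass, field


@dataclass
class RawChange:
    status: str    # e.g. "M", "A", "D", "R089"
    old_path: str  # name in the parent
    new_path: str  # name in this commit (same as old_path unless renamed/copied)


@dataclass
class LogEntry:
    commit: str
    parents: list
    author: str
    date: str
    subject: str
    changes: list = field(default_factory=list)


def parse_raw_line(line):
    """Parse one ':mode mode sha sha STATUS<TAB>path[<TAB>path]' line."""
    # Drop the leading ':' and skip the two modes and two abbreviated hashes.
    rest = line[1:].split(None, 4)[4]
    parts = rest.split("\t") if "\t" in rest else rest.split()
    return RawChange(parts[0], parts[1], parts[-1])


def parse_log(text):
    """Split log output into entries: 5 header lines, then raw lines."""
    lines = text.splitlines()
    entries = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():  # blank separator before a header
            i += 1
            continue
        if len(lines) - i < 5:    # truncated trailing record
            break
        commit, parents, author, date, subject = lines[i:i + 5]
        entry = LogEntry(commit, parents.split(), author, date, subject)
        i += 5
        # Collect the raw lines that follow, skipping blank lines.
        while i < len(lines) and (not lines[i].strip() or lines[i].startswith(":")):
            if lines[i].startswith(":"):
                entry.changes.append(parse_raw_line(lines[i]))
            i += 1
        entries.append(entry)
    return entries
```

With -m, a merge that touched the file relative to both parents would show up as two entries with the same commit hash, one per parent, which is what property 2 needs.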
This seems to work okay in typical cases but I've written a script that demonstrates a scenario where this fails. Here's an example output from one of its runs:
Created git repository at /var/folders/y4/2t2n3dhj4bz4cwsrm801t_bm0000gn/T/tmp.Rtj55RWb
Committed: cbc8198fd5eb975ab5fc1fcc66889872429a40fe (master) Initial commit
Committed: 5628acbb478a8786eaec186bf4e6050142049848 (workbench) Renamed foo.txt to bar.txt
Committed: 2f6d49b3aa35ffa2953a65e21ba5c21d130fa3b1 (workbench) Modified line 3 of bar.txt.
Committed: bac0a739a1fd2acc7c0ce466d9055942cbf87ccb (workbench) Added dummy.txt
Committed: c018ca1b4436b73237c9a727ed2353cbd8152928 (workbench) Removed dummy.txt
Committed: 18bec91206357ec23ffe53a01d20ec64f7667e4e (master) Renamed foo.txt to baz.txt
Committed: a724f0cc49f39d9e99b8794b9f263efb5bc51da1 (master) Modified line 8 of baz.txt.
Committed: 4628679d5695c9a5fb080124b854f336fdf683d1 (master) Added dummy.txt
Committed: 58f700c48a16496dfd540126dab5e55952847993 (master) Removed dummy.txt
Committed: 633056c27cf0e3afb9529d478cf51ed0bdaa918e (master) Merged bar.txt and baz.txt as foo-merged.txt
----------------------------------------
633056c27cf0e3afb9529d478cf51ed0bdaa918e
58f700c48a16496dfd540126dab5e55952847993 c018ca1b4436b73237c9a727ed2353cbd8152928
john.doe
2021-04-10
Merged bar.txt and baz.txt as foo-merged.txt
:100644 100644 27393f0 bb5a6e5 R089 baz.txt foo-merged.txt
a724f0cc49f39d9e99b8794b9f263efb5bc51da1
18bec91206357ec23ffe53a01d20ec64f7667e4e
john.doe
2021-04-10
Modified line 8 of baz.txt.
:100644 100644 4f9956e 27393f0 M baz.txt
18bec91206357ec23ffe53a01d20ec64f7667e4e
cbc8198fd5eb975ab5fc1fcc66889872429a40fe
john.doe
2021-04-10
Renamed foo.txt to baz.txt
:100644 100644 4f9956e 4f9956e R100 foo.txt baz.txt
5628acbb478a8786eaec186bf4e6050142049848
cbc8198fd5eb975ab5fc1fcc66889872429a40fe
john.doe
2021-04-10
Renamed foo.txt to bar.txt
:100644 000000 4f9956e 0000000 D foo.txt
cbc8198fd5eb975ab5fc1fcc66889872429a40fe
john.doe
2021-04-10
Initial commit
:000000 100644 0000000 4f9956e A foo.txt
----------------------------------------
cbc8198fd5eb975ab5fc1fcc66889872429a40fe foo.txt (<john.doe@gmail.com> 2021-04-10 1) This is line 1
cbc8198fd5eb975ab5fc1fcc66889872429a40fe foo.txt (<john.doe@gmail.com> 2021-04-10 2) This is line 2
2f6d49b3aa35ffa2953a65e21ba5c21d130fa3b1 bar.txt (<john.doe@gmail.com> 2021-04-10 3) Modified bar
cbc8198fd5eb975ab5fc1fcc66889872429a40fe foo.txt (<john.doe@gmail.com> 2021-04-10 4) This is line 4
cbc8198fd5eb975ab5fc1fcc66889872429a40fe foo.txt (<john.doe@gmail.com> 2021-04-10 5) This is line 5
cbc8198fd5eb975ab5fc1fcc66889872429a40fe foo.txt (<john.doe@gmail.com> 2021-04-10 6) This is line 6
cbc8198fd5eb975ab5fc1fcc66889872429a40fe foo.txt (<john.doe@gmail.com> 2021-04-10 7) This is line 7
a724f0cc49f39d9e99b8794b9f263efb5bc51da1 baz.txt (<john.doe@gmail.com> 2021-04-10 8) Modified baz
cbc8198fd5eb975ab5fc1fcc66889872429a40fe foo.txt (<john.doe@gmail.com> 2021-04-10 9) This is line 9
cbc8198fd5eb975ab5fc1fcc66889872429a40fe foo.txt (<john.doe@gmail.com> 2021-04-10 10) This is line 10
The third line shows bar.txt was modified in 2f6d49b3aa35ffa2953a65e21ba5c21d130fa3b1. Unfortunately, this commit does not appear in the history - I would have expected it to have appeared due to --follow, and I also would have expected a second entry for 633056c27cf0e3afb9529d478cf51ed0bdaa918e due to -m because it has two parents (and I would have expected that entry to have a raw line like :100644 100644 27393f0 bb5a6e5 R089 bar.txt foo-merged.txt).
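The failure of property 1 can be checked mechanically. The sketch below is only an illustration of that check, assuming blame output in the shape shown above (full hash first, then the filename, as produced by -l and -f) and filenames without spaces; blamed_commits and missing_from_history are hypothetical helper names of mine.

```python
def blamed_commits(blame_text):
    """Map each commit hash in the blame output to the filenames it is blamed under."""
    commits = {}
    for line in blame_text.splitlines():
        if not line.strip():
            continue
        commit, filename = line.split(None, 2)[:2]
        # Boundary commits are prefixed with '^' in blame output; strip it.
        commits.setdefault(commit.lstrip("^"), set()).add(filename)
    return commits


def missing_from_history(blame_text, history_commits):
    """Return blamed commits (with their filenames) that the history does not contain."""
    known = set(history_commits)
    return {
        commit: names
        for commit, names in blamed_commits(blame_text).items()
        if commit not in known
    }
```

Feeding it the annotation above together with the five commit hashes from the log output reports only 2f6d49b3aa35ffa2953a65e21ba5c21d130fa3b1 (under bar.txt) as missing.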
Note: If fetching this information is achievable but not in a single git log command, a solution that uses a constant number of commands would also work. I want to avoid doing things like recursively examining annotations to derive the history, and that particular strategy would also fail if there's a commit in which every line changed, e.g., due to reformatting.