
The setup is as follows. We had a number of pushes to the repository with commit messages that looked like this:

[System] Updated the logs
[System] CSV file updated
etc.

Now I would like those commits to be tied to a specific git account so that I can distinguish those changes from my own commits. The "System" has been committing with the same author name and email as mine.

I would like to search the commit messages and, for every commit that has [System] in its message, update the author name to System. How can I do that?

I would like the solution to be a console command that I can just run, without using the --interactive option to change each commit manually. It can use git, bash, or Python; any solution will do.

There are some commands that could help like:

git show :/[System] - which shows the last git message matching the string

git-filter-branch

and some others, but I am not sure how to do that in a script-like (or any) form.

Aleks

2 Answers

0

You can use git commit --amend to change the author of a commit.

While rebasing interactively, mark the commit with

edit

then use the command:

git commit --amend --author="Author Name <email@address.com>"

then use:

git rebase --continue

You can specify a commit hash in the rebase command to modify a particular commit.

slal
  • Do you have an example that specifies the git commit? Also, I don't follow the first part about using `edit`. Where do I use edit? – Aleks May 07 '17 at 19:12
  • Here are the complete steps: run `git rebase origin/master -i`. It will open an editor with all your commits listed. Put `edit` instead of `pick` in front of the commits you want to edit. The rebase will then halt at each of them, and you run `git commit --amend --author="Author Name <email@address.com>"` to edit that commit, followed by `git rebase --continue`. Make sure you try the rebasing on a fork first, because rebasing rewrites history. – slal May 07 '17 at 19:24
  • Thanks for the detailed explanation, but I wanted to do this without the `-i` (`interactive`) flag. I was wondering if there is a solution in that case. I will edit the question to make this clearer. – Aleks May 07 '17 at 19:32
0

The answer is the same, with all the same caveats, as in the duplicate that sajib khan noted. You just change the test. Look at the accepted answer there, which contains these parts:

git filter-branch --env-filter '
OLD_EMAIL="your-old-email@example.com"
CORRECT_NAME="Your Correct Name"
CORRECT_EMAIL="your-correct-email@example.com"
if [ "$GIT_COMMITTER_EMAIL" = "$OLD_EMAIL" ]

Note that this first step tests the original commit's "committer email" string. You wish to test the original commit's commit message, possibly along with the original commit's author and/or committer.

if <some-test>

The tricky part is the test itself, since the original commit's message is not part of the environment.

If your test passes, you want to change both the author and the committer, so this is not quite right:

then
    export GIT_COMMITTER_NAME="$CORRECT_NAME"
    export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
fi
if [ "$GIT_AUTHOR_EMAIL" = "$OLD_EMAIL" ]
then
    export GIT_AUTHOR_NAME="$CORRECT_NAME"
    export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
fi

but you can simply remove the second test and override both author and committer:

then
    export GIT_COMMITTER_NAME="$CORRECT_NAME"
    export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
    export GIT_AUTHOR_NAME="$CORRECT_NAME"
    export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
fi

Finally, the last bit of the command will remain the same, e.g.:

' --tag-name-filter cat -- --branches --tags

Hence, the problem boils down to the <test>. Here you must make some decisions:

  • Do you want to change (the copy of) any commit that has the literal string [System] anywhere within it? Or, do you want to change only commits that begin with [System]?

  • Do you want to change these copies only when the original commit's author and/or committer name and/or email match your own? Or do you want to change them regardless of the original author and/or committer? What if the author matches your name but the committer does not, or the committer matches your name but the author does not?

  • Would you like to compute the matching-ness "on the fly" in the filter, or would you prefer to pre-compute all the commit hash IDs, then apply the change to those particular commits?

When thinking about your answers, remember that the way git filter-branch operates is that it copies every commit that you direct it to (--branches means "every commit reachable from a branch name"). As it copies each such commit, it applies each of your filters. There are many; see the git filter-branch documentation for the list.

In effect, the filter-branch code extracts the original commit, makes any requested changes by applying each filter in the appropriate order, then makes a new commit from the result. If the new commit, including the new commit's parent hash ID, is bit-for-bit identical to the original commit, the new commit is the original commit. However, as soon as one commit somewhere in a chain gets modified in any way, the new commit gets a new, different hash ID. This forces all subsequent commits to have at least one thing different, namely their parent commit hash, so once there is at least one change, the change "ripples through" the remaining commits.

The result is a new repository that is no longer compatible with the original repository. All clones of the original have to be cast aside and replaced with new clones of the new repository, with its new hash IDs.

If you choose to answer the third question, "would you like to pre-compute hash IDs to change", with "yes"—this allows you to be much more certain what your filter will do, and that it will not touch any unwanted commits—then you will have to re-compute the pre-computed hash IDs if you need to change things again after a previous change, since the hash IDs will change each time you make any changes. If you choose instead "no, I will test dynamically", this becomes less of a problem, but your test code has to be correct or you will modify the copies of commits you do not intend to modify. You can, of course, carefully inspect the new, modified repository to make sure it is correct. If not, instead of casting aside the originals in favor of the copy, cast the copy aside and start the filtering over again with an improved test.

Let us now consider the <test>. If you have prepared a complete list of all the commit IDs you wish to change, the test will be: "is the to-be-copied commit's hash one of the hashes in the list?" The to-be-copied commit's hash is available, in all filter-branch filters, as $GIT_COMMIT:

Filters

The filters are applied in the order as listed below. The <command> argument is always evaluated in the shell context using the eval command (with the notable exception of the commit filter, for technical reasons). Prior to that, the $GIT_COMMIT environment variable will be set to contain the id of the commit being rewritten. Also, GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, GIT_AUTHOR_DATE, GIT_COMMITTER_NAME, GIT_COMMITTER_EMAIL, and GIT_COMMITTER_DATE are taken from the current commit and exported to the environment, in order to affect the author and committer identities of the replacement commit created by git-commit-tree(1) after the filters have run.

So if your commit IDs to affect are in a file, one per line, you could use grep to see if $GIT_COMMIT matches one of those lines:

if grep $GIT_COMMIT /tmp/list-of-commits

(as a side effect, if grep matches one of those commits, the commit's hash will be printed on standard output, which you will see during the filtering).
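A minimal sketch of the pre-computed-list approach, assuming the commits of interest are those whose subject line starts with [System]. The helper names select_system_hashes and commit_is_listed are hypothetical, introduced only for illustration. Adding -q to grep suppresses the side-effect output mentioned above, -x requires the whole line to match, and -F treats the hash as fixed text.

```shell
#!/bin/sh
# Hypothetical helper: reads "<hash> <subject>" lines on stdin and prints the
# hashes whose subject starts with the literal string "[System]".
select_system_hashes() {
    awk '$2 ~ /^\[System\]/ { print $1 }'
}

# Hypothetical helper: succeeds (exit status 0) when hash $1 is a line of file $2.
commit_is_listed() {
    grep -q -x -F "$1" "$2"
}

# Intended use inside a repository (not executed here):
#   git log --branches --tags --format='%H %s' | select_system_hashes > /tmp/list-of-commits
# Shell functions from the calling shell are not visible inside the filter,
# so the env-filter itself would spell the test out:
#   if grep -q -x -F "$GIT_COMMIT" /tmp/list-of-commits
```

Inspecting /tmp/list-of-commits by hand before filtering is the point of this variant: the file shows exactly which commits will be touched.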

If you wish to choose commits dynamically, it's a bit harder. You must extract the commit log message given the commit's hash. You can do this with git log:

git log --no-walk --pretty=format:%B $GIT_COMMIT

to get the entire message, or:

git log --no-walk --pretty=format:%s $GIT_COMMIT

to get just the subject line. You can then feed that through a matching program, such as grep again, to decide whether the message contains, or begins with, a string.

Since grep does regular expressions and square brackets are regular-expression characters, you might want to use fgrep (fixed-string grep) instead. Using fgrep means it is impossible to verify that the open square bracket occurs at the beginning of the subject line, though, if you wish to do that.
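To make the difference concrete, here is a small sketch that needs no repository. The helper name matches is hypothetical; grep -F is the POSIX spelling of fgrep. Unescaped, [System] is a bracket expression that matches any single one of the characters S, y, s, t, e, m, so it also matches messages that have nothing to do with the System commits.

```shell
#!/bin/sh
# Hypothetical helper: matches <text> <grep options and pattern...>
# Prints "yes" when grep finds the pattern in <text>, "no" otherwise.
matches() {
    text="$1"; shift
    if printf '%s\n' "$text" | grep -q "$@"; then echo yes; else echo no; fi
}

matches "Fix my typo" "[System]"                    # yes: "m", "y" and "t" are in the bracket set
matches "Fix my typo" -F "[System]"                 # no: the literal text [System] is absent
matches "[System] CSV file updated" -F "[System]"   # yes: literal match
matches "Revert [System] change" "^\[System]"       # no: not at the start of the line
matches "[System] CSV file updated" "^\[System]"    # yes: escaped bracket, anchored at the start
```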

Hence, some ways to detect [System] within or at the start of the log message are:

if git log --no-walk --pretty=format:%B $GIT_COMMIT | fgrep "[System]"

or:

if git log --no-walk --pretty=format:%s $GIT_COMMIT | grep "^\[System]"

Remember, in sh and bash, if ... simply runs the command after the word if, then looks at its exit status. An exit status of zero means "yes": the test succeeds. A nonzero exit status means "no": the test fails. The test:

if [ "$GIT_COMMITTER_EMAIL" = "$OLD_EMAIL" ]

is just running the [ program, with arguments:

$GIT_COMMITTER_EMAIL     (spaces retained as a single argument)
=
$OLD_EMAIL               (spaces likewise protected by double quote)
]

The /bin/[ program, also known as /bin/test, looks for the close ] as the last argument, and removes it. (If invoked as test, it does not look for the close ].) It then performs the test prescribed by the remaining arguments. In this case, the arguments are two strings with an = between them, so test tests whether the two strings are equal.
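The exit-status behaviour described above can be observed directly. This is only an illustrative sketch: status_of is a hypothetical helper that runs a command the same way if does and prints the status that if would act on, and the email addresses are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: run the given command as an `if` condition and print
# its exit status (0 means the test succeeded, nonzero means it failed).
status_of() {
    if "$@"; then echo 0; else echo "$?"; fi
}

status_of [ "a@example.com" = "a@example.com" ]      # prints 0: the strings are equal
status_of test "a@example.com" = "b@example.com"     # prints 1: the strings differ

# grep follows the same convention, so it can stand in for [ ... ] after `if`.
printf '%s\n' "[System] CSV file updated" | status_of grep -q -F "[System]"   # prints 0: found
printf '%s\n' "Manual fix" | status_of grep -q -F "[System]"                  # prints 1: not found
```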

We simply replace the test with grep (to see if $GIT_COMMIT is in a file of commit IDs) or with git log ... | grep (to see if the git log output contains a string). The exit status of grep tells us whether the string we are looking for was found, or not.
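Putting the pieces together, the dynamic variant might look like the sketch below. It assumes that only commits whose subject starts with [System] should change and that both author and committer are overridden. The function name rewrite_system_commits and the variables SYSTEM_NAME and SYSTEM_EMAIL are placeholders, not part of the linked answer; the two values are passed through the environment so the filter can read them.

```shell
#!/bin/sh
# Sketch only: this rewrites history, so try it on a scratch clone.
# Usage: rewrite_system_commits "<new name>" "<new email>"
rewrite_system_commits() {
    SYSTEM_NAME="$1" SYSTEM_EMAIL="$2" \
    git filter-branch --env-filter '
# Test the subject line of the commit being copied; -q keeps grep silent.
if git log --no-walk --pretty=format:%s "$GIT_COMMIT" | grep -q "^\[System]"
then
    export GIT_COMMITTER_NAME="$SYSTEM_NAME"
    export GIT_COMMITTER_EMAIL="$SYSTEM_EMAIL"
    export GIT_AUTHOR_NAME="$SYSTEM_NAME"
    export GIT_AUTHOR_EMAIL="$SYSTEM_EMAIL"
fi
' --tag-name-filter cat -- --branches --tags
}

# Example call with placeholder identity values (not executed here):
#   rewrite_system_commits "System" "system@example.com"
```

Swapping the if line for the grep-against-a-file test gives the pre-computed variant; the rest of the command stays the same.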

If you use a prepared file of commit hashes, you can try many different methods of collecting the hash IDs of the commits you want changed, without a lot of pain, as you will be working in a much more familiar (and probably richer) programming environment than the one available inside the eval-ed filters of git filter-branch.

One more caveat

This probably should be noted in the other question's answer, but any time you are going to run git filter-branch, it's a good idea to run it on a new, fresh clone of the repository. That way, you can inspect the new commits carefully and see whether your filters did what you expected them to do. If not, you can throw away the filtered clone and try again, since the only thing you have spent on it is the time and disk space needed to clone and filter.
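A rough sketch of that workflow follows; make_scratch_clone and show_identities are hypothetical helper names. Note that a plain git clone creates a local branch only for the default branch, so with --branches any other branch would first have to be checked out in the scratch clone.

```shell
#!/bin/sh
# Hypothetical helper: clone repository $1 into a new directory $2 for a trial run.
make_scratch_clone() {
    git clone --quiet "$1" "$2"
}

# Hypothetical helper: list abbreviated hash, author, committer and subject of
# every commit on the local branches of repository $1, for review after filtering.
show_identities() {
    git -C "$1" log --branches --format='%h | %an <%ae> | %cn <%ce> | %s'
}

# Intended use (paths are placeholders, not executed here):
#   make_scratch_clone /path/to/original /tmp/scratch
#   (run the filter inside /tmp/scratch)
#   show_identities /tmp/scratch
```

If the listing is wrong, deleting the scratch directory and starting again costs nothing but the time to clone and filter.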

torek
  • +1 for such a detailed answer; it gives a lot of valuable information. It might not fit all my needs, but it is definitely informative. A comment is too short to address every part of your answer, but thanks for it. In the end I think I will end up creating a bash script for manipulating/editing the commits, rather than using multiple single commands – Aleks May 09 '17 at 09:54