The answer is the same, with all the same caveats, as in the duplicate that sajib khan noted. You just change the test. Look at the accepted answer there, which contains these parts:
git filter-branch --env-filter '
OLD_EMAIL="your-old-email@example.com"
CORRECT_NAME="Your Correct Name"
CORRECT_EMAIL="your-correct-email@example.com"
if [ "$GIT_COMMITTER_EMAIL" = "$OLD_EMAIL" ]
Note that this first step tests the original commit's "committer email" string. You wish to test the original commit's commit message, possibly along with the original commit's author and/or committer.
if <some-test>
The tricky part is the test itself, since the original commit's message is not part of the environment.
If your test passes, you want to change both the author and the committer, so this is not quite right:
then
export GIT_COMMITTER_NAME="$CORRECT_NAME"
export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
fi
if [ "$GIT_AUTHOR_EMAIL" = "$OLD_EMAIL" ]
then
export GIT_AUTHOR_NAME="$CORRECT_NAME"
export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
fi
but you can simply remove the second test and override both author and committer:
then
export GIT_COMMITTER_NAME="$CORRECT_NAME"
export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
export GIT_AUTHOR_NAME="$CORRECT_NAME"
export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
fi
Finally, the last bit of the command will remain the same, e.g.:
' --tag-name-filter cat -- --branches --tags
Hence, the problem boils down to the <test>. Here you must make some decisions:
Do you want to change (the copy of) any commit that has the literal string [System] anywhere within it? Or do you want to change only commits that begin with [System]?
Do you want to change these copies only when the original commit's author and/or committer name and/or email match your own? Or do you want to change them regardless of the original author and/or committer? What if the author matches your name but the committer does not, or the committer matches your name but the author does not?
Would you like to compute the matching-ness "on the fly" in the filter, or would you prefer to pre-compute all the commit hash IDs, then apply the change to those particular commits?
When thinking about your answers, remember that the way git filter-branch operates is that it copies every commit that you direct it to (--branches means "every commit reachable from a branch name"). As it copies each such commit, it applies each of your filters. There are many; see the git filter-branch documentation for the list.
In effect, the filter-branch code extracts the original commit, makes any requested changes by applying each filter in the appropriate order, then makes a new commit from the result. If the new commit, including the new commit's parent hash ID, is bit-for-bit identical to the original commit, the new commit is the original commit. However, as soon as one commit somewhere in a chain gets modified in any way, the new commit gets a new, different hash ID. This forces all subsequent commits to have at least one thing different, namely their parent commit hash, so once there is at least one change, the change "ripples through" the remaining commits.
The result is a new repository that is no longer compatible with the original repository. All clones of the original have to be cast aside and replaced with new clones of the new repository, with its new hash IDs.
If you answer the third question, "would you like to pre-compute hash IDs to change", with "yes", you can be much more certain what your filter will do, and that it will not touch any unwanted commits. The cost is that you must re-compute the list of hash IDs if you need to change things again after a previous change, since the hash IDs change each time you make any changes. If you choose instead "no, I will test dynamically", this becomes less of a problem, but your test code has to be correct or you will modify the copies of commits you do not intend to modify. You can, of course, carefully inspect the new, modified repository to make sure it is correct. If it is not, instead of casting aside the originals in favor of the copy, cast the copy aside and start the filtering over again with an improved test.
Let us now consider the <test>. If you have prepared a complete list of all the commit IDs you wish to change, the test will be: "is the to-be-copied commit's hash one of the hashes in the list?" The to-be-copied commit's hash is available, in all filter-branch filters, as $GIT_COMMIT:
Filters
The filters are applied in the order as listed below. The <command> argument is always evaluated in the shell context using the eval command (with the notable exception of the commit filter, for technical reasons). Prior to that, the $GIT_COMMIT environment variable will be set to contain the id of the commit being rewritten. Also, GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, GIT_AUTHOR_DATE, GIT_COMMITTER_NAME, GIT_COMMITTER_EMAIL, and GIT_COMMITTER_DATE are taken from the current commit and exported to the environment, in order to affect the author and committer identities of the replacement commit created by git-commit-tree(1) after the filters have run.
So if your commit IDs to affect are in a file, one per line, you could use grep to see if $GIT_COMMIT matches one of those lines:
if grep $GIT_COMMIT /tmp/list-of-commits
(as a side effect, if grep matches one of those commits, the commit's hash will be printed on standard output, which you will see during the filtering).
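One possible way to prepare such a file, and to test against it without the printed-hash side effect, is sketched below. It builds a throwaway repository purely for illustration; the helper names mk and in_list, the sample commit messages, and the temporary file locations are all assumptions, not part of the original answer. The sketch relies on git log's --grep and --fixed-strings options and on grep's -q (quiet), -x (whole line) and -F (fixed string) options.

```shell
#!/bin/sh
set -e

# Throwaway repository with three sample commits (illustration only).
repo=$(mktemp -d)
cd "$repo"
git init -q .
mk() {
    # Hypothetical helper: make an empty commit with the given message.
    git -c user.name="Demo" -c user.email="demo@example.com" \
        commit -q --allow-empty -m "$1"
}
mk "[System] nightly import"
mk "fix typo"
mk "mention [System] in passing"

# Collect the hashes of commits whose message contains the literal
# string [System]; --fixed-strings stops the brackets from being
# read as a regular-expression character class.
list=$(mktemp)
git log --branches --tags --format=%H --fixed-strings \
    --grep='[System]' > "$list"

# in_list HASH: succeed only if HASH is exactly one line of the list.
# -q keeps grep from printing the matching hash.
in_list() {
    grep -q -x -F "$1" "$list"
}

if in_list "$(git rev-parse HEAD)"; then
    echo "HEAD would be rewritten"
fi
```

Inside the filter, the equivalent test would be a grep -q -x -F on "$GIT_COMMIT" against whatever file you prepared. Inspecting the file by hand before filtering is the main benefit of this approach.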
If you wish to choose commits dynamically, it's a bit harder. You must extract the commit log message given the commit's hash. You can do this with git log:
git log --no-walk --pretty=format:%B $GIT_COMMIT
to get the entire message, or:
git log --no-walk --pretty=format:%s $GIT_COMMIT
to get just the subject line. You can then feed that through a matching program, such as grep again, to decide whether the message contains, or begins with, a string.
Since grep does regular expressions and square brackets are regular-expression characters, you might want to use fgrep (fixed-string grep) instead. Using fgrep means it is impossible to verify that the open square bracket occurs at the beginning of the subject line, though, if you wish to do that.
Hence, some ways to detect [System] within or at the start of the log message are:
if git log --no-walk --pretty=format:%B $GIT_COMMIT | fgrep "[System]"
or:
if git log --no-walk --pretty=format:%s $GIT_COMMIT | grep "^\[System]"
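To see how the two patterns differ before trusting them in a filter, you can try them on a few sample subject lines with no repository involved. This is only a sketch: the function names and the sample subjects are made up for illustration, and it uses grep -F, the POSIX spelling of fgrep, plus -q to suppress the matched text.

```shell
#!/bin/sh

# Both helpers read a message on standard input and report the
# result through their exit status only (-q prints nothing).
contains_system() {
    grep -q -F "[System]"        # literal string, anywhere in the input
}
starts_with_system() {
    grep -q "^\[System]"         # regular expression, anchored at line start
}

for subject in "[System] rotate keys" \
               "Revert [System] rotate keys" \
               "plain commit"
do
    if printf '%s\n' "$subject" | contains_system; then c=yes; else c=no; fi
    if printf '%s\n' "$subject" | starts_with_system; then s=yes; else s=no; fi
    printf '%-30s contains=%s starts=%s\n' "$subject" "$c" "$s"
done
```

Note that the anchored pattern matches at the start of any line of its input, so it is best paired with the %s (subject only) format rather than %B.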
Remember, in sh and bash, if ... simply runs the command after the word if, then looks at its exit status. An exit status of zero means "yes": the test succeeds. A nonzero exit status means "no": the test fails. The test:
if [ "$GIT_COMMITTER_EMAIL" = "$OLD_EMAIL" ]
is just running the [ program, with arguments:
$GIT_COMMITTER_EMAIL (spaces retained as a single argument)
=
$OLD_EMAIL (spaces likewise protected by double quote)
]
The /bin/[ program, also known as /bin/test, looks for the close ] as the last argument, and removes it. (If invoked as test, it does not look for the close ].) It then performs the test prescribed by the remaining arguments. In this case, the arguments are two strings with an = between them, so test tests whether the two strings are equal.
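A small experiment makes the "if just runs a command" point concrete. The describe function below is a made-up helper for illustration: it hands whatever command it is given to if, and reports which branch was taken.

```shell
#!/bin/sh

# describe CMD [ARGS...]: run the command under `if` and report
# whether `if` treated it as a success or a failure.
describe() {
    if "$@"; then
        echo "success"
    else
        echo "failure"
    fi
}

describe true                      # the command `true` exits with status 0
describe false                     # the command `false` exits nonzero
describe [ "a b" = "a b" ]         # `[` is a command; `]` is its last argument
describe test "a b" = "other"      # same program under its other name, no `]`
```

Any command can sit in that position, which is why grep or a git log ... | grep pipeline works there just as well as [ does.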
We simply replace the test with grep (to see if $GIT_COMMIT is in a file of commit IDs) or with git log ... | grep (to see if the git log output contains a string). The exit status of grep tells us whether the string we are looking for was found, or not.
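Putting the pieces together, here is a sketch of the whole command using the dynamic, subject-line test, run against a throwaway repository so that nothing real is touched. The repository contents, the commit_as_old helper, and the "Old Name" identity are invented for illustration. The -q option to grep and the FILTER_BRANCH_SQUELCH_WARNING variable (which silences the warning and pause that newer Git versions add) are my additions rather than part of the linked answer.

```shell
#!/bin/sh
set -e

# Throwaway repository with one matching and one non-matching commit.
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
commit_as_old() {
    # Hypothetical helper: empty commit made under the "wrong" identity.
    git -c user.name="Old Name" -c user.email="old@example.com" \
        commit -q --allow-empty -m "$1"
}
commit_as_old "[System] automated import"
commit_as_old "ordinary commit"

# Newer Git versions warn and pause before filtering; this skips that.
export FILTER_BRANCH_SQUELCH_WARNING=1

# The env-filter body is evaluated once per commit being copied.
git filter-branch --env-filter '
CORRECT_NAME="Your Correct Name"
CORRECT_EMAIL="your-correct-email@example.com"
if git log --no-walk --pretty=format:%s "$GIT_COMMIT" | grep -q "^\[System]"
then
    export GIT_COMMITTER_NAME="$CORRECT_NAME"
    export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
    export GIT_AUTHOR_NAME="$CORRECT_NAME"
    export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
fi
' --tag-name-filter cat -- --branches --tags >/dev/null 2>&1

# Show author, committer and subject of every rewritten commit.
result=$(git log --format='%an <%ae> | %cn <%ce> | %s')
printf '%s\n' "$result"
```

Only the commit whose subject begins with [System] should come out with the corrected author and committer; the other keeps its original identity, though its hash still changes because its parent did.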
If you use a prepared file of commit hashes, you can try many different methods of collecting the hash IDs of the commits you want changed, without a lot of pain, as you will be working in a much more familiar (and probably richer) programming environment than the one available inside the eval-ed filters of git filter-branch.
One more caveat
This probably should be noted in the other question's answer, but any time you are going to run git filter-branch, it's a good idea to run it on a new, fresh clone of the repository. That way, you can inspect the new commits carefully and see whether your filters did what you expected them to do. If not, you can throw away the filtered clone and try again, since the only thing you have spent on it is the time and disk space needed to clone and filter.