Git add lines to index by grep/regex

Question

I have a giant patch that I would like to break into multiple logical git commits. A large number of the changes are simply changing variable names or function calls, such that they could easily be located with a grep. If I could add to the index any changes that match a regex then clean up in git gui, it would save me a lot of manual work. Is there a good way to update the index on a line-by-line basis using a regex within git or from some output of grep (e.g. line numbers)?

I found a similar question, but I'm not sure how to build the temporary file from a regex-type search.

Some examples would probably go a long way in clarifying what you're trying to achieve. — rvalvik, Mar 05 '13 at 22:09
Git's atomic unit of work is an entire file, that is, you stage (or don't stage) an entire _file_ in your project, not individual lines. — Tim Biegeleisen, Nov 29 '22 at 07:02
Can you write a script, which takes the version of your file in the last commit, which may compare it with the content of your current file on disk and which produces the version you want to commit ? — LeGEC, Nov 29 '22 at 07:04
@LeGEC the problem is there may be some changes in that same file that I don't want to stage — Samantha, Nov 29 '22 at 07:11
Never tried it, but I think `git patch` is the command to use here. It applies the differences, which can be specified either in the _diff_ or in the _ed_ syntax. — user1934428, Nov 29 '22 at 07:15
For example : are all the changes you mention (the ones containing `MARKETING_VERSION`) lines that are *modified in place* ? or are there also some new inserted lines ? some deleted lines ? — LeGEC, Nov 29 '22 at 07:16
@LeGEC they are usually only modified in place (it's a generated xcode file) — Samantha, Nov 29 '22 at 07:26
@TimBiegeleisen adding can be done with `-p` or `-i` so that you can select specific parts of the file to add to index (quite cool, by the way). — eftshift0, Nov 29 '22 at 09:24
@eftshift0 That's wild. Maybe add an answer. I have never even heard of this. — Tim Biegeleisen, Nov 29 '22 at 09:31
See also staging or removing by regexp: https://stackoverflow.com/a/22959015/411282 — Joshua Goldberg, Apr 05 '23 at 14:42

Gary van der Merwe · Answer 1 · 2016-04-26T07:24:49.507

35

patchutils has a command, grepdiff, that can be used to achieve this.

# Check that the regex search correctly matches the changes you want.
git diff -U0 | grepdiff 'regex search' --output-matching=hunk

# Then apply the changes to the index.
git diff -U0 | grepdiff 'regex search' --output-matching=hunk | git apply --cached --unidiff-zero

I use -U0 on the diff to avoid picking up unrelated changes. You might want to adjust this value to suit your situation.
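
To see why the context size matters, it can help to count how many hunks git diff emits for a file at different -U values. The following is only a sketch, not part of the original answer; count_hunks is a hypothetical helper name. With zero context lines, each run of changed lines gets its own hunk, so grepdiff can select them individually. With more context, nearby changes merge into a single hunk and would be staged together.

```shell
# Count the hunks `git diff` emits for one file at a given context size.
# Hypothetical helper for illustration; compares the index with the working tree.
count_hunks() (
    context=$1
    path=$2
    # Hunk headers are the only diff lines that start with "@@".
    git diff -U"$context" -- "$path" | grep -c '^@@'
)
```

For a file in which only lines 2 and 4 were modified, count_hunks 0 file reports two hunks, while count_hunks 3 file reports one.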

edited Apr 26 '16 at 07:24

answered Apr 09 '14 at 09:57

This worked pretty well for me but, as a note, `hunk` doesn't really provide enough granularity if you have modified neighboring lines that don't match the regex. – arcyqwerty Nov 23 '14 at 23:04
I find a good way is to just run the second line and then check the results. `git status -v` for the staged stuff and `git diff` to see all the things that didn't get staged. Then commit if all looks correct. – Gerry Aug 22 '17 at 01:45
This is also handy with `git apply --reverse` to undo some changes in the working directory. – Joshua Goldberg Apr 05 '23 at 14:42

score 5 · Answer 2 · answered Jan 03 '18 at 15:56

More simply, you can use git add -p and use the / option to search your diff for the hunks to add. It's not totally automated, but it's easier than the other alternatives I've found.

score 2 · Answer 3 · answered Nov 29 '22 at 10:50

What git add -p <file> does is, very roughly, this:

tmpfile=$(mktemp)
tf2=$(mktemp)
tf3=$(mktemp)
git diff <file> > $tmpfile
while [ -s $tmpfile ]; do
    extract first diff hunk from $tmpfile to $tf2 and rest to $tf3
    show you $tf2, ask if you want to include this hunk
        (with options to edit the hunk, etc); repeat until ready
    if you say to *add* the hunk, run git apply --cached $tf2
    cat < $tf3 > $tmpfile    # carry the remaining hunks into the next iteration
done
rm -f $tmpfile $tf2 $tf3

That is, git add -p uses git apply --cached (a specialized sub-variant of git apply --index that ignores the working tree copy of the file). The key takeaway you need, from the above, is this: There are three versions of the file!

The first one (completely ignored here) is frozen for all time and is in the HEAD commit.
The second one is in Git's index aka staging area. That's used by git diff above as the "old version".
The third one is in your working tree. That's used by git diff above as the "new version".

The patches that Git lets you take or skip are simply the result of comparing the "old" (index) and "new" (working tree) version. If you take some patch, Git updates the in-index copy by applying the patch.
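
The three copies can be inspected directly. This is a minimal sketch under a few assumptions: file_version is a hypothetical helper name, the file is tracked, there is at least one commit, and the command is run from the top level of the repository (so the same path works for both git show and cat).

```shell
# Print one of the three copies of a tracked file.
#   head      copy frozen in the HEAD commit
#   index     copy in the index (staging area), the "old" side of git diff
#   worktree  copy on disk, the "new" side of git diff
# Hypothetical helper; assumes it is run from the top of the repository.
file_version() (
    which=$1
    path=$2
    case $which in
        head)     git show "HEAD:$path" ;;
        index)    git show ":$path" ;;
        worktree) cat -- "$path" ;;
        *)        echo "unknown version: $which" >&2; exit 2 ;;
    esac
)
```

Comparing the output of file_version index path with file_version worktree path shows the same differences that git diff path reports.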

Hence, if there are some set of lines in the working tree version (say, lines 100 through 110 inclusive) that you'd like to use to replace some other set of lines (say, lines 90 through 92 inclusive) in the index version, the way to construct that is:

extract the index version;
scrape out lines 1-89 from the index version; concatenate lines 100-110 from the working tree version; concatenate lines 93-end from the index version, all into a temporary file;
replace the index copy with the temporary file.

To read the index version, use git show or git cat-file -p with the name of the index version of the file. If the file's name is path/to/file, the index version's name is :path/to/file (short for :0:path/to/file: we want the copy in slot zero; there must not be a copy in slots 1, 2, or 3 so that there is a copy in slot 0; you can simply attempt to read it from slot zero, and if that fails, assume the file either isn't in the index, or is conflicted).

Reading the working tree file (some select subset of lines) is left as an exercise, as is the concatenation part, and any error checking you wish to include.
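
One possible way to do the splicing is sketched here, with minimal error checking. The function name splice_from_worktree and its argument order are made up for illustration, and the sketch assumes it is run from the top level of the repository. It writes the combined file to standard output, so redirect it into the temporary file.

```shell
# Build a new version of a file from:
#   index lines before idx_first,
#   working tree lines wt_first..wt_last,
#   index lines after idx_last.
# Hypothetical helper; line numbers are 1-based and inclusive.
splice_from_worktree() (
    path=$1
    idx_first=$2
    idx_last=$3
    wt_first=$4
    wt_last=$5

    idx_copy=$(mktemp) || exit 1
    # ":path" names the index (slot zero) copy of the file.
    if git show ":$path" > "$idx_copy"; then
        awk -v n="$idx_first" 'NR < n' "$idx_copy"
        awk -v a="$wt_first" -v b="$wt_last" 'NR >= a && NR <= b' "$path"
        awk -v n="$idx_last" 'NR > n' "$idx_copy"
        status=0
    else
        status=1    # not in the index, or conflicted
    fi
    rm -f "$idx_copy"
    exit "$status"
)
```

For the example above, splice_from_worktree path/to/file 90 92 100 110 > "$tf" would replace index lines 90 through 92 with working tree lines 100 through 110.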

Assuming the final resulting file is in a temporary file named $tf (as a shell variable), to update the index copy, you must first make sure an appropriate blob hash ID exists:

hash=$(git hash-object -w -t blob --path="$path" -- "$tf")

for instance (this assumes you want to run the usual .gitattributes filters, if any, and know that the path is $path). Then, if that goes well, use that hash ID with git update-index:

git update-index --cacheinfo "$mode,$hash,$path"

where $mode is either 100644 or 100755 as appropriate for the file. If you don't want to change the mode, you can read the previous mode with git ls-files --cached or similar. Otherwise, provided core.fileMode is true, read the mode from the working tree copy of the file, to match the behavior of git add: convert "has any executable bit set" to 100755 and "has no executable bit set" to 100644. When core.fileMode is false (use git config --get --type bool core.fileMode to read it), git add uses the existing mode for this add-patch case.
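
Putting the last two commands together, a sketch of the "keep the existing mode" variant might look like the following. The name stage_content is hypothetical, and the function deliberately refuses to act when the path has no stage-zero entry in the index (a new or conflicted file).

```shell
# Replace the index copy of $1 with the content of the file $2,
# keeping the mode already recorded in the index.
# Hypothetical helper; the working tree copy is left untouched.
stage_content() (
    path=$1
    tf=$2

    # `git ls-files -s` prints "<mode> <hash> <stage>\t<path>"; want stage 0.
    mode=$(git ls-files -s -- "$path" | awk '$3 == 0 { print $1; exit }')
    if [ -z "$mode" ]; then
        echo "no stage-0 index entry for $path" >&2
        exit 1
    fi

    # Write the blob (running any configured filters for $path), then
    # point the index entry at it.
    hash=$(git hash-object -w -t blob --path="$path" -- "$tf") || exit 1
    git update-index --cacheinfo "$mode,$hash,$path"
)
```

Afterwards, git diff --cached shows what was staged and git diff shows what remains only in the working tree.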

score 0 · Answer 4 · answered Jun 27 '19 at 08:08

You could first run:

git status | \grep "your_pattern"

If the output is as intended, then add the files to the index:

git add $(git status | \grep "your_pattern")

builder-7000

This is not the answer to the question. OP asks how to add specific lines but your answer says how to add specific files. – DeveloperKid Apr 16 '22 at 16:47
`git status` shows changed files, not the lines of code that were changed. I don't see how this answers the question. – Alex Jan 04 '23 at 22:47

score 0 · Answer 5 · answered Feb 28 '20 at 14:13

I'm working in Git Bash on Windows and ran into a similar problem: I did not want to add a few of the files from the "Changes not staged for commit" list:

 $ git status
 On branch Bug_#292400_buggy
 Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)


    modified:   the/path/to/the/file333.NO   
    modified:   the/path/to/the/file334.NO 
    modified:   the/path/to/the/file1.ok
    modified:   the/path/to/the/file2.ok
    modified:   the/path/to/the/file3.ok
    modified:   the/path/to/the/file4.ok
    ....................................
    modified:   the/path/to/the/file666.ok

First, I checked if the file selection was what I was looking for:

$ git status | grep ok
            modified:   the/path/to/the/file1.ok
            modified:   the/path/to/the/file2.ok
            modified:   the/path/to/the/file3.ok
            modified:   the/path/to/the/file4.ok
            ....................................
            modified:   the/path/to/the/file666.ok

I tried an idea described in this forum to add the same file list with git:

$ git add $(git status | \grep "your_pattern")

But it doesn't work for me (remember: Git Bash on Windows 10).

In the end, I tried the straightforward way, and it worked fine:

$ git add *ok
$ git status
On branch Bug_#292400_buggy
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

            modified:   the/path/to/the/file1.ok
            modified:   the/path/to/the/file2.ok
            modified:   the/path/to/the/file3.ok
            modified:   the/path/to/the/file4.ok
            ....................................
            modified:   the/path/to/the/file666.ok

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   the/path/to/the/file333.NO   
        modified:   the/path/to/the/file334.NO

So now it is ready to commit.

This is not the answer to the question. OP asks how to add specific lines but your answer says how to add specific files. — DeveloperKid, Apr 16 '22 at 16:47

yeop · Answer 6 · 2022-06-28T05:05:05.450

I found an answer. It chains three steps:

git status --porcelain prints the status in a stable, easy-to-parse format meant for scripts.
sed s/^...// strips the first three characters of each line (the two status letters and the space), leaving only the path.
xargs passes the remaining paths as arguments to the command that follows.

In my case, a Django project where I wanted to leave the migrations unstaged, the script is git status --porcelain | sed s/^...// | grep -v migrations | xargs git add.

You can adjust the grep options to fit your needs.

Documentation: xargs, git-status, sed

score -1 · Answer 7 · answered Apr 07 '13 at 01:36

xargs is what you're looking for. Try this:

grep -irl 'regex_term_to_find' * | xargs -I FILE git add FILE

Up to the pipe | is your standard grep command for searching all files *. Options are:

i - case insensitive
r - recursive through directories
l - list names of files only

In the xargs part of the statement FILE is the name of the variable to use for each argument/match passed by the grep command. Then enter the desired command using the variable where appropriate.

RDL

Thanks, but I want to add just part of a file, not whole files. – FazJaxton Apr 18 '13 at 17:55
@FazJaxton, For that you would need to add as a patch. Use the `-p` parameter after `git add` however I'm not sure how it will work within xargs as the patch process is done by you (ie. selecting what should be added and what shouldn't). – RDL Apr 18 '13 at 18:17
