
My intention: search through source code for keywords of interest. This automates a small part of code review by flagging obvious programming errors such as hardcoded keys and passwords.

I currently use the following grep loop to search through the code for certain words:

while read p; do
    echo "FOUND: ${p}"
    grep -riIn -A 5 -B 5 ${p} "${SEARCHPATH}"
done < "${SEARCHWORDS}"

SEARCHWORDS is the path to a file containing the list of search words. SEARCHPATH is the folder that grep should search in. The output looks like this:

xo.java-33-    default: 
xo.java-34-      return str;
xo.java-35-    case -4501: 
xo.java-36-      return "Internal error";
xo.java-37-    case -4502: 
xo.java:38:      return "Activation password too long. Limited to 512 characters.";
xo.java-39-    case -4503: 
xo.java-40-      return "CHS key null or empty. Must be a 32 hexadecimal string.";
xo.java-41-    case -4504: 
xo.java-42-      return "Incorrect CHS key length. Must be a 32 hexadecimal string.";
xo.java-43-    case -4505: 

As you can see, it also prints the lines above and below each match, which gives me some context to judge whether it is a false positive. But I would like to have the following output:

Found "password" in file "xo.java":

    xo.java-33-    default: 
    xo.java-34-      return str;
    xo.java-35-    case -4501: 
    xo.java-36-      return "Internal error";
    xo.java-37-    case -4502: 
    xo.java:38:      return "Activation password too long. Limited to 512 characters.";
    xo.java-39-    case -4503: 
    xo.java-40-      return "CHS key null or empty. Must be a 32 hexadecimal string.";
    xo.java-41-    case -4504: 
    xo.java-42-      return "Incorrect CHS key length. Must be a 32 hexadecimal string.";
    xo.java-43-    case -4505: 

I want the search word printed above each result, so that all matches are grouped under the keyword that produced them.

If you have suggestions on other tools, feel free to share them. I tried the command ack, but I couldn't achieve the result as I describe here.

user5989986
  • Isn't the keyword coloring enough/better? If you are piping your output to a file and color is off, you can still force it with `--color=always`, then view the file with `less -R`. – randomir Oct 10 '17 at 18:45

1 Answer


I have not tested this solution yet, and a better (more elegant) solution may exist, but this is what I would do:

while read p
do

    # For each found result, do...
    grep -riIn "${p}" "${SEARCHPATH}" | while read -r line ; do

        # Split the line on ':' into an array
        # element 0 is relative path to file
        # element 1 is line number of match
        IFS=':' read -r -a array <<< "${line}"

        # Print your header
        echo "FOUND '${p}' in '${array[0]}' on line ${array[1]}"
        echo -e "\n"

        # Calculate ranges (number of lines before and after match)
        from_line_nr=$((${array[1]}-5))
        to_line_nr=$((${array[1]}+5))
        # Limit the range if the lower bound is not a valid line number.
        # sed can handle an end line bigger than the number of lines in the file,
        # so we only need to make sure the lower limit is at least 1.
        if [ "${from_line_nr}" -lt "1" ]; then from_line_nr=1; fi

        # show lines before and after match using sed
        sed -n "${from_line_nr},${to_line_nr}p" "${array[0]}"

        # Add some white lines to improve readability
        echo -e "\n\n"
    done

done < "${SEARCHWORDS}"

Instead of using the parameters -A and -B (or -C for short), I grep only for the matching line, then print the header that you want, and then print the context around the match using sed.
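
A different technique from the sed approach above would be to keep grep's own context output: list the matching files once with `-l`, then run grep with `-C 5` on each file and print the header before it. The following is a minimal, lightly tested sketch under these assumptions: the keyword file has one word per line, file names contain no newlines, and the grep in use supports `-r`, `-I` and `-H` (GNU and BSD grep do). The function name `search_grouped` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: one header per (keyword, file) pair, followed by grep's context output.
# search_grouped is a hypothetical helper; $1 = keyword file, $2 = folder to search.
search_grouped() {
    local words_file=$1 search_path=$2 word file
    while IFS= read -r word; do
        [ -n "$word" ] || continue    # skip empty lines in the keyword list
        # -l prints each matching file name once; -I skips binary files
        grep -rIil -e "$word" -- "$search_path" | while IFS= read -r file; do
            printf 'Found "%s" in file "%s":\n\n' "$word" "$file"
            # -H keeps the file name prefix, -n adds line numbers,
            # -C 5 prints 5 lines of context; sed indents the block
            grep -iHn -C 5 -e "$word" -- "$file" | sed 's/^/    /'
            printf '\n'
        done
    done < "$words_file"
}
```

It would be called as `search_grouped "${SEARCHWORDS}" "${SEARCHPATH}"`. Because the header is printed once per file rather than once per match, several hits in the same file appear together under one header, with grep's usual `--` separator between non-adjacent context groups.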

Kevin De Koninck