Code for the given question in shell script

Question

Write a shell script to search for a keyword in all the files in the current folder and display the count of occurrences in each file.

This really sounds like the text from an assignment. What exactly is your problem? Maybe have a look at the "grep" command. — , Feb 11 '20 at 15:53
Is this a homework question? If so you should post what you tried. — , Feb 11 '20 at 19:16
If this is an interview question, or a question from an instructor, the correct answer is "what's a folder?", followed by standing up and walking out. — William Pursell, Feb 11 '20 at 19:17
I’m voting to close this question because it's asking help for homework — f.khantsis, Sep 29 '22 at 15:49
Does this answer your question? [How to count occurrences of a word in all the files of a directory?](https://stackoverflow.com/questions/6135065/how-to-count-occurrences-of-a-word-in-all-the-files-of-a-directory) — tripleee, Nov 10 '22 at 05:31

vishnudattan · Answer 1 · 2020-02-11T19:15:18.900

Depending on the expected output, you may need to play around with grep options.

Here's a possible solution:

    grep -o pattern * | awk -F: '{a[$1]++;}END{for (i in a)print i, a[i];}'

Explanation:

Assume the search pattern is the string abc.

I made the following assumptions, as these are not explicitly stated in your question:

Count multiple occurrences of the pattern for each line inside each file
Count occurrences of pattern that could occur within a word and/or is surrounded by other characters
Print file name and count in the output

I created the following test files in the same directory that you intend to run your search on:

file1 with one occurrence of the pattern abc : expected count = 1
```
cat > file1
abc
xyz
```
file2 with multiple occurrences of pattern abc in a single line : expected count = 2
```
cat > file2
abc abc
xyz
```
file3 with pattern embedded in a word / surrounded by other characters : expected count = 5
```
cat > file3
xabcyz
xabcyabc
123abc
abc_
```

Step 1:

Use grep -o abc * to generate the following output:

    file1:abc
    file2:abc
    file2:abc
    file3:abc
    file3:abc
    file3:abc
    file3:abc
    file3:abc

What does -o option do?

    -o, --only-matching
    Print only the matched (non-empty) parts of a matching line, 
    with each such part on a separate output line.

Run man grep to explore more grep options.
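To make the effect of -o concrete, here is a minimal sketch comparing it with grep -c, which counts matching lines rather than individual matches. The scratch directory and the variable names (demo_dir, line_count, match_count) are assumptions made for illustration only.

```shell
# Work in a scratch directory so no real files are touched.
demo_dir=$(mktemp -d)
printf 'abc abc\nxyz\n' > "$demo_dir/file2"

# -c counts matching LINES: the line "abc abc" is counted once.
line_count=$(grep -c abc "$demo_dir/file2")

# -o prints every match on its own line, so counting those lines
# counts OCCURRENCES. tr strips the padding some wc versions add.
match_count=$(grep -o abc "$demo_dir/file2" | wc -l | tr -d ' ')

echo "lines=$line_count matches=$match_count"

rm -rf "$demo_dir"
```

If counting matching lines is all you need, grep -c pattern * is enough on its own and the awk step can be dropped.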

Step 2:

(Note: Although this is not directly related to your question, I'm including an explanation on how the above output is aggregated to return counts)

awk reads the output of Step 1 line by line and builds an associative array a, indexed by file name (file1, file2 and so on), that holds the number of lines seen for each file.

Pipe the output of Step 1 into awk -F: '{a[$1]++;}END{for (i in a)print i, a[i];}'

In the awk command:

-F: sets the field separator to : as we're only interested in the first column (i.e. the file name, which is used as the index of the array)
a[$1]++ increments the count stored under that file name for each input line
END{actions} executes what you specify in actions after all input has been read, just before exiting
for (i in a)print i, a[i]; is a for loop that prints each index i from array a with the respective count a[i]
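The same awk program can be written out over several lines with comments, which may be easier to follow. This is only a sketch: it recreates the three test files in a scratch directory, and the names demo_dir, counts, count and name are chosen for illustration. The trailing sort is an addition, because the order in which for (... in ...) visits array indexes is not defined.

```shell
# Recreate the three test files in a scratch directory.
demo_dir=$(mktemp -d)
printf 'abc\nxyz\n' > "$demo_dir/file1"
printf 'abc abc\nxyz\n' > "$demo_dir/file2"
printf 'xabcyz\nxabcyabc\n123abc\nabc_\n' > "$demo_dir/file3"

# The cd happens inside the command substitution (a subshell),
# so the calling shell stays in its original directory.
counts=$(
    cd "$demo_dir" &&
    grep -o abc -- * | awk -F: '
        # Runs once per input line such as "file2:abc";
        # $1 is the file name in front of the first colon.
        { count[$1]++ }

        # Runs once, after the last input line has been read.
        END { for (name in count) print name, count[name] }
    ' | sort
)

echo "$counts"
rm -rf "$demo_dir"
```

Note that splitting on : assumes no file name contains a colon, and that grep only prefixes file names when it is given more than one file.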

Final output:

    $ grep -o abc *  | awk -F: '{a[$1]++;}END{for (i in a)print i, a[i];}'
    file1 1
    file2 2
    file3 5
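One thing to be aware of: a file with no match produces no grep -o output, so it does not appear in the result at all. If every file should be listed, including those with a count of 0, a per-file loop is one option. The following is a rough sketch under that assumption; file4 is a hypothetical extra test file, and demo_dir, report, f and n are illustrative names.

```shell
# Scratch directory with one matching and one non-matching file.
demo_dir=$(mktemp -d)
printf 'abc abc\nxyz\n' > "$demo_dir/file2"
printf 'xyz\n' > "$demo_dir/file4"   # hypothetical file without the pattern

report=$(
    cd "$demo_dir" || exit 1
    for f in *; do
        # Skip directories and other non-regular files.
        [ -f "$f" ] || continue
        # One line per match; wc -l turns that into a count (0 if none).
        n=$(grep -o -- abc "$f" | wc -l | tr -d ' ')
        printf '%s %s\n' "$f" "$n"
    done
)

echo "$report"
rm -rf "$demo_dir"
```

Because each file is handled separately, this variant does not rely on the file:match output format, at the cost of starting one grep per file.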

Hope this helps.
