
In list.txt I have:

Lucas
Viny
Froid

In the current directory, I have a lot of CSV files containing names.

I need to know how many times each word in my list appears in these CSV files.

I tried:

grep -riohf list.txt . | wc -lw

But it returns only the counts. I need to know which word each count refers to.

I just need something like this:

Lucas 353453
Viny 9234
Froid 934586

2 Answers


Suppose you have these files:

$ cat list.txt
Lucas
Viny
Froid

$ cat 1.csv
Lucas,Viny,Bob
Froid

$ cat 2.csv
Lucas,Viny,Froid
Lucas,Froid

You can use the following awk to count the fields that match the list: the FNR==NR block loads the words from list.txt into the cnt array, and the main block then checks every comma-separated field against it:

awk -F ',' 'FNR==NR{cnt[$1]; next}
{for (i=1; i<=NF; i++) if ($i in cnt) cnt[$i]++}
END{for (e in cnt) print e, cnt[e]}' list.txt {1..2}.csv
Viny 2
Lucas 3
Froid 3
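To run the same thing against every CSV file in the directory rather than the two example files, a glob should work, as long as list.txt is given first and is not itself a .csv file (a sketch of the same command, untested against your data):

awk -F ',' 'FNR==NR{cnt[$1]; next}
{for (i=1; i<=NF; i++) if ($i in cnt) cnt[$i]++}
END{for (e in cnt) print e, cnt[e]}' list.txt *.csv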

Yet another way is to use a pipeline with sort and uniq to count the unique fields:

cat {1..2}.csv | tr , "\n" | sort | uniq -c
   1 Bob
   3 Froid
   3 Lucas
   2 Viny

Then filter that with grep so only the words from list.txt are counted:

cat {1..2}.csv | tr , "\n" | grep -Fxf list.txt | sort | uniq -c
   3 Froid
   3 Lucas
   2 Viny
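If you want the output in the word count order shown in the question, one option (a small sketch added here) is to swap the two columns that uniq -c produces with awk:

cat {1..2}.csv | tr , "\n" | grep -Fxf list.txt | sort | uniq -c | awk '{print $2, $1}'
Froid 3
Lucas 3
Viny 2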
  • These do not work with quoted CSV. If you have that, use a CSV parser such as Ruby, Python, or Perl. – dawg Nov 18 '21 at 17:32
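If the quoting is simple (no embedded quotes or newlines inside fields), GNU awk's FPAT can approximate a CSV parser. This is only a rough sketch, assumes gawk 4.0 or newer, and for anything messier the advice above to use a real CSV parser is the right call:

gawk 'BEGIN{FPAT="([^,]+)|(\"[^\"]+\")"}   # a field is either unquoted or "quoted"
FNR==NR{cnt[$1]; next}                     # first file: load the word list
{for (i=1; i<=NF; i++) {
    f = $i; gsub(/^"|"$/, "", f)           # drop surrounding quotes before comparing
    if (f in cnt) cnt[f]++
}}
END{for (e in cnt) print e, cnt[e]}' list.txt *.csv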

Using grep and wc within a loop, you can count every occurrence of each word rather than just the number of matching lines.

while read -r line; do
    # grep -o prints every match on its own line; wc -l counts those lines
    count=$(grep -o "$line" *.csv | wc -l)
    echo "$line $count"
done < list.txt
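Note that grep -o here is case-sensitive and also counts matches inside longer words that contain the name. If you want whole-word, case-insensitive counts, closer to the -i in the question's original attempt, adding -iw is one option (a sketch, not part of the original answer):

while read -r line; do
    count=$(grep -iow "$line" *.csv | wc -l)
    echo "$line $count"
done < list.txt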