
In list.txt I have:

Lucas
Viny
Froid

In the current directory, I have a lot of CSV files containing names.

I need to know how many times each word in my list appears in these CSV files.

I tried:

grep -riohf list.txt . | wc -lw

But it returns only the counts. I need to know which word each count refers to.

I just need something like this:

Lucas 353453
Viny 9234
Froid 934586

2 Answers


Suppose you have these files:

$ cat list.txt
Lucas
Viny
Froid

$ cat 1.csv
Lucas,Viny,Bob
Froid

$ cat 2.csv
Lucas,Viny,Froid
Lucas,Froid

You can use the following awk to count the fields that match the list: the FNR==NR block loads the words from list.txt into the cnt array, and the main block then checks every comma-separated field against it:

awk -F ',' 'FNR==NR{cnt[$1]; next}
{for (i=1; i<=NF; i++) if ($i in cnt) cnt[$i]++}
END{for (e in cnt) print e, cnt[e]}' list.txt {1..2}.csv
Viny 2
Lucas 3
Froid 3
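To run the same thing against every CSV file in the directory rather than the two example files, a glob should work, as long as list.txt is given first and is not itself a .csv file (a sketch of the same command, untested against your data):

awk -F ',' 'FNR==NR{cnt[$1]; next}
{for (i=1; i<=NF; i++) if ($i in cnt) cnt[$i]++}
END{for (e in cnt) print e, cnt[e]}' list.txt *.csv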

Yet another way is to use a pipeline with sort and uniq to count the unique fields:

cat {1..2}.csv | tr , "\n" | sort | uniq -c
   1 Bob
   3 Froid
   3 Lucas
   2 Viny

Then filter that with grep so only the words from list.txt are counted:

cat {1..2}.csv | tr , "\n" | grep -Fxf list.txt | sort | uniq -c
   3 Froid
   3 Lucas
   2 Viny
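If you want the output in the word count order shown in the question, one option (a small sketch added here) is to swap the two columns that uniq -c produces with awk:

cat {1..2}.csv | tr , "\n" | grep -Fxf list.txt | sort | uniq -c | awk '{print $2, $1}'
Froid 3
Lucas 3
Viny 2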
  • These do not work with quoted CSV. If you have that, use a CSV parser such as Ruby, Python, or Perl. – dawg Nov 18 '21 at 17:32
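If the quoting is simple (no embedded quotes or newlines inside fields), GNU awk's FPAT can approximate a CSV parser. This is only a rough sketch, assumes gawk 4.0 or newer, and for anything messier the advice above to use a real CSV parser is the right call:

gawk 'BEGIN{FPAT="([^,]+)|(\"[^\"]+\")"}   # a field is either unquoted or "quoted"
FNR==NR{cnt[$1]; next}                     # first file: load the word list
{for (i=1; i<=NF; i++) {
    f = $i; gsub(/^"|"$/, "", f)           # drop surrounding quotes before comparing
    if (f in cnt) cnt[f]++
}}
END{for (e in cnt) print e, cnt[e]}' list.txt *.csv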

Using grep and wc within a loop, you can count every occurrence of each word rather than just the number of matching lines.

while read -r line; do
    # grep -o prints every match on its own line; wc -l counts those lines
    count=$(grep -o "$line" *.csv | wc -l)
    echo "$line $count"
done < list.txt
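Note that grep -o here is case-sensitive and also counts matches inside longer words that contain the name. If you want whole-word, case-insensitive counts, closer to the -i in the question's original attempt, adding -iw is one option (a sketch, not part of the original answer):

while read -r line; do
    count=$(grep -iow "$line" *.csv | wc -l)
    echo "$line $count"
done < list.txt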