5

Basic awk question, but I can't seem to find an answer anywhere:

I have a folder of 50000 txt files, and I would like to run awk searches on a subset of them. I've saved the filenames I want to limit the search to in a separate document. Restricting the search to those files would greatly speed it up; at the moment it looks like this:

awk -F "searchTerm" '{print NF-1}' data/output/* >> output.txt

Many thanks

Rolf Fredheim

3 Answers

1

Suppose that your file containing the subset that you want to search is called subset.txt and its content has this format (each file on a separate line):

file1.txt
file2.txt
file3.txt
...
fileN.txt

Then this will do the trick:

awk -F "searchTerm" '{print NF-1}' $(<subset.txt) >> output.txt

Explanation:

  • $(<subset.txt) expands to the contents of subset.txt, so every file name listed there is passed to awk as an input file argument. (See Jonathan Leffler's comment below)
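With a very long list, or with file names that contain spaces, the command substitution can hit the argument-length limit or split names in the wrong place. A minimal sketch of one alternative, assuming an xargs that supports -0 (GNU, BSD and BusyBox do); the demo directory, the file names and counts.txt are hypothetical and only there to make the example self-contained:

```shell
# Hypothetical demo data: two small files and a list naming them.
mkdir -p demo
printf 'a searchTerm b searchTerm\n' > 'demo/file one.txt'
printf 'no match here\nsearchTerm\n' > demo/file2.txt
printf '%s\n' 'demo/file one.txt' demo/file2.txt > subset.txt

# Turn newlines into NULs so xargs -0 treats each line as one file name,
# then let xargs spread the list over as many awk runs as needed.
tr '\n' '\0' < subset.txt | xargs -0 awk -F "searchTerm" '{print NF-1}' > counts.txt
cat counts.txt
```

The sketch writes to counts.txt with > so that repeated runs give the same result; use >> output.txt instead to append as in the original command.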

I should also point out that -F "searchTerm" actually sets the field separator (the delimiter awk uses to split each line) to searchTerm, so NF-1 is the number of times searchTerm occurs on that line. If you instead want to print the number of fields minus 1 for each line that contains "searchTerm", do:

awk '/searchTerm/ {print NF-1}' $(cat subset.txt) >> output.txt
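
To see how the two commands differ, here is a small sketch that runs both on a single made-up line (the sample text and the variable names are assumptions for illustration only):

```shell
# Hypothetical one-line sample with two occurrences of the term.
line='alpha searchTerm beta searchTerm gamma'

# Field separator is the term itself: NF-1 is the number of occurrences.
occurrences=$(printf '%s\n' "$line" | awk -F "searchTerm" '{print NF-1}')

# Default whitespace separator, lines filtered by the pattern:
# NF-1 is the number of whitespace-separated words minus one.
words_minus_one=$(printf '%s\n' "$line" | awk '/searchTerm/ {print NF-1}')

echo "$occurrences $words_minus_one"
```

For this sample the first command reports 2 (two occurrences) and the second reports 4 (five words minus one).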
sampson-chen
0

I think this will work for you.

awk '/searchTerm/{print $(NF-1)}' data/output/*>> output.txt
ddoxey
0

If you have your list in a file called filelist.txt, you could just use the stdout from a cat command.

awk -F "searchTerm" '{print NF-1}' `cat data/output/filelist.txt` >> output.txt
jeffpkamp