
I have a database with 7 columns (file.txt) and a list of names (names.txt). I want to count the lines in file.txt where names from names.txt appear in both column 3 and column 4. In other words, I don't want to count lines where a listed name appears in only one of those columns, or in neither. How can I do that in Unix? Thanks.

1 Answer

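awk -F, 'BEGIN {
    # Load every name from names.txt into an array for fast lookup.
    while ((getline name < "names.txt") > 0) {
        names[name] = 1
    }
    close("names.txt")
    count = 0   # so that END prints 0, not an empty line, when nothing matches
}
# Count a line only when columns 3 and 4 both hold a listed name.
$3 in names && $4 in names { count++ }
END { print count }' file.txt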
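
To try the idea out on something small, here is a minimal sketch that runs the same test on made-up data. The sample names and rows are hypothetical, invented only for illustration, and the files are written to a temporary directory. It uses the common two-file `NR == FNR` idiom instead of `getline`; that is an alternative way to load the names, not the script above, and it assumes names.txt is not empty.

```shell
#!/bin/sh
# Hypothetical sample data, made up for illustration only.
tmpdir=$(mktemp -d)
printf '%s\n' alice bob carol > "$tmpdir/names.txt"
cat > "$tmpdir/file.txt" <<'EOF'
1,x,alice,bob,a,b,c
2,x,alice,dave,a,b,c
3,x,erin,bob,a,b,c
4,x,carol,carol,a,b,c
5,x,dave,erin,a,b,c
EOF

# NR == FNR is true only while awk reads the first file (names.txt).
count=$(awk -F, '
    NR == FNR { names[$0] = 1; next }        # first file: remember each name
    ($3 in names) && ($4 in names) { n++ }   # second file: both columns must match
    END { print n + 0 }                      # n + 0 prints 0 when nothing matched
' "$tmpdir/names.txt" "$tmpdir/file.txt")

echo "$count"
rm -rf "$tmpdir"
```

Only rows 1 and 4 have a listed name in both columns, so this prints 2. The `-F,` option is what tells awk that the fields are comma-separated; for a tab-separated file, `-F'\t'` would be used instead.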
rob mayoff
  • Thank you. Where should I specify that file.txt is comma-separated? Sorry for this question. – user2993492 Nov 20 '13 at 01:09
  • Thank you. It worked perfectly. Now I know that I would never have been able to extract all these lines with my own code. I launched my script almost 12 hours ago and it has extracted 4 million lines so far, which is less than half. Thank you so much. – user2993492 Nov 20 '13 at 15:14