I have a database with 7 columns (file.txt) and a list of names (names.txt). I want to count the lines in file.txt where a name from names.txt appears in both column 3 and column 4. In other words, I don't want to count lines where a name appears in only one of those columns or doesn't appear at all. How can I do that in Unix? Thanks.
1 Answer
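awk -F, 'BEGIN {
    # -F, above sets the field separator to a comma.
    # Load every name from names.txt into a lookup table.
    while ((getline name < "names.txt") > 0) {
        names[name] = 1
    }
    close("names.txt")
    count = 0
}
# Count the line only if columns 3 and 4 are both listed names.
($3 in names) && ($4 in names) { count++ }
END { print count }' file.txt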
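
To see what the command counts, here is a minimal sketch that runs the same logic on a tiny made-up data set. The sample names and rows, the temporary directory, and the `tmpdir` and `count` shell variables are assumptions for illustration only; they are not from the question. Note that a line is counted whenever columns 3 and 4 each contain a listed name, even if the two names differ.

```shell
# Hypothetical sample data, invented for this illustration.
tmpdir=$(mktemp -d)
printf '%s\n' alice bob carol > "$tmpdir/names.txt"
printf '%s\n' \
    '1,x,alice,bob,a,b,c' \
    '2,x,alice,dave,a,b,c' \
    '3,x,erin,bob,a,b,c' \
    '4,x,carol,carol,a,b,c' \
    '5,x,dave,erin,a,b,c' > "$tmpdir/file.txt"

# Run from inside the temporary directory because the awk program
# opens names.txt by its relative path.
count=$(cd "$tmpdir" && awk -F, 'BEGIN {
    while ((getline name < "names.txt") > 0) {
        names[name] = 1
    }
    close("names.txt")
}
($3 in names) && ($4 in names) { count++ }
END { print count + 0 }' file.txt)

echo "$count"   # prints 2 (rows 1 and 4)
rm -rf "$tmpdir"
```

Rows 2 and 3 have a listed name in only one of the two columns, and row 5 has none, so they are skipped. Printing `count + 0` makes the result `0` rather than an empty line when nothing matches.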

rob mayoff
- Thank you. Where should I specify that file.txt is comma-separated? Sorry for this question. – user2993492 Nov 20 '13 at 01:09
- Thank you. It worked perfectly. Now I know that I would never have been able to extract all these lines with my own code. I launched my script almost 12 hours ago and it has extracted 4 million lines, which is less than half. Thank you so much. – user2993492 Nov 20 '13 at 15:14
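
On the two points raised in the comments: the comma separator is given by the `-F,` option, and setting `FS = ","` in a `BEGIN` block has the same effect. If the goal is to extract the matching lines rather than only count them, the sketch below may help. It uses the common two-file `NR == FNR` idiom instead of `getline`, which is a swapped-in technique and not the answer's method, and the sample data and the `tmpdir` and `matches` variables are hypothetical.

```shell
# Hypothetical sample data, invented for this illustration.
tmpdir=$(mktemp -d)
printf '%s\n' alice bob carol > "$tmpdir/names.txt"
printf '%s\n' \
    '1,x,alice,bob,a,b,c' \
    '2,x,alice,dave,a,b,c' \
    '3,x,carol,carol,a,b,c' > "$tmpdir/file.txt"

matches=$(awk '
    BEGIN { FS = "," }                  # same effect as -F, on the command line
    NR == FNR { names[$0] = 1; next }   # first file: load the names
    ($3 in names) && ($4 in names)      # second file: no action means print the line
' "$tmpdir/names.txt" "$tmpdir/file.txt")

printf '%s\n' "$matches"
rm -rf "$tmpdir"
```

With this data the output is rows 1 and 3. One caveat of the idiom: if names.txt is empty, `NR == FNR` stays true while file.txt is read, so the `getline` version is the safer choice in that case.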