Here is an awk one-liner that prints the word counts and the total:
awk 'NR==FNR{w[$1];next}{for(i=1;i<=NF;i++)if($i in w)w[$i]++}END{for(k in w){print k,w[k];s+=w[k]}print "Total",s}' file1 file2
hello 13
foo 20
world 13
baz
bar 20
Total 66
Note: this uses Kent's example input.
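The bare baz line in the output above comes from how awk treats array elements that were referenced but never assigned: they hold an uninitialised value, which prints as an empty string. A minimal sketch of that behaviour, using a made-up two-word list rather than the original input, and `+0` as one way to force a numeric value:

```shell
# Hypothetical word list: "foo" gets incremented once, "baz" never does.
out=$(printf 'foo\nbaz\n' | awk '
    { w[$1] }                      # referencing the element creates the key
    END {
        w["foo"]++
        print "foo", w["foo"]      # incremented, so numeric
        print "baz", w["baz"]      # uninitialised, prints as empty string
        print "baz", w["baz"] + 0  # adding 0 forces a number
    }')
printf '%s\n' "$out"
# foo 1
# baz
# baz 0
```

The second line is really "baz" followed by a single space, since print still emits the output separator before the empty value.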
The more readable script version:
BEGIN {
    OFS="\t"                                # Space the output with a tab
}
NR==FNR {                                   # Only true in file1
    word_count[$1]                          # Build keys for all words
    next                                    # Get next line
}
{                                           # In file2 here
    for(i=1;i<=NF;i++)                      # For each word on the current line
        if($i in word_count)                # If the word has a key in the array
            word_count[$i]++                # Increment the count
}
END {                                       # After all files have been read
    for (word in word_count) {              # For each word in the array
        print word,int(word_count[word])    # Print the word and the count
        sum+=word_count[word]               # Sum the values
    }
    print "Total",sum                       # Print the total
}
Save it as script.awk and run it like this:
$ awk -f script.awk file1 file2
hello 13
foo 20
world 13
baz 0
bar 20
Total 66
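To try the approach without the original files, here is a small self-contained sketch. The file contents are invented for illustration and are not Kent's input. It uses the same logic as the one-liner, with `+0` added so that unseen words print 0, and pipes through sort because awk does not guarantee any particular order for `for (k in w)`:

```shell
# Hypothetical inputs: file1 lists the words to count, file2 is the text.
tmp=$(mktemp -d)
printf 'hello\nbaz\n' > "$tmp/file1"
printf 'hello world\nsay hello\n' > "$tmp/file2"

# Same logic as the one-liner; +0 makes never-seen words print 0.
result=$(awk 'NR==FNR{w[$1];next}
              {for(i=1;i<=NF;i++)if($i in w)w[$i]++}
              END{for(k in w){print k,w[k]+0;s+=w[k]}print "Total",s+0}' \
             "$tmp/file1" "$tmp/file2" | LC_ALL=C sort)

printf '%s\n' "$result"
# Total 2
# baz 0
# hello 2
rm -rf "$tmp"
```

The Total line sorts first only because uppercase letters precede lowercase ones in the C locale; drop the sort if the order does not matter.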