I have two .csv files and I would like to query column 3 from File 1 against column 3 from File 2. The output file should consist of column 3 from File 1, and if the entry exists in File 2, print the entire corresponding row from File 2. If no match in File 2, print File 1 column 3 and leave rest blank (see below).
File 1:
... ... a ...
... ... e ...
... ... b ...
... ... c ...
File 2:
... ... a a-info-1 a-info-2 a-info-n
... ... c c-info-1 c-info-2 c-info-n
... ... d d-info-1 d-info-2 d-info-n
... ... e e-info-1 e-info-2 e-info-n
... ... f f-info-1 f-info-2 f-info-n
Desired output:
a ... ... a-info-1 a-info-2 a-info-n
e ... ... e-info-1 e-info-2 e-info-n
b
c ... ... c-info-1 c-info-2 c-info-n
I have tried to accomplish this in both R and bash. I thought I should be able to figure this out by referencing this thread: awk compare 2 files, 2 fields different order in the file, print or merge match and non match lines, but I am very much new to all things programming and can't seem to figure out how to transcribe the solution to my case.
My best try:
awk -F"," 'NR==FNR{a[$3]=$3;next}{if ($3 in a)print a[$3]","$0;}' file1.csv file2.csv > output.csv
The problem with this code is that it does not print the entries from File 1 that do not have entries in File 2.
If your solutions could be overly explanatory, I would very much appreciate it!