2

I have two CSV files. 1.csv files has 718 entries and 2.csv has 68000 entries.

#cat 1.csv
#Num    #Name  
 1      BoB
 2      Jack
 3      John
 4      Hawk
 5      Scot
 ...........

#cat 2.csv
#Num #Name
1   BoB
2   John
3   Linda
4   Hawk
5   Scot
........

I knew how to compare two files,when only one column(Names) is available in both and to get the matching names.

#comm -12 <(sort 1.csv) <(sort 2.csv)

Now i would like to check, If Num in 1.csv is matching with Num in 2.csv, What is the associated "Names" from both the csv files for that matched Num ?

Result : 

1,Bob,Bob
2,Jack,John
3,John,Linda
4,Hawk,Hawk
5,Scot,Scot
..........

How to do achieve this using comm ?

Arun
  • 1,160
  • 3
  • 17
  • 33

2 Answers2

2

You can use the join command to perform inner join on 2 csv files on the 1st field i.e the number. Here is an example:

$ cat f1.csv 
1      BoB
2      Jack
3      John
4      Hawk
5      Scot
6      ExtraInF1
$ cat f2.csv 
1   BoB
3   Linda
4   Hawk
2   John
5   Scot
7   ExtraInF2
$ join <(sort -t ' ' -k 1 f1.csv) <(sort -t ' ' -k 1 f2.csv)
1 BoB BoB
2 Jack John
3 John Linda
4 Hawk Hawk
5 Scot Scot
$ join <(sort -t ' ' -k 1 f1.csv) <(sort -t ' ' -k 1 f2.csv) | tr -s ' ' ,
1,BoB,BoB
2,Jack,John
3,John,Linda
4,Hawk,Hawk
5,Scot,Scot
$

Note I have added few dummy rows(number 6 and 7) and also note that they haven't appeared in the output as they aren't present in both files.

<(sort -t ' ' -k 1 f1.csv) means process substitution i.e substitute the output of the process at this place. sort with delimiter as space(-t ' ') and on 1st key i.e 1st column(-k 1) and join by default performs inner join on 1st column of both files.

riteshtch
  • 8,629
  • 4
  • 25
  • 38
1

Another one-liner for inner join using the join command

join -1 1 -2 1 <(sort 1.csv) <(sort 2.csv) |  tr -s ' ' ,

-1 2   : sort on file 1, 1st field
-2 1   : sort on file 2, 1st field

tr -s squeezes multiple spaces into a single space and replaces it by a comma(,)

riteshtch
  • 8,629
  • 4
  • 25
  • 38
Inian
  • 80,270
  • 14
  • 142
  • 161