
I have two tab-separated value files, say:

File1.txt

chr1    894573  rs13303010  GG
chr2    18674   rs10195681  **CC**
chr3    104972  rs990284    AA  <--- Unique Line
chr4    111487  rs17802159  AA
chr5    200868  rs4956994   **GG**
chr5    303686  rs6896163   AA  <--- Unique Line
chrX    331033  rs4606239   TT
chrY    2893277 i4000106    **GG**
chrY    2897433 rs9786543   GG
chrM    57  i3002191    **TT**


File2.txt

chr1    894573  rs13303010  GG
chr2    18674   rs10195681  AT
chr4    111487  rs17802159  AA
chr5    200868  rs4956994   CC
chrX    331033  rs4606239   TT
chrY    2893277 i4000106    GA
chrY    2897433 rs9786543   GG
chrM    57  i3002191    TA

Desired Output:

Output.txt

chr1    894573  rs13303010  GG
chr2    18674   rs10195681  AT
chr3    104972  rs990284    AA  <--Unique Line from File1.txt
chr4    111487  rs17802159  AA
chr5    200868  rs4956994   CC
chr5    303686  rs6896163   AA  <--Unique Line from File1.txt
chrX    331033  rs4606239   TT
chrY    2893277 i4000106    GA
chrY    2897433 rs9786543   GG
chrM    57  i3002191    TA

File1.txt has 10 entries in total, while File2.txt has 8. I want to compare the two files using Column 1 and Column 2.

If the first two column values are the same in both files, the corresponding line from File2.txt should be printed to Output.txt.

When File1.txt has a unique combination (Column1:Column2 that is not present in File2.txt), the corresponding line from File1.txt should be printed to Output.txt.

I tried various awk and perl combinations available on this website, but couldn't get the correct answer. Any suggestions will be helpful.

Thanks, Amit

Amit Goyal

1 Answer


Next time, show the awk code you tried so we can help with the error or whatever is missing.

awk 'NR==FNR || (NR>=FNR&&($1","$2 in k)){k[$1,$2]=$0}END{for(K in k)print k[K]}' file1 file2
NeronLeVelu
  • It looks like it's printing the information from the first file to the result when there is a match, not from the second file. – Amit Goyal Dec 09 '15 at 00:29
  • You don't specify whether lines of the second file that are not in the first one should be printed. Should they? – NeronLeVelu Dec 10 '15 at 13:20
  • In fact, yes! In the case of a match (first two columns) it should print from File2.txt. – Amit Goyal Dec 11 '15 at 02:49
  • I tried a dirty combination: merging both files, finding the unique lines (meaning unique first two columns), adding those unique lines to File2.txt, and finally sorting the updated File2.txt to get the output file. All this works fine, but I am still trying to get it directly just by comparison. Thanks, Amit – Amit Goyal Dec 11 '15 at 02:51
  • Sorry, I understood *if match* as occurring only when the pair in file2 already exists in file1. So you also want to print pairs from file2 that are not in file1? – NeronLeVelu Dec 11 '15 at 07:54
  • Yeah, if the pair in file2 already exists in file1 it should print from file2 to the output, else it should print from file1. Thanks – Amit Goyal Dec 14 '15 at 00:44
  • The question was *also print pairs from file2 that are not in file1*. – NeronLeVelu Dec 14 '15 at 06:50
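A possible fix, offered as a sketch rather than a definitive answer. In the command above, the test `$1","$2 in k` builds a key joined with a literal comma, while `k[$1,$2]` joins the two fields with awk's `SUBSEP`, so the test never matches and only File1 lines are stored. In addition, `for (K in k)` visits keys in an unspecified order. The version below reads File2.txt first, then walks File1.txt in its original order and prints the File2 line whenever the (column 1, column 2) pair matches. It assumes that lines present only in File2.txt should not be printed, which is what the question's desired output shows. The scratch directory and the `printf` calls are only there to recreate the sample data.

```shell
# Recreate the sample data from the question (tab-separated) in a scratch directory.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

printf '%s\t%s\t%s\t%s\n' \
  chr1 894573  rs13303010 GG \
  chr2 18674   rs10195681 CC \
  chr3 104972  rs990284   AA \
  chr4 111487  rs17802159 AA \
  chr5 200868  rs4956994  GG \
  chr5 303686  rs6896163  AA \
  chrX 331033  rs4606239  TT \
  chrY 2893277 i4000106   GG \
  chrY 2897433 rs9786543  GG \
  chrM 57      i3002191   TT > File1.txt

printf '%s\t%s\t%s\t%s\n' \
  chr1 894573  rs13303010 GG \
  chr2 18674   rs10195681 AT \
  chr4 111487  rs17802159 AA \
  chr5 200868  rs4956994  CC \
  chrX 331033  rs4606239  TT \
  chrY 2893277 i4000106   GA \
  chrY 2897433 rs9786543  GG \
  chrM 57      i3002191   TA > File2.txt

# File2.txt is read first and indexed by (col1, col2).
# File1.txt is read second and drives the output order.
awk -F'\t' '
  NR == FNR { k[$1, $2] = $0; next }   # first file: remember the whole line per key
  {
    if (($1, $2) in k) print k[$1, $2] # key also in File2.txt: print the File2 line
    else               print $0        # key only in File1.txt: keep the File1 line
  }
' File2.txt File1.txt > Output.txt

cat Output.txt
```

Note that the file order on the awk command line is File2.txt first, File1.txt second. If lines that exist only in File2.txt were also wanted, they would need an extra pass (for example, marking keys as seen and printing the unseen ones in an `END` block), at the cost of losing the File1 ordering for those lines.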