
I'm comparing two files, file1 and file2, by matching the first column of file1 against the first column of file2 and retrieving the corresponding value from the 7th column of file2.

awk -F, 'FNR==NR{a[$1]=$7;next} {print (($1 in a) ? $0","a[$1] : $0",NA");}' file2.txt file1.txt > tmp && mv tmp file1.txt

The same comparison runs again the next day and appends that day's result as a new column.

cat file1.txt 

N1,N2,N3,N4,N5,N6,D1,D2,D3,D4,D5,D6,D7,D8,D9,D10
XX,ZZ,XC,EE,RR,BB,OK,OK,OK,OK,OK,OK,OK,OK
XC,CF,FG,RG,GH,GH,NA,NA,NA,NA,NA,NA,NA,NA,NA
DM,DF,GR,TH,EW,BB

cat file2.txt
DF,GH,MH,FR,FG,GH,NA
XX,ZZ,XC,EE,RR,BB,OK

awk -F, 'FNR==NR{a[$1]=$7;next} {print (($1 in a) ? $0","a[$1] : $0",NA");}' file2.txt file1.txt > tmp && mv tmp file1.txt

mv: overwrite `file1.txt'? y

 cat file1.txt
N1,N2,N3,N4,N5,N6,D1,D2,D3,D4,D5,D6,D7,D8,D9,D10,NA ---> Header
XX,ZZ,XC,EE,RR,BB,OK,OK,OK,OK,OK,OK,OK,OK,OK,OK
XC,CF,FG,RG,GH,GH,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
DM,DF,GR,TH,EW,BB,NA

After adding a new row:

DM,DF,GR,TH,EW

The problem is that the header line is compared as well, so a result gets appended to it, and for the newly inserted row in file1 the result is printed under header D1 instead of D10.

How can I get output like this, where the comparison excludes the header and the result is placed under the last column header?

N1,N2,N3,N4,N5,N6,D1,D2,D3,D4,D5,D6,D7,D8,D9,D10
XX,ZZ,XC,EE,RR,BB,OK,OK,OK,OK,OK,OK,OK,OK,OK,OK
XC,CF,FG,RG,GH,GH,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
DM,DF,GR,TH,EW,BB                           ,NA
mpapec
XYZ

1 Answer


To keep the header from being updated, change the awk program to the following:

'FNR==NR{a[$1]=$7;next} FNR==1{print $0; next} {print (($1 in a) ? $0","a[$1] : $0",NA");}'

In this case the 1st line of file1.txt is printed as is, without any changes.

But don't you also need to add the new day (like "D10" in the example) to the header on each run? Or do you do that elsewhere?
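A quick way to check the header rule, as a minimal sketch on a shortened version of the question's sample data. The scratch directory and the `result` variable are illustrative assumptions, not part of the original setup:

```shell
# Scratch copies of (shortened) sample data; directory and names are illustrative.
dir=$(mktemp -d)
printf '%s\n' 'N1,N2,N3,N4,N5,N6,D1' 'XX,ZZ,XC,EE,RR,BB' 'DM,DF,GR,TH,EW,BB' > "$dir/file1.txt"
printf '%s\n' 'DF,GH,MH,FR,FG,GH,NA' 'XX,ZZ,XC,EE,RR,BB,OK' > "$dir/file2.txt"

# file2 is read first and fills the lookup table a[];
# FNR==1 then matches only the first line of file1 (the header) and passes it through.
result=$(awk -F, 'FNR==NR{a[$1]=$7;next} FNR==1{print $0; next} {print (($1 in a) ? $0","a[$1] : $0",NA")}' "$dir/file2.txt" "$dir/file1.txt")
printf '%s\n' "$result"
rm -rf "$dir"
```

The header comes out unchanged, the `XX` row gains `,OK`, and the `DM` row, which has no match in file2, gains `,NA`.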

As for the 2nd question (printing the new value at the same position in a shorter line as in a longer line), you should further modify the awk program so that it remembers the header length and pads each shorter line with spaces up to it:

'FNR==NR{a[$1]=$7;next} FNR==1{print $0; len=length($0); next} {printf "%s", $0; cont=(($1 in a) ? ","a[$1] : ",NA"); for (i=length($0)+1;i<=len-length(cont);i++) printf " " ; print cont;}'
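The padding logic can be tried out on a small sample. This is a sketch under assumptions: the two-day header, the scratch directory, and the `aligned` variable are made up for illustration, and the line is printed with `printf "%s", $0` so that a `%` in the data would not be read as a format specifier:

```shell
# Scratch copies of a small sample; directory and names are illustrative.
dir=$(mktemp -d)
printf '%s\n' 'N1,N2,N3,N4,N5,N6,D1,D2' 'XX,ZZ,XC,EE,RR,BB,OK' 'DM,DF,GR,TH,EW,BB' > "$dir/file1.txt"
printf '%s\n' 'XX,ZZ,XC,EE,RR,BB,OK' > "$dir/file2.txt"

aligned=$(awk -F, '
  FNR==NR { a[$1]=$7; next }                # first file: build the lookup table
  FNR==1  { print; len=length($0); next }   # header: print as is, remember its width
  {
    cont = ($1 in a) ? "," a[$1] : ",NA"    # value to append, with its leading comma
    printf "%s", $0
    # pad with spaces so that cont ends exactly where the header ends
    for (i=length($0)+1; i<=len-length(cont); i++) printf " "
    print cont
  }' "$dir/file2.txt" "$dir/file1.txt")
printf '%s\n' "$aligned"
rm -rf "$dir"
```

Here the `XX` row is already only one value short of the header, so no padding is added, while the shorter `DM` row gets three spaces before `,NA`. The alignment relies on every value being as wide as its header label, which holds for `D1`..`D9` but not for a wider label such as `D10`.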
striving_coder
  • You can remove the `$0` from `print $0`. It will do the default action, print the line. – Jotne Dec 17 '14 at 16:39
  • @Jotne: Yep, I know, but prefer this way for clarity and readability. – striving_coder Dec 17 '14 at 16:40
  • But how to combine both conditions ? – XYZ Dec 19 '14 at 08:14
  • Not printing result in Header and also printing the results of new row under recent header – XYZ Dec 19 '14 at 08:15
  • N1,N2,N3,N4,N5,N6,D1,D2,D3,D4,D5,D6,D7,D8,D9,D10 XX,ZZ,XC,EE,RR,BB,OK,OK,OK,OK,OK,OK,OK,OK,OK,OK XC,CF,FG,RG,GH,GH,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA DM,DF,GR,TH,EW,BB ,,,,,,,,,,,NA – XYZ Dec 19 '14 at 08:16
  • @sam_84: The second awk code line in my answer will do it for you, except that it won't be all the commas (`BB,,,,,NA`) before the last value in the updated row (as in your last comment) - it will rather be many spaces (as needed) followed by 1 comma and new value (`BB ,NA`) because that's what you asked for in your question. – striving_coder Dec 21 '14 at 22:00
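If empty CSV fields are wanted instead of spaces, as in the comment above, one possible variation is to pad short rows with commas up to the header's column count. This is only a sketch: it assumes the header already contains the new day's column before the run, and the sample data, scratch directory, and the `cols`, `val`, `line`, and `padded` names are illustrative:

```shell
# Scratch copies of a small sample; directory and names are illustrative.
dir=$(mktemp -d)
printf '%s\n' 'N1,N2,N3,N4,N5,N6,D1,D2,D3' 'XX,ZZ,XC,EE,RR,BB,OK,OK' 'DM,DF,GR,TH,EW,BB' > "$dir/file1.txt"
printf '%s\n' 'XX,ZZ,XC,EE,RR,BB,OK' > "$dir/file2.txt"

padded=$(awk -F, '
  FNR==NR { a[$1]=$7; next }        # first file: build the lookup table
  FNR==1  { print; cols=NF; next }  # header: print as is, remember its column count
  {
    val  = ($1 in a) ? a[$1] : "NA"
    line = $0
    # add empty fields until only the last header column is left to fill
    for (i=NF; i<cols-1; i++) line = line ","
    print line "," val
  }' "$dir/file2.txt" "$dir/file1.txt")
printf '%s\n' "$padded"
rm -rf "$dir"
```

With this variant every row ends up with the same number of fields as the header, so the file stays a regular CSV and the new value always sits in the last column.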