I'm using awk to merge multiple (>3) files, and I want to keep the headers. I found a previous post that does exactly what I need, but I don't quite understand what's happening. I was hoping someone could walk me through it so I can learn from it! (I tried commenting on the original post but did not have enough reputation.)
This code
awk '{a[FNR]=((a[FNR])?a[FNR]FS$2:$0)}END{for(i=1;i<=FNR;i++) print a[i]}' f*
transforms the input files as desired. See example tables below.
Input files:
file1.txt:
id value1
a 10
b 30
c 50
file2.txt:
id value2
a 90
b 30
c 20
file3.txt:
id value3
a 0
b 1
c 25
Desired output:
merge.txt:
id value1 value2 value3
a 10 90 0
b 30 30 1
c 50 20 25
Again, here's the code
awk '{a[FNR]=((a[FNR])?a[FNR]FS$2:$0)}END{for(i=1;i<=FNR;i++) print a[i]}' f* > merge.txt
I'm having trouble understanding the first part of the code, {a[FNR]=((a[FNR])?a[FNR]FS$2:$0)}, but I understand the loop in the second part.
I think the first part of the code builds an array. My guess is that the code runs through the files and checks for matching records on the first column (id), and if there's a match it appends the second column ($2) value and prints the entire record ($0).
But... I don't understand the beginning syntax. When is it established that the first column (id) is the same across all three files, and that only the second column should be added?
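To make the question concrete, here is my attempt at spelling the one-liner out with one statement per line. This is only a sketch of how I read it, not the original poster's code: the scratch directory, the explicit file names (instead of f*), and the `merged` variable are my own additions. The comments describe what each statement does mechanically. As far as I can tell, nothing in it ever compares $1 between files, which is exactly the part that confuses me.

```shell
# Work in a scratch directory so only the sample files are involved (my assumption).
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'id value1\na 10\nb 30\nc 50\n' > file1.txt
printf 'id value2\na 90\nb 30\nc 20\n' > file2.txt
printf 'id value3\na 0\nb 1\nc 25\n'  > file3.txt

merged=$(awk '
{
    # FNR is the line number within the current file; it restarts at 1 for each file.
    if (a[FNR])                  # something is already stored for this line number
        a[FNR] = a[FNR] FS $2    # append the field separator and this line'"'"'s 2nd field
    else
        a[FNR] = $0              # nothing stored yet: keep the whole line
}
END {
    # FNR still holds the number of lines in the last file that was read.
    for (i = 1; i <= FNR; i++) print a[i]
}' file1.txt file2.txt file3.txt)

printf '%s\n' "$merged"
```

With the three sample files this prints the same four lines as merge.txt above, so the long form seems to behave like the one-liner for this data.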