I have a file in which I have to merge 2 rows on the basis of:
- Common sessionID
- Immediate next matching pattern (GX with QG)
file1:
session=001,field01,name=GX1_TRANSACTION,field03,field04
session=001,field91,name=QG
session=001,field01,name=GX2_TRANSACTION,field03,field04
session=001,field92,name=QG
session=004,field01,name=GX1_TRANSACTION,field03,field04
session=002,field01,name=GX1_TRANSACTION,field03,field04
session=002,field01,name=GX2_TRANSACTION,field03,field04
session=002,field92,name=QG
session=003,field91,name=QG
session=003,field01,name=GX2_TRANSACTION,field03,field04
session=003,field92,name=QG
session=004,field91,name=QG
session=004,field01,name=GX2_TRANSACTION,field03,field04
session=004,field92,name=QG
I have created an awk (I am new and learnt awk only from This portal only) which created my desired output.
Output1
session=001,field01,name=GX1_TRANSACTION,field03,field04,session=001,field91,name=QG
session=001,field01,name=GX2_TRANSACTION,field03,field04,session=001,field92,name=QG
session=002,field01,name=GX1_TRANSACTION,field03,field04,NOMATCH-QG
session=002,field01,name=GX2_TRANSACTION,field03,field04,session=002,field92,name=QG
session=003,field01,name=GX2_TRANSACTION,field03,field04,session=003,field92,name=QG
session=004,field01,name=GX1_TRANSACTION,field03,field04,session=004,field91,name=QG
session=004,field01,name=GX2_TRANSACTION,field03,field04,session=004,field92,name=QG
Output2: Pending
session=003,field91,name=QG
Awk:
{
if($0~/name=GX1_TRANSACTION/ || $0~/GX2_TRANSACTION/) {
if($1 in ccr)
print ccr[$1]",NOMATCH-QG";
ccr[$1]=$0;
}
if($0~/name=QG/) {
if($1 in ccr) {
print ccr[$1]","$0;
delete ccr[$1];
}
else {
print $0",NOUSER" >> Pending
}
}
}
END {
for (i in ccr)
print ccr[i]",NOMATCH-QG"
}
Command:
awk -F"," -v Pending=t -f a.awk file1
But Issue is my "file1" is really big, So I want to improve the performance of this script. Is their any way by which I can improve its performance?