I need help with the following:

Input file:

abc message=sent session:111,x,y,z
pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z
pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z
abc message=sent session:589,x,y,z
pqr message=receive session:589,4,5,7

Output file:

abc message=sent session:111,x,y,z, pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z, pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z, NOMATCH
abc message=sent session:589,x,y,z, pqr message=receive session:589,4,5,7

Notes:

In the source file, every "sent" message has a matching "receive",
except session 342, which has no receive.
The session numbers are unknown in advance, so they can't be hardcoded.
So merge only those sent and receive lines that have a matching session number.

Steve
Vipin Choudhary

2 Answers


Here's one way using awk. Run like:

awk -f script.awk file

Contents of script.awk:

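{
    x = $0                          # keep the original line

    gsub(/[^:]*:|,.*/,"")           # reduce $0 to the session number

    # Append this line to whatever is already stored for the session.
    a[$0] = (a[$0] ? a[$0] "," FS : "") x
    b[$0]++                         # count the lines seen for the session
}

END {
    for (i in a) {
        # Two lines mean a sent/receive pair; anything else is unmatched.
        print (b[i] == 2 ? a[i] : a[i] "," FS "NOMATCH") | "sort"
    }
}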

Results:

abc message=sent session:111,x,y,z, pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z, pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z, NOMATCH
abc message=sent session:589,x,y,z, pqr message=receive session:589,4,5,7

Alternatively, here's the one-liner:

awk '{ x = $0; gsub(/[^:]*:|,.*/,""); a[$0] = (a[$0] ? a[$0] "," FS : "") x; b[$0]++ } END { for (i in a) print (b[i] == 2 ? a[i] : a[i] "," FS "NOMATCH") | "sort" }' file

Note that you can drop the pipe to sort if you don't care about sorted output. HTH.
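The key step in the script above is the gsub call, which strips everything up to and including the first colon and everything from the first comma onwards, leaving only the session number in $0. A minimal sketch that isolates that step, assuming input lines shaped like the ones in the question (the function name session_key is made up for illustration):

```shell
# session_key is a hypothetical helper name, not part of the answer.
# It prints only the session number extracted from each input line.
session_key() {
    awk '{
        # Remove "...session:" and ",x,y,z"; what remains is the key.
        gsub(/[^:]*:|,.*/, "")
        print
    }' "$@"
}
```

Feeding it the first line of the input file, for example with printf and a pipe, should print 111.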

Steve

Another way:

awk -F "[:,]"  '/=sent/{a[$2]=$0;}/=receive/{print a[$2], $0;delete a[$2];}END{for(i in a)print a[i],"NO MATCH";}' file

Results:

abc message=sent session:111,x,y,z pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z pqr message=receive session:123,4,5,7
abc message=sent session:589,x,y,z pqr message=receive session:589,4,5,7
abc message=sent session:342,x,y,z NO MATCH

With -F "[:,]", each line is split on colons and commas, so $2 is the session id. When a sent record is encountered, it is stored in the array with the session id as the index. When a receive record is encountered, the sent record is fetched from the array and printed along with the receive record, and the sent record is then deleted from the array. At the END, all records remaining in the array are printed with NO MATCH.
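The same logic may be easier to follow when the one-liner is spread out with one rule per line. This is only a restatement of the command above; the wrapper name merge_sessions and the array name pending are made up for illustration:

```shell
# merge_sessions is a hypothetical wrapper name; the awk rules mirror the one-liner.
merge_sessions() {
    awk -F '[:,]' '
        # A sent record: remember the whole line, keyed by session id ($2).
        /=sent/ { pending[$2] = $0 }

        # A receive record: print it next to the stored sent line,
        # then forget that session.
        /=receive/ { print pending[$2], $0; delete pending[$2] }

        # Whatever is still stored at the end never got a receive.
        END { for (id in pending) print pending[id], "NO MATCH" }
    ' "$@"
}
```

Run it as merge_sessions file, or pipe the input into it. Note that the order of the NO MATCH lines depends on how awk walks the array, so it is not guaranteed when several sessions are unmatched.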

Guru
  • Thanks a lot for this, but I am not able to understand the logic. Could you please explain it? – Vipin Choudhary Feb 13 '13 at 06:54
  • Hello again Guru. I have one more query: is it possible to print the output in its original order? For example, NO MATCH should be printed right after session 123. – Vipin Choudhary Feb 14 '13 at 09:52
  • To rephrase: for every "=sent", search for "=receive" in the **immediate** NEXT LINE ONLY, for the same session number.
    So merge only those sent and receive lines where the session number matches; ELSE print the sent line as it is, in sequence.
    – Vipin Choudhary Feb 14 '13 at 12:27
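One possible way to get the behaviour described in the last comment is to hold each sent line until the next line has been read, instead of storing it in an array. This is a rough sketch under some assumptions, not something taken from either answer: the names merge_adjacent, emit_held, held and id are made up, and a receive line with no sent line directly before it is simply printed unchanged, since the comments do not say what should happen to it.

```shell
# merge_adjacent is a hypothetical name; it pairs a sent line only with
# a receive line that follows it immediately and has the same session id.
merge_adjacent() {
    awk -F '[:,]' '
        # Print the held sent line, if there is one, and clear it.
        function emit_held() { if (held != "") print held; held = "" }

        # A sent line: flush any earlier unmatched sent line, then hold this one.
        /=sent/ { emit_held(); held = $0; id = $2; next }

        # A receive line right after a sent line with the same session id.
        /=receive/ && held != "" && $2 == id { print held, $0; held = ""; next }

        # Anything else (assumption): flush the held line and print as is.
        { emit_held(); print }

        # The last line of the file may be an unmatched sent line.
        END { emit_held() }
    ' "$@"
}
```

With the input file from the question, the line for session 342 should come out on its own between sessions 123 and 589. To mark it instead of printing it as is, the print inside emit_held could be changed to print held, "NO MATCH".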