awk to process the first two lines then the next two and so on

Question

Suppose I have a very large file that I created from two files, one old and one updated, by using cat and sort on the primary key.

File1

102310863||7097881||6845193||271640||06007709532577||||
102310863||7097881||6845123||271640||06007709532577||||
102310875||7092992||6840808||023740||10034500635650||||
102310875||7092992||6840818||023740||10034500635650||||

So the pattern of this file is: line 1 = old value, line 2 = updated value, and so on.

Now I want to process the file in such a way that awk first processes the first two lines of the file, finds the differences, and then moves on to the next two lines.

The rule for each field is:

if($[old record]!=$[new record])
    i= [new record]#[old record];

Desired output

102310863||7097881||6845123#6845193||271640||06007709532577||||
102310875||7092992||6840818#6840808||023740||10034500635650||||

score 2 · Accepted Answer · answered May 25 '15 at 07:33

$ cat tst.awk
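BEGIN { FS="[|][|]"; OFS="||" }
# Odd lines hold the old record: save its fields and move on.
NR%2 { split($0,old); next }
# Even lines hold the updated record: tag each changed field as new#old.
{
    for (i=1;i<=NF;i++) {
        if (old[i] != $i) {
            $i = $i "#" old[i]
        }
    }
    print
}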
$
$ awk -f tst.awk file
102310863||7097881||6845123#6845193||271640||06007709532577||||
102310875||7092992||6840818#6840808||023740||10034500635650||||

anishsane · Answer 2 · 2015-05-25T08:21:04.957

1

This awk could help:

$ awk -F '\\|\\|' '{
       # Read the updated record; skip an unpaired trailing line.
       if ((getline new) <= 0) next;
       split(new, new_array, "\\|\\|");
       for (i = 1; i <= NF; i++) {
           if ($i != new_array[i]) {
               $i = new_array[i] "#" $i;
           }
       }
   } 1' OFS="||" < input_file

102310863||7097881||6845123#6845193||271640||06007709532577||||
102310875||7092992||6840818#6840808||023740||10034500635650||||

I think you know awk well enough to understand the code above, so I am skipping the explanation.
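For readers who do want the steps spelled out, here is a commented sketch of the same getline-based idea. The wrapper name `diff_pairs` is made up for illustration, and the check on the return value of `getline` is an addition that assumes an unpaired last line should simply be dropped.

```shell
# diff_pairs: hypothetical wrapper name, not part of the original answer.
# Reads old/new record pairs from the given files (or stdin).
diff_pairs() {
    awk -F '[|][|]' -v OFS='||' '
    {
        # $0 is the old record. Pull the following line (the updated
        # record) into "new"; getline returns 0 at end of input.
        if ((getline new) <= 0) next

        # Split the updated record on the same separator as the old one.
        split(new, new_array, FS)

        # Rewrite every field that changed as new#old. Assigning to $i
        # makes awk rebuild $0 with OFS between the fields.
        for (i = 1; i <= NF; i++)
            if ($i != new_array[i])
                $i = new_array[i] "#" $i

        print
    }' "$@"
}

# Usage with the sample data from the question:
printf '%s\n' \
    '102310863||7097881||6845193||271640||06007709532577||||' \
    '102310863||7097881||6845123||271640||06007709532577||||' \
    '102310875||7092992||6840808||023740||10034500635650||||' \
    '102310875||7092992||6840818||023740||10034500635650||||' |
    diff_pairs
```

Without the return-value check, a file with an odd number of lines would compare its last record against the stale contents of `new` left over from the previous pair.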

edited May 25 '15 at 08:21

answered May 25 '15 at 06:54

Similar idea to my own, but far more concise execution. Upvoted. – chw21 May 25 '15 at 06:59

chw21 · Answer 3 · 2015-05-25T06:57:34.867

0

Updated version, and thanks @martin for the double | trick:

$ cat join.awk
BEGIN   {new=0; FS="[|]{2}"; OFS="||"}
new==0  {
         split($0, old_data, "[|]{2}")
         new=1
         next
        }
new==1  {
         split($0, new_data, "[|]{2}")
         for (i = 1; i <= 7; i++) {
             if (new_data[i] != old_data[i]) new_data[i] = new_data[i] "#" old_data[i]
         }
         print new_data[1], new_data[2], new_data[3], new_data[4], new_data[5], new_data[6], new_data[7]
         new = 0
        }
$ awk -f join.awk data.txt
102310863||7097881||6845123#6845193||271640||06007709532577||||
102310875||7092992||6840818#6840808||023740||10034500635650||||

edited May 25 '15 at 06:57

answered May 25 '15 at 06:36

Use `awk -F'[|]{2}'`. I was just thinking about doing it similarly to you, after I figured out the awk and `|` part. Alternatively, `awk -F'\\|\\|'` also works. – martin May 25 '15 at 06:39
Instead of hard-coded number of fields, you can use the return value of `split` or simply `NF`... – anishsane May 25 '15 at 07:02
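The suggestion about the field count can be sketched as follows. This is only an illustration: the wrapper name `join_pairs` and the `out` accumulator are made up here, and `[|][|]` is used in place of `[|]{2}` on the assumption that some older awk implementations do not support interval expressions.

```shell
# join_pairs: hypothetical wrapper name used only for this sketch.
join_pairs() {
    awk '
    BEGIN { FS = "[|][|]"; OFS = "||" }

    # Odd lines: remember the old record and wait for its partner.
    NR % 2 == 1 { split($0, old_data); next }

    # Even lines: split() returns the number of fields, so the loop
    # bound no longer depends on a fixed count of 7.
    {
        n = split($0, new_data)
        out = ""
        for (i = 1; i <= n; i++) {
            if (new_data[i] != old_data[i])
                new_data[i] = new_data[i] "#" old_data[i]
            out = out (i > 1 ? OFS : "") new_data[i]
        }
        print out
    }' "$@"
}

# Usage: records with any number of fields work the same way.
printf '%s\n' 'a||b||c' 'a||x||c' | join_pairs
```

Building the output line in a loop replaces the seven-argument `print`, so the script does not need editing if the record layout gains or loses columns.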
