
I have two large pipe-delimited files that I need to compare.

file 1

a||d||f||a
1||2||3||4

file 2

a||d||f||a
1||1||3||4
1||2||r||f

Now I want to compare the files and print the differences: any value updated in file 2 should be printed as updated_value#old_value, and any new line added to file 2 should be printed as well.

So the desired output is (only the updated and new data):

1||1#2||3||4
1||2||r||f

What I have tried so far only gets the individual changed values:

awk -F '[||]+' 'NR==FNR{for(i=1;i<=NF;i++)a[NR,i]=$i;next}{for(i=1;i<=NF;i++)if(a[FNR,i]!=$i)print $i"#"a[FNR,i]}' file1 file2 >output

But I want to print the whole line. How can I achieve that?

bongboy
  • It is a bit unclear what you intend. Also, it is worth indicating what you tried. – fedorqui May 18 '15 at 10:31
  • sure let me update my answer – bongboy May 18 '15 at 10:32
  • so you are just comparing the common lines? that is, line 1 on file1 with line 1 on file2, etc. – fedorqui May 18 '15 at 10:39
  • yes, can u please help me how to do it... – bongboy May 18 '15 at 10:46
  • What is this behaviour? I answered your question, you accepted it and then just unaccepted it without any kind of explanation. If you have a new question, ask a new question. If you weren't consistent in your requirements, say so and provide some feedback. – fedorqui May 18 '15 at 12:41
  • I'm extremely sorry. I'm new to Stack Overflow, even though I have asked a couple of questions. I guess the answer got unchecked by mistake. – bongboy May 18 '15 at 12:43
  • @fedorqui can u please review the changed required? – bongboy May 18 '15 at 12:46
  • I don't understand what is the logic you expect here. How can I know if a line is "new"? Is there any specific pattern to reflect that? How do you "explain" it in an algorithmic way? – fedorqui May 18 '15 at 13:00
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/78091/discussion-between-bongboy-and-fedorqui). – bongboy May 18 '15 at 13:09

1 Answer


I would say:

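awk 'BEGIN{FS=OFS="|"}
     # file 1: store each data field as a[line number, field number]
     FNR==NR {for (i=1;i<=NF;i+=2) a[FNR,i]=$i; next}
     # file 2: mark changed fields as new#old; "in" tests existence,
     # so fields holding 0 or an empty string are compared too
     {for (i=1; i<=NF; i+=2)
         if ((FNR,i) in a && a[FNR,i]!=$i)
             $i=$i"#"a[FNR,i]
     }1' f1 f2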

This stores file1 in a matrix a[line number, column]. Then it compares each value with the corresponding field on the same line of file2.

Note that I am using | as the field separator instead of || and looping in steps of two to reach the actual data fields. I did this because, for example, gawk -F'||' '{print NF}' f1 printed just 1, meaning that FS wasn't interpreted the way I expected. I'd be grateful if someone could point out the error here!
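
On the separator question: as far as I know, awk treats any FS longer than one character as a regular expression, and there | is the alternation operator, so || does not match two literal pipes. A minimal sketch, assuming the same kind of input as above: putting each pipe in a bracket expression makes it literal, which allows OFS="||" and a plain step-by-one loop. The function name compare_double_pipe is only illustrative.

```shell
# Hypothetical helper: compare two files field by field, splitting on a literal "||".
compare_double_pipe() {
    awk 'BEGIN { FS = "[|][|]"; OFS = "||" }   # [|] matches one literal pipe
         FNR == NR { for (i = 1; i <= NF; i++) a[FNR, i] = $i; next }
         { for (i = 1; i <= NF; i++)
               if ((FNR, i) in a && a[FNR, i] != $i)
                   $i = $i "#" a[FNR, i]
         }
         1' "$1" "$2"
}

# Field count with the bracketed separator (4 for a line like a||d||f||a).
printf '%s\n' 'a||d||f||a' | awk -F'[|][|]' '{ print NF }'
```

It would be called as compare_double_pipe f1 f2 and, like the command above, prints every line of the second file.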

Test

$ awk 'BEGIN{FS=OFS="|"} FNR==NR {for (i=1;i<=NF;i+=2) a[FNR,i]=$i; next} {for (i=1; i<=NF; i+=2) if ((FNR,i) in a && a[FNR,i]!=$i) $i=$i"#"a[FNR,i]}1' f1 f2
a||d||f||b#a
1||1#2||3||4
1||2||r||f
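
The command above prints every line of f2. If only the updated and new lines are wanted, as in the question's desired output, a possible variant is sketched below. It assumes that a line counts as "new" when its line number does not exist in the first file, which is a guess about the intended logic; print_changes is an illustrative name.

```shell
# Hypothetical helper: print only the lines of file 2 that changed or are new.
print_changes() {
    awk 'BEGIN { FS = OFS = "|" }
         FNR == NR {                       # file 1: remember lines and fields
             seen[FNR] = 1
             for (i = 1; i <= NF; i += 2) a[FNR, i] = $i
             next
         }
         !(FNR in seen) { print; next }    # line number absent from file 1
         {
             changed = 0
             for (i = 1; i <= NF; i += 2)
                 if ((FNR, i) in a && a[FNR, i] != $i) {
                     $i = $i "#" a[FNR, i]
                     changed = 1
                 }
             if (changed) print
         }' "$1" "$2"
}
```

It would be called as print_changes file1 file2. Lines are matched purely by position, so a line inserted in the middle of file 2 would make every following line look changed.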
fedorqui