
I am getting the right result from the awk program below, but I don't understand the order in which awk executes its lines:

{
    for(i = 1; i <= NF; i++)
    {
    if (min[i]==""){ print "initial min " $i; min[i]=$i;} #line1
    if (max[i]==""){ print "initial max " $i; max[i]=$i;} #line2
    if ($i<min[i]) { print "New min " $i; min[i]=$i;}     #line3
    if ($i>max[i]) { print "New max " $i; max[i]=$i;}     #line4
    }
}
END {
   OFS="\t";
   print "min","max";
   for(i = 1; i <= NF; i++)
   {
   print min[i],max[i];
   }     
}

The dataset's fields are separated by spaces:

0.4 1.4 2.4 3.4
0.3 1.3 2.3 3.3 
0.1 1.1 2.1 3.1
0.2 1.2 2.2 3.2
0.5 1.5 2.5 3.5

Output

initial min 0.4
initial max 0.4
initial min 1.4
initial max 1.4
initial min 2.4
initial max 2.4
initial min 3.4
initial max 3.4
New min 0.3
New min 1.3
New min 2.3
New min 3.3
New min 0.1
New min 1.1
New min 2.1
New min 3.1
New max 0.5
New max 1.5
New max 2.5
New max 3.5

min max
0.1 0.5
1.1 1.5
2.1 2.5
3.1 3.5

The messages from lines 1 and 2 (initial min and initial max) alternate, but the messages from lines 3 and 4 appear in runs: a new min for every field (column), and only later a new max for every field. So how does awk really work?

Murlidhar Fichadia
  • 1
    `{}` blocks are executed on every line by default. The end block only executes when all lines have been processed. – 123 Jun 17 '16 at 14:08
  • But look at the top of the output: lines 1 and 2 run alternately for each record. The min and max for each field are set one after another, rather than the min being set for all fields first and then the max for all fields. – Murlidhar Fichadia Jun 17 '16 at 14:17
  • 1
    That isn't what happens at all, every `if` runs on every field of every line. – 123 Jun 17 '16 at 14:19
  • 1
    The body of the first `for` loop is executed for record-1, column-1, and then for record-1, column-2; record-1, column-3; record-1, column-4; record-2, column-1, record-2, column-2, etc., exactly as seen in the output. Maybe if you include `NR` (the current record being processed) and `i` (the current column being looked at) in your diagnostic output it will be clearer. – jas Jun 17 '16 at 14:26
  • 1
    Use `print "initial min["i"] " $i` and `print "initial max["i"] " $i` instead and you'll understand by yourself – Camusensei Jun 17 '16 at 14:29

1 Answer


I edited your code into:

{
    for(i = 1; i <= NF; i++)
    {
    if (min[i]==""){ print "initial min["i"] " $i; min[i]=$i;} #line1
    if (max[i]==""){ print "initial max["i"] " $i; max[i]=$i;} #line2
    if ($i<min[i]) { print "New min["i"] " $i; min[i]=$i;}     #line3
    if ($i>max[i]) { print "New max["i"] " $i; max[i]=$i;}     #line4
    }
}
END {
   OFS="\t";
   print "min","max";
   for(i = 1; i <= NF; i++)
   {
   print min[i],max[i];
   }     
}

Now, with its output (this run used a different sample input), you should be able to see what happens:

initial min[1] 0.3
initial max[1] 0.3
initial min[2] 3.3
initial max[2] 3.3
initial min[3] 0.5
initial max[3] 0.5
initial min[4] 3.6
initial max[4] 3.6
New max[1] 0.9
New max[2] 4.7
New max[3] 2.5
New min[4] 1.6
New min[1] 0.2
New min[2] 2.7
New max[3] 6.3
New max[4] 9.3
New min[2] 1.6
New max[3] 8.9
min     max
0.2     0.9
1.6     4.7
0.5     8.9
1.6     9.3
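
To tie this back to the dataset in the question, here is a minimal sketch that also tags every message with `NR` (the current record), as suggested in the comments. The shell wrapper, the `out` variable and the `nf` helper variable are my own additions for illustration. Saving `NF` into `nf` is only a cautious choice, since reading `NF` inside `END` works in common awk implementations.

```shell
#!/bin/sh
# Feed the five records from the question to awk and capture everything it prints.
out=$(printf '%s\n' \
  '0.4 1.4 2.4 3.4' \
  '0.3 1.3 2.3 3.3' \
  '0.1 1.1 2.1 3.1' \
  '0.2 1.2 2.2 3.2' \
  '0.5 1.5 2.5 3.5' |
awk '
{
    # This block runs once per input record; the loop visits its fields in order.
    for (i = 1; i <= NF; i++) {
        if (min[i] == "") { print "record " NR ", field " i ": initial min " $i; min[i] = $i }
        if (max[i] == "") { print "record " NR ", field " i ": initial max " $i; max[i] = $i }
        if ($i < min[i])  { print "record " NR ", field " i ": new min " $i; min[i] = $i }
        if ($i > max[i])  { print "record " NR ", field " i ": new max " $i; max[i] = $i }
    }
    nf = NF   # hypothetical helper: remember the field count for the END block
}
END {
    # This block runs once, after the last record has been processed.
    OFS = "\t"
    print "min", "max"
    for (i = 1; i <= nf; i++)
        print min[i], max[i]
}')
printf '%s\n' "$out"
```

With this input, record 1 prints the alternating initial min and initial max messages for fields 1 to 4. Records 2 and 3 each print a new min for every field, record 4 prints nothing, and record 5 prints a new max for every field. The runs of "New min" and "New max" in the question come from the data, not from awk reordering the `if` statements.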
Camusensei