I am writing an awk script that reads several columns of input from a text file and prints the largest value in each column.

Input:

 $ cat numbers
    10      20      30.3    40.5
    20      30      45.7    66.1
    40      75      107.2   55.6
    50      20      30.3    40.5
    60      30      45.O    66.1
    70      1134.7  50      70
    80      75      107.2   55.6

Output:

80  1134.7  107.2       70

Script:

BEGIN {
val=0;
line=1;
}
{
if( $2 > $3 )
{
   if( $2 > val )
   {
      val=$2;
      line=$0;
   }
}
else
{
   if( $3 > val )
   {
      val=$3;
      line=$0;
   }
}
}
END{
print line
}

Current output:

 60 30  45.O    66.1

What am I doing wrong? This is my first awk script.

=======SOLUTION======

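 # Update the running maximum of each column on every input line.
 {
   for (i = 0; ++i <= NF;)
     $i > m[i] && m[i] = $i
 }
 # After the last line, print the maxima separated by FS.
 END {
   for (i = 0; ++i <= NF;)
     printf "%s", (m[i] (i < NF ? FS : RS))
 }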

Thanks for the help

BillPull
  • I don't understand the `$2 > $3` and `$3 > val` tests in your code; they don't seem at all related to the problem description (finding the largest value in each column). Try `awk '{print $1 $3;}' < numbers` to see what exactly the numbered variables mean. – sarnold Dec 09 '11 at 03:03
  • Well, $1, $2, and $3 are the column numbers. – BillPull Dec 09 '11 at 03:06
  • So far so good. Why are you comparing the column numbers against each other? – sarnold Dec 09 '11 at 03:11

3 Answers

Since you have four columns, you'll need at least four variables, one for each column (or an array if you prefer). And you won't need to hold any line in its entirety. Treat each column independently.
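
The "array if you prefer" variant could be sketched roughly as below. This is only an illustration, not necessarily what the answer had in mind: the helper name `colmax` and the array name `max` are made up here, and the `+ 0` coercion assumes that every field should be compared as a number.

```shell
# Hypothetical helper: print the maximum of every column of a
# whitespace-separated file (or of stdin when no file is given).
colmax() {
  awk '
    {
      for (i = 1; i <= NF; i++)
        # Take the first value seen, or any numerically larger one.
        # "+ 0" forces a numeric comparison; the original text is stored.
        if (!(i in max) || $i + 0 > max[i] + 0)
          max[i] = $i
      if (NF > cols) cols = NF          # remember the widest row
    }
    END {
      for (i = 1; i <= cols; i++)
        printf "%s%s", max[i], (i < cols ? OFS : ORS)
    }
  ' "$@"
}

# Example run on three inline rows.
printf '%s\n' '10 20 30.3' '80 1134.7 107.2' '70 75 50' | colmax
```

No whole line is stored, only one running maximum per column, and seeding each maximum from the first row means columns that contain only negative numbers are handled as well.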

Kevin

You need to adapt something like the following for your purposes; it finds the maximum in a particular column (the second in this case).

awk 'BEGIN {max = 0} {if ($2>max) max=$2} END {print max}' numbers.dat

The approach you are taking with $2 > $3 seems to be comparing two columns with each other.
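
One caveat with starting from `max = 0` is that a column containing only negative numbers would report 0. A possible variation, sketched here with a hypothetical helper `max_of_col` and a hypothetical awk variable `col` (neither comes from the answer), seeds the maximum from the first row and takes the column number as a parameter:

```shell
# Hypothetical helper: maximum of a single column.
# Usage: max_of_col N [file...]   (reads stdin when no file is given)
max_of_col() {
  col=$1
  shift
  awk -v col="$col" '
    # Seed from the first row, then keep any numerically larger value.
    NR == 1 || $col + 0 > max + 0 { max = $col }
    END { if (NR) print max }
  ' "$@"
}

# Example run: the second column holds only negative numbers.
printf '%s\n' '10 -20' '70 -1134.7' '80 -75' | max_of_col 2
```

Calling it once per column, or looping over the fields inside awk, would generalize it to all columns.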

Omar
  • This just prints the largest value in the entire file, not the max value for each column. – BillPull Dec 09 '11 at 03:20
  • This prints the largest value in column 2 of the file. So you need to generalize it to work for all the columns. – Omar Dec 09 '11 at 03:25

You can create a user-defined function and pass each column's array to it to retrieve the max value. Something like this:

[jaypal:~/Temp] cat numbers
10 20 30.3 40.5
20 30 45.7 66.1
40 75 107.2 55.6
50 20 30.3 40.5
60 30 45.O 66.1
70 1134.7 50.0 70
80 75 107.2 55.6

[jaypal:~/Temp] awk '             
function max(x){i=0;for(val in x){if(i<=x[val]){i=x[val];}}return i;} 
{a[$1]=$1;b[$2]=$2;c[$3]=$3;d[$4]=$4;next} 
END{col1=max(a);col2=max(b);col3=max(c);col4=max(d);print col1,col2,col3,col4}' numbers
80 1134.7 107.2 70

or

awk 'a<$1{a=$1}b<$2{b=$2}c<$3{c=$3}d<$4{d=$4} END{print a,b,c,d}' numbers
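
One detail of the sample data is worth keeping in mind with any of these comparisons: the third field of the fifth row is `45.O`, ending in the letter O rather than a zero. awk compares two values as strings when either of them does not look like a number, which is probably why the script in the question ended up printing that row. The sketch below (the helper name `cmp_demo` is made up for illustration) shows the difference between comparing the fields as they are and forcing them to numbers with `+ 0`:

```shell
# Hypothetical helper: compare field 1 with field 2 twice,
# once as-is and once with both sides forced to numbers.
cmp_demo() {
  awk '{
    asis   = ($1 < $2)         ? "less" : "not less"
    forced = ($1 + 0 < $2 + 0) ? "less" : "not less"
    print asis, forced
  }'
}

# "107.2" sorts before "45.O" as a string, but 107.2 is not below 45.
printf '%s\n' '107.2 45.O' | cmp_demo    # prints: less not less
```
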
jaypal singh