Average over diagonally in a Matrix

Question

I have a matrix, e.g. a 5 x 5 matrix:

$ cat input.txt
1       5.6        3.4     2.2     -9.99E+10
2       3          2       2       -9.99E+10
2.3     3          7       4.4     5.1
4       5          6       7       8
5       -9.99E+10  9       11      13

I would like to ignore the -9.99E+10 values.

I am looking for the average of all entries on each side after dividing the matrix diagonally. Here are four possibilities (using 999 in place of -9.99E+10 to save space in the graphic):

I would like to average all the values under each of the shaded triangles, so the desired output is:

$ cat outfile.txt
P1U  3.39    (Average of all values of Upper side of Possible 1 without considering -9.99E+10)
P1L  6.88    (Average of all values of Lower side of Possible 1 without considering -9.99E+10)
P2U  4.90
P2L  5.59   
P3U  3.31
P3L  6.41
P4U  6.16
P4L  4.16

It is proving difficult to develop a proper algorithm to write in Fortran or as a shell script. I am thinking of the following algorithm, but can't work out what comes next.

step 1: #Assign -9.99E+10 to the Lower diagonal values of a[ij]
      for i in {1..5};do
        for j in {1..5};do
          a[i,j+1]=-9.99E+10
        done
      done
step 2: #take the average
      sum=0
      for i in {1..5};do
        for j in {1..5};do
          sum=sum+a[i,j]
        done
      done
      printf "%s %5.2f",P1U, sum
step 3: #Assign -9.99E+10 to the upper diagonal values of a[ij]
      for i in {1..5};do
        for j in {1..5};do
          a[i-1,j]=-9.99E+10
        done
      done
step 4: #take the average
      sum=0
      for i in {1..5};do
        for j in {1..5};do
          sum=sum+a[i,j]
        done
      done
      printf "%s %5.2f",P1L,sum

Welcome to Stack Overflow. SO is a question and answer site for professional and enthusiast programmers. The goal is that you add some code of your own to your question to show at least the research effort you made to solve this yourself. — Cyrus, Apr 15 '20 at 12:09
I have done many research, but can't able to do it. I have updated my code. Thank you. — Kay, Apr 15 '20 at 12:18
Are you familiar with python/numpy and open to such a solution by any chance ? — luciole75w, Apr 15 '20 at 19:24

Ed Morton · Answer 1 · 2020-04-15T23:26:57.433

Just save all the values in an array indexed by column and row number, and then in the END section repeat the process below for each section, setting the beginning and end row and column loop delimiters as needed when defining the loops:

$ cat tst.awk
{
    for (colNr=1; colNr<=NF; colNr++) {
        vals[colNr,NR] = $colNr
    }
}
END {
    sect = "P1U"
    begColNr = 1; endColNr = NF; begRowNr = 1; endRowNr = NR
    sum = cnt = 0
    for (rowNr=begRowNr; rowNr<=endRowNr; rowNr++) {
        for (colNr=begColNr; colNr<=endColNr-rowNr+1; colNr++) {
            val = vals[colNr,rowNr]
            if ( val != "-9.99E+10" ) {
                sum += val
                cnt++
            }
        }
    }
    printf "%s %.2f\n", sect, (cnt ? sum/cnt : 0)

    sect = "P1L"
    begColNr = 1; endColNr = NF; begRowNr = 1; endRowNr = NR
    sum = cnt = 0
    for (rowNr=begRowNr; rowNr<=endRowNr; rowNr++) {
        for (colNr=endColNr-rowNr+1; colNr<=endColNr; colNr++) {
            val = vals[colNr,rowNr]
            if ( val != "-9.99E+10" ) {
                sum += val
                cnt++
            }
        }
    }
    printf "%s %.2f\n", sect, (cnt ? sum/cnt : 0)
}

$ awk -f tst.awk file
P1U 3.39
P1L 6.88

Given the above for handling the first quadrant diagonal halves, I assume you'll be able to figure out the other quadrant diagonal halves. The horizontal/vertical quadrant halves are trivial: just set begRowNr to int(NR/2)+1, or endRowNr to int(NR/2), or begColNr to int(NF/2)+1, or endColNr to int(NF/2), then loop through the resultant full range of values of each.
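For the other diagonal split, the same loop-bounds idea might look like the sketch below. It assumes the main diagonal counts towards both halves, which is what the question's expected P2U/P2L values imply; the wrapper name p2_halves is hypothetical and not part of the answer above.

```shell
# Sample matrix from the question
cat > input.txt <<'EOF'
1       5.6        3.4     2.2     -9.99E+10
2       3          2       2       -9.99E+10
2.3     3          7       4.4     5.1
4       5          6       7       8
5       -9.99E+10  9       11      13
EOF

# p2_halves: hypothetical wrapper; averages each side of the main diagonal
p2_halves() {
    awk '
    {
        for (colNr = 1; colNr <= NF; colNr++)
            vals[colNr,NR] = $colNr
    }
    END {
        # P2U: row r covers columns r..NF (on or above the main diagonal)
        sum = cnt = 0
        for (rowNr = 1; rowNr <= NR; rowNr++)
            for (colNr = rowNr; colNr <= NF; colNr++) {
                val = vals[colNr,rowNr]
                if (val != "-9.99E+10") { sum += val; cnt++ }
            }
        printf "%s %.2f\n", "P2U", (cnt ? sum / cnt : 0)

        # P2L: row r covers columns 1..r (on or below the main diagonal)
        sum = cnt = 0
        for (rowNr = 1; rowNr <= NR; rowNr++)
            for (colNr = 1; colNr <= rowNr && colNr <= NF; colNr++) {
                val = vals[colNr,rowNr]
                if (val != "-9.99E+10") { sum += val; cnt++ }
            }
        printf "%s %.2f\n", "P2L", (cnt ? sum / cnt : 0)
    }' "$1"
}

p2_halves input.txt
```

On the sample matrix this prints P2U 4.90 and P2L 5.59, matching the values in the question.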

karakfa · Answer 2 · 2020-04-16T15:24:54.417

You can compute them all in one pass:

$ awk -v NA='-9.99E+10' '{for(i=1;i<=NF;i++) a[NR,i]=$i} 
          END {for(i=1;i<=NR;i++) 
                 for(j=1;j<=NF;j++) 
                    {v=a[i,j]; 
                     if(v!=NA) 
                       {if(i+j<=6) {p["1U"]+=v; c["1U"]++} 
                        if(i+j>=6) {p["1L"]+=v; c["1L"]++} 
                        if(j>=i)   {p["2U"]+=v; c["2U"]++} 
                        if(i<=3)   {p["3U"]+=v; c["3U"]++} 
                        if(i>=3)   {p["3D"]+=v; c["3D"]++} 
                        if(j<=3)   {p["4U"]+=v; c["4U"]++} 
                        if(j>=3)   {p["4D"]+=v; c["4D"]++}}} 
                 for(k in p) printf "P%s %.2f\n", k,p[k]/c[k]}' file  | sort

P1L 6.88
P1U 3.39
P2U 4.90
P3D 6.41
P3U 3.31
P4D 6.16
P4U 4.16

I forgot to add P2D, but from the pattern it should be clear what needs to be done.
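Following that pattern, the missing half is presumably the mirror image of the j>=i test. A minimal sketch, where the "P2D" label and the j<=i condition are my reading of the pattern rather than part of the original answer, and the helper name p2d is hypothetical:

```shell
# Sample matrix from the question
cat > input.txt <<'EOF'
1       5.6        3.4     2.2     -9.99E+10
2       3          2       2       -9.99E+10
2.3     3          7       4.4     5.1
4       5          6       7       8
5       -9.99E+10  9       11      13
EOF

# p2d: hypothetical helper; averages entries on or below the main diagonal
p2d() {
    awk -v NA='-9.99E+10' '
        { for (i = 1; i <= NF; i++) a[NR,i] = $i }
        END {
            for (i = 1; i <= NR; i++)
                for (j = 1; j <= NF; j++) {
                    v = a[i,j]
                    # skip the missing-value marker, keep only j <= i
                    if (v != NA && j <= i) { s += v; c++ }
                }
            printf "P2D %.2f\n", (c ? s / c : 0)
        }' "$1"
}

p2d input.txt
```

This prints P2D 5.59, which agrees with the P2L value the question expects.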

To generalize further, as suggested: assert NF==NR, otherwise the diagonals are not well defined. Let n=NF (which equals NR). You can then replace 6 with n+1 and 3 with ceil(n/2), which can be implemented as function ceil(x) {return x==int(x)?x:int(x)+1}
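Put together, the generalized version could be sketched as follows. Two things here are my own assumptions rather than part of the answer: the lower/right halves start at int(n/2)+1 instead of ceil(n/2), so that for even n the two halves do not share a row or column (for odd n both choices include the middle one in both halves, as in the question), and the wrapper name all_halves is hypothetical. Labels follow this answer's U/D naming, including the added 2D.

```shell
# Sample matrix from the question
cat > input.txt <<'EOF'
1       5.6        3.4     2.2     -9.99E+10
2       3          2       2       -9.99E+10
2.3     3          7       4.4     5.1
4       5          6       7       8
5       -9.99E+10  9       11      13
EOF

# all_halves: hypothetical wrapper; prints all eight averages for a square matrix
all_halves() {
    awk -v NA='-9.99E+10' '
    function ceil(x) { return x == int(x) ? x : int(x) + 1 }   # for x >= 0
    NR == 1 { n = NF }
    { for (j = 1; j <= NF; j++) a[NR,j] = $j }
    END {
        if (n != NR) { print "matrix is not square" > "/dev/stderr"; exit 1 }
        d  = n + 1            # value of i + j on the anti-diagonal
        hi = ceil(n / 2)      # last row/column of the upper/left half
        lo = int(n / 2) + 1   # first row/column of the lower/right half (assumption)
        for (i = 1; i <= n; i++)
            for (j = 1; j <= n; j++) {
                v = a[i,j]
                if (v == NA) continue
                if (i + j <= d) { p["1U"] += v; c["1U"]++ }
                if (i + j >= d) { p["1L"] += v; c["1L"]++ }
                if (j >= i)     { p["2U"] += v; c["2U"]++ }
                if (j <= i)     { p["2D"] += v; c["2D"]++ }
                if (i <= hi)    { p["3U"] += v; c["3U"]++ }
                if (i >= lo)    { p["3D"] += v; c["3D"]++ }
                if (j <= hi)    { p["4U"] += v; c["4U"]++ }
                if (j >= lo)    { p["4D"] += v; c["4D"]++ }
            }
        for (k in p) printf "P%s %.2f\n", k, p[k] / c[k]
    }' "$1" | sort
}

all_halves input.txt
```

For the 5 x 5 sample this reproduces the seven values listed above plus P2D 5.59. Note that P4U here is the left half (4.16), whereas the question labels 6.16 as P4U.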
