Finding the minimum and maximum in a daughter file and relating the result to the parent file

Question

I have an input file like below.

element  materl(local) 
ipt-shl  stress       sig-xx      sig-yy      sig-zz      sig-xy      sig-yz      sig-zx       plastic
       state                                                                                 strain 
1346995-     25
1-  2 elastic   5.9309E-01 -1.0920E-02  0.0000E+00  2.4431E-04  2.3158E-03  1.0608E-03    7.4616E-02
2-  2 elastic   6.1335E-01 -9.1746E-03  0.0000E+00 -4.2870E-04  2.3158E-03  1.0608E-03    7.4089E-02
3-  2 elastic   6.4586E-01 -7.3146E-03  0.0000E+00 -1.2961E-03  2.3158E-03  1.0608E-03    7.3794E-02
4-  2 elastic   6.7056E-01 -1.5564E-03  0.0000E+00 -1.0469E-03  2.3158E-03  1.0608E-03    7.3682E-02
5-  2 elastic   6.7493E-01  7.1420E-03  0.0000E+00  1.7934E-03  2.3158E-03  1.0608E-03    7.3708E-02
6-  2 elastic   6.7828E-01  1.4787E-02  0.0000E+00  5.4871E-03  2.3158E-03  1.0608E-03    7.3825E-02
7-  2 elastic   6.8092E-01  1.9656E-02  0.0000E+00  8.2580E-03  2.3158E-03  1.0608E-03    7.4210E-02
1346996-     25
1-  2 elastic   6.0586E-01 -4.6476E-03  0.0000E+00  9.4464E-03 -1.9585E-03 -5.1396E-03    7.4299E-02
2-  2 elastic   6.2548E-01 -5.1646E-03  0.0000E+00  6.3450E-03 -1.9585E-03 -5.1396E-03    7.4147E-02
3-  2 elastic   6.5631E-01 -5.3780E-03  0.0000E+00  1.1554E-03 -1.9585E-03 -5.1396E-03    7.4000E-02
4-  2 elastic   6.7186E-01 -1.5611E-03  0.0000E+00 -3.7045E-03 -1.9585E-03 -5.1396E-03    7.3999E-02
5-  2 elastic   6.7481E-01  5.1501E-03  0.0000E+00 -7.2939E-03 -1.9585E-03 -5.1396E-03    7.4107E-02
6-  2 elastic   6.7769E-01  1.1733E-02  0.0000E+00 -1.0146E-02 -1.9585E-03 -5.1396E-03    7.4238E-02
7-  2 elastic   6.7946E-01  1.5462E-02  0.0000E+00 -1.1218E-02 -1.9585E-03 -5.1396E-03    7.4362E-02

and so on.

What I am trying to do is select only the column under plastic strain, put it into another file, and then find the minimum and maximum of it. The problem is that when I move the column to another file I lose the identity of the maximum or minimum, which is the element number at the top of each group of 7 lines. I used

awk '{ print $10 }' elout > Plastic.k    # copy the required field to another file
sed -i -e '/^$/d' Plastic.k              # remove all the empty lines
sed  -n '/^[0-9]\{1\}/p' Plastic.k > tt  # keep only lines that start with a digit (drops lines starting with a letter)
mv tt Plastic.k

Now I have to find the maximum and minimum in this file Plastic.k and then find the element number (identity) of that occurrence in elout, the original file.

Any suggestions?

P.S. By identity I mean the 7-digit number, followed by a - symbol, at the top of each group of 7 lines.

The output would be of the form

min=7.3682E-02 at 1346995-25
max=7.4616E-02 at 1346995-25

It would not be 1346996-25, as that element has neither the minimum nor the maximum in field 10. I have such data in an input file and want to write the result to an output file.

If the input format changes slightly, as shown below, the answer from potong doesn't work. I tried hard to understand why but could not. The new input is as follows.

It looks much the same.

element  materl(local)   
ipt-shl  stress       sig-xx      sig-yy      sig-zz      sig-xy      sig-yz      sig-zx       plastic
state                                                                                 strain
699425-     13
1- 16 elastic   4.9281E-01  5.9754E-02  0.0000E+00 -2.7210E-02  1.4192E-02  1.2603E-01    1.7112E-02
2- 16 elastic   4.6965E-01  4.8664E-02  0.0000E+00 -2.1255E-02  1.4192E-02  1.2603E-01    1.2814E-02
3- 16 elastic   4.3029E-01  2.6264E-02  0.0000E+00 -7.2280E-03  1.4192E-02  1.2603E-01    7.1400E-03
4- 16 elastic   3.1283E-01 -1.4079E-02  0.0000E+00  1.3315E-02  1.4192E-02  1.2603E-01    1.9514E-03
5- 16 elastic  -3.4911E-01 -2.9740E-02  0.0000E+00  3.7036E-02  1.4192E-02  1.2603E-01    7.5132E-04
6- 16 elastic  -4.5764E-01 -7.0891E-02  0.0000E+00  3.6667E-02  1.4192E-02  1.2603E-01    7.1070E-03
7- 16 elastic  -4.8788E-01 -8.1926E-02  0.0000E+00  4.1023E-02  1.4192E-02  1.2603E-01    1.1321E-02
699426-     13
1- 16 elastic   3.5073E-01  6.2039E-03  0.0000E+00 -9.4607E-02 -3.4023E-03 -2.4265E-02    1.4478E-02
2- 16 elastic   3.5540E-01  8.6871E-03  0.0000E+00 -7.2062E-02 -3.4023E-03 -2.4265E-02    1.0498E-02
3- 16 elastic   3.6224E-01  7.2871E-03  0.0000E+00 -3.5263E-02 -3.4023E-03 -2.4265E-02    6.1994E-03
4- 16 elastic   2.3782E-01 -1.7772E-02  0.0000E+00  5.9101E-03 -3.4023E-03 -2.4265E-02    1.6298E-03
5- 16 elastic  -2.3065E-01 -3.2565E-02  0.0000E+00  6.0890E-02 -3.4023E-03 -2.4265E-02    1.3029E-03
6- 16 elastic  -3.0923E-01 -3.0984E-02  0.0000E+00  9.0408E-02 -3.4023E-03 -2.4265E-02    5.3680E-03
7- 16 elastic  -3.3606E-01 -2.5992E-02  0.0000E+00  1.0568E-01 -3.4023E-03 -2.4265E-02    9.3878E-03

The only difference is that in this input we have 16 instead of 2 after the numbers 1 to 7.

Please suggest a correction.

If you can suggest a better way of doing this in a single awk or sed, it would be more than welcome :) Regards – hamad khan, Feb 22 '12 at 13:27
I have an idea: find the minimum and maximum directly from this data, then go to $1 of the record on which the minimum or maximum is found. Then go that many lines above that line and save the $1 of that line, which would surely be the identity, but I lack the knowledge to do this on Linux. – hamad khan, Feb 22 '12 at 13:35
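The idea in the comment above can also be turned around: rather than walking back up from the extreme value, remember the most recent element header while reading down. The following is only a sketch, assuming header lines have exactly 2 fields with the first ending in `-`, and data rows have exactly 10 fields. The function name `minmax_plastic` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): global min/max of field 10,
# tagged with the element header that precedes the row.
# Assumption: headers look like "1346995-     25" (2 fields),
# data rows like "1-  2 elastic ... 7.4616E-02" (10 fields).
minmax_plastic() {
  LC_ALL=C awk '
    NF == 2  && $1 ~ /^[0-9]+-$/ { id = $1 $2; next }   # remember e.g. "1346995-25"
    NF == 10 && $1 ~ /^[0-9]+-$/ {
      v = $10 + 0                                       # force a numeric value
      if (!seen || v < min) { min = v; minraw = $10; minid = id }
      if (!seen || v > max) { max = v; maxraw = $10; maxid = id }
      seen = 1
    }
    END {
      if (seen) {
        print "min=" minraw " at " minid
        print "max=" maxraw " at " maxid
      }
    }
  ' "$@"
}
```

Called as `minmax_plastic elout > result.txt`, it reads the file once and keeps the value exactly as written in the input. Because it matches on field counts rather than on the number of digits in the element number, it should not care whether the ID has 6 or 7 digits.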

potong · Accepted Answer · 2012-02-23T13:49:19.747

2

This might work for you:

sed '/^\([0-9]\{7\}\).*/,+7!d;//{s//\1/;h;d};s/.* //;G;s/^\(.*\)\n\(.*\)/\2 \1/' file |
sort -g |
sed 'h;N;N;N;N;N;N;s/.*\n//;H;g;s/\n\S*//'
1346995 7.3682E-02 7.4616E-02
1346996 7.3999E-02 7.4362E-02

EDIT:

With reference to comments below and requested output shown in amended question, here is an amended solution:

sed '/^\([0-9]\{7\}-\)\s*\([0-9]*\).*/,+7!d;//{s//at \1\2/;h;d};s/.* //;G;s/\n/ /' file| 
sort -g | 
sed '1s/^/min=/p;$s/^/max=/p;d'
min=7.3682E-02 at 1346995-25
max=7.4616E-02 at 1346995-25

edited Feb 23 '12 at 13:49

answered Feb 22 '12 at 14:16


This is quite interesting. However, what you have written gives the minimum and maximum of field 10 for every set of 7 lines. What I am asking for is only 1 maximum and 1 minimum value of field 10 for the complete file, with only the identity of those two printed. Thank you for the effort. Also, please tell me how to redirect the output; I tried > result.txt in the last command but nothing showed up. Regards – hamad khan Feb 23 '12 at 07:36
I have also added the required output to the question. – hamad khan Feb 23 '12 at 08:07
@hamaskhan I have amended the solution (see above). As for redirecting the output, `> result.txt` will send the results to `result.txt`, provided `result.txt` is not used in any of the commands in the pipeline. – potong Feb 23 '12 at 09:40
I tried your solution but it does not seem to work. I tried to redirect the output but nothing was printed, unfortunately. Maybe you can check it again. It also gave an error: `sed: read error on /: Is a directory` – hamad khan Feb 23 '12 at 10:52
I changed it to the file; now it runs but gives no output, neither on the console nor in the file. – hamad khan Feb 23 '12 at 11:58
@hamaskhan Sorry, an extra `/` crept in from my debugging. Try the corrected version. If this fails, what versions of sed and sort are you using? I am using GNU sed version 4.1.5 and sort (GNU coreutils) 6.9 with the data provided. – potong Feb 23 '12 at 13:53
With a few small changes, like removing the leading white space, it worked great. I used sed to extract the part of the file between the required keywords and then used your code. It was perfect. Regards, @potong – hamad khan Feb 27 '12 at 11:57
@potong, this is no longer working. Since the format of the file changed, it has stopped working. – hamad khan Jun 19 '12 at 10:26
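One possible reason for the failure on the new input, judging only from the samples in the question: the sed address requires exactly seven digits at the start of a header line, while the new element numbers (699425, 699426) have six. The check below is a sketch of that observation. The relaxed pattern is my own assumption, not something from the answer.

```shell
# Two sample headers from the question: a 7-digit and a 6-digit element ID.
headers='1346995-     25
699425-     13'
# One data row from the new input, to confirm the relaxed pattern skips it.
row='1- 16 elastic   4.9281E-01  5.9754E-02  0.0000E+00 -2.7210E-02  1.4192E-02  1.2603E-01    1.7112E-02'

# Pattern used in the answer: exactly seven digits at the start of the line.
strict=$(printf '%s\n' "$headers" | grep -c '^[0-9]\{7\}')

# Relaxed pattern (assumption): digits, "-", blanks, digits, end of line.
relaxed='^[0-9][0-9]*-[[:space:]]*[0-9][0-9]*[[:space:]]*$'
loose=$(printf '%s\n' "$headers" | grep -c "$relaxed")
rows=$(printf '%s\n' "$row" | grep -c "$relaxed" || true)

echo "strict=$strict loose=$loose data_rows_matched=$rows"
# prints: strict=1 loose=2 data_rows_matched=0
```

Anchoring the relaxed pattern at the end of the line matters, because a data row such as `1- 16 elastic ...` also starts with digits followed by `-`.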

score 1 · Answer 2 · edited May 23 '17 at 12:27

Here's a solution, based on "Sorting Scientific Number With Unix Sort", so use this:

cat Plastic.k | awk '{ print $10 }' | sed -ne'/^[0-9]\{1\}/p' | sort -g | sed -n -e'1p' -e'$p'
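To see why `sort -g` is used here rather than `sort -n`, a small sketch with three values taken from the second sample in the question may help. It assumes GNU sort, since `-g` is a GNU coreutils option.

```shell
# Three plastic-strain values with different exponents.
vals='1.7112E-02
7.5132E-04
1.2814E-02'

# sort -n only reads the leading "1.7112", "7.5132", ... and ignores the exponent;
# sort -g parses the whole floating-point number.
by_n=$(printf '%s\n' "$vals" | LC_ALL=C sort -n | head -n 1)
by_g=$(printf '%s\n' "$vals" | LC_ALL=C sort -g | head -n 1)

echo "sort -n smallest: $by_n"   # 1.2814E-02 (wrong)
echo "sort -g smallest: $by_g"   # 7.5132E-04 (right)
```

With the first sample the difference does not show, because every value there has the same exponent (E-02).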

answered Feb 22 '12 at 13:45

Andrey Starodubtsev


Hi, it gives the minimum and maximum, that's fine, but how can I find the identity, which is the major problem? For example, by going to the first field of the record containing the maximum, parsing that number, going that many lines above, and then printing the first field of that line to find the identity. Thank you for your kind effort. Regards, @darkmist – hamad khan Feb 22 '12 at 13:55

kev · Answer 3 · 2012-02-22T13:56:30.570

$ cat input.txt | awk 'NR<4{next}; NF==2{id=$1}; NF==10{printf "%s %f\n",id+0,$10}' | sort -k1,1 -k2,2n | awk 'x!=$1{if(NR!=1)printf "%s\n\n",y;x=$1;print};{y=$0};END{print}'

Broken into multiple lines (`>` is the bash continuation prompt):

$ cat input.txt |
> awk 'NR<4{next}; NF==2{id=$1}; NF==10{printf "%s %f\n",id+0,$10}' |
> sort -k1,1 -k2,2n |
> awk 'x!=$1{if(NR!=1)printf "%s\n\n",y;x=$1;print};{y=$0};END{print}'

Result:

1346995 0.073682
1346995 0.074616

1346996 0.073999
1346996 0.074362

Explanation:

`NR<4{next}`: skip the first 3 lines
`NF==2{id=$1}`: keep track of the current group id
`NF==10{printf...$10}`: print both the id and the value of column 10
`sort -k1,1 -k2,2n`: sort by column 1, then numerically by column 2
`awk 'x!=$1...print}`: print the previous group's last line before printing the current group's first line
`{y=$0}`: keep track of the last line seen
`END{print}`: print the final line
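One thing to watch in the first awk stage is the `%f` conversion, which prints six decimal places. That is enough for the first sample, but it would drop digits from smaller values such as those in the second sample. A quick sketch of the difference, where `%.4E` is just one possible alternative:

```shell
# A small value from the second sample in the question.
v='7.5132E-04'

fixed=$(echo "$v" | LC_ALL=C awk '{ printf "%f\n", $1 }')     # six decimals by default
sci=$(echo "$v"   | LC_ALL=C awk '{ printf "%.4E\n", $1 }')   # four decimals, exponent form

echo "$fixed $sci"
# prints: 0.000751 7.5132E-04
```

If the exponent form is kept, the later `sort -k2,2n` would need to become `sort -k2,2g` (GNU sort) so that the exponent is taken into account.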

score 1 · Answer 4 · answered Feb 22 '12 at 14:48

Grr... I can't insert a code block in a comment =( If I understood you right, you need the numbers from the first column that correspond to the minimum and maximum values in the 10th column, right? Then you can use the following script:

#!/bin/bash
minAndMax="`cat Plastic.k | awk '{ print $10 }' | sed -ne'/^[0-9]\{1\}/p' | sort -g | sed -n -e'1p' -e'$p'`"
min="`echo \"$minAndMax\" | head -n 1`"
max="`echo \"$minAndMax\" | tail -n 1`"
minIDs="`cat Plastic.k | awk \"\\\$10 == $min { print \\\$1 }\" | sed -e's/-$//'`"
maxIDs="`cat Plastic.k | awk \"\\\$10 == $max { print \\\$1 }\" | sed -e's/-$//'`"
echo "\$minIDs==$minIDs"
echo "\$maxIDs==$maxIDs"

shellter · Answer 5 · 2012-02-24T03:03:21.383

#!/bin/bash   

cat - <<-EOD > testMinMaxData.txt
1346995-     25
1-  2 elastic   5.9309E-01 -1.0920E-02  0.0000E+00  2.4431E-04  2.3158E-03  1.0608E-03    7.4616E-02
2-  2 elastic   6.1335E-01 -9.1746E-03  0.0000E+00 -4.2870E-04  2.3158E-03  1.0608E-03    7.4089E-02
3-  2 elastic   6.4586E-01 -7.3146E-03  0.0000E+00 -1.2961E-03  2.3158E-03  1.0608E-03    7.3794E-02
4-  2 elastic   6.7056E-01 -1.5564E-03  0.0000E+00 -1.0469E-03  2.3158E-03  1.0608E-03    7.3682E-02
5-  2 elastic   6.7493E-01  7.1420E-03  0.0000E+00  1.7934E-03  2.3158E-03  1.0608E-03    7.3708E-02
6-  2 elastic   6.7828E-01  1.4787E-02  0.0000E+00  5.4871E-03  2.3158E-03  1.0608E-03    7.3825E-02
7-  2 elastic   6.8092E-01  1.9656E-02  0.0000E+00  8.2580E-03  2.3158E-03  1.0608E-03    7.4210E-02
1346996-     25
1-  2 elastic   6.0586E-01 -4.6476E-03  0.0000E+00  9.4464E-03 -1.9585E-03 -5.1396E-03    7.4299E-02
2-  2 elastic   6.2548E-01 -5.1646E-03  0.0000E+00  6.3450E-03 -1.9585E-03 -5.1396E-03    7.4147E-02
3-  2 elastic   6.5631E-01 -5.3780E-03  0.0000E+00  1.1554E-03 -1.9585E-03 -5.1396E-03    7.4000E-02
4-  2 elastic   6.7186E-01 -1.5611E-03  0.0000E+00 -3.7045E-03 -1.9585E-03 -5.1396E-03    7.3999E-02
5-  2 elastic   6.7481E-01  5.1501E-03  0.0000E+00 -7.2939E-03 -1.9585E-03 -5.1396E-03    7.4107E-02
6-  2 elastic   6.7769E-01  1.1733E-02  0.0000E+00 -1.0146E-02 -1.9585E-03 -5.1396E-03    7.4238E-02
EOD

if ${testingMode:-true} ; then
  set -- testMinMaxData.txt
fi

awk '
   NF==2{gsub(/[  ]*/,"",$0); header=$0}
   NF==10{print header "\t" $10}
' "${@:?Usage:$0 file1 [file2 ....]}" \
| awk '{
    hd=$1
    if (! (hd in hdrs)) {
      hdrs[hd]=++i ; hdrsVal[i]=hd; min[hd]=999999; max[hd]=0.000000009 ;
      #dbg print "#dbg:added " hd " to hdrs"
    }
    #dbg print "#dbg:$2=" $2 "\tmin["hd"]=" min[hd] "\tmax["hd"]="max[hd]
    if ($2 < min[hd]) {
      min[hd]=$2
      #dbg print "#dbg:added "$2" to min["hd"]"
    }
    if ($2 > max[hd]+0.0) {
      max[hd]=$2
      #dbg print "#dbg:added "$2" to max["hd"]"
    }
}
END {
   #dbg for (x in hdrs) print "hdrs["x"]=" hdrs[x]
    for (j=1;j<=i;j++) {
      print hdrsVal[j] "\t" min[hdrsVal[j]] "\t" max[hdrsVal[j]]
    }
  }
'\
| awk 'BEGIN{
  minVal=9999999999
  maxVal=.000000009
  }
  {
    if ($2 < minVal) {
        minVal=$2 ; minTag=$1
        #dbg print "#dbg:added "$2" to min["hd"]"
    }
    if ($3 > maxVal) {
        maxVal=$3  ; maxTag=$1
        #dbg print "#dbg:added "$2" to max["hd"]"
    }
  }
  END {
   print "min=" minVal " at " minTag
   print "max=" maxVal " at " maxTag
  }
'

output

min=7.3682E-02 at 1346995-25
max=7.4616E-02 at 1346995-25

This script is a self-contained proof-of-concept test suite. For real usage, I would recommend deleting both of the following 'blocks' of code and leaving only the 3 awks in your working script file.

The cat ... > testMin...EOD block creates your sample data into a test file.

The if ${testingMode:-true}... block uses the shell feature of set -- arg1 arg2 ... to set positional parameters. This value is then expanded as the shell parameter "${@}" that you see at the end of the first awk program (just before the pipe char ('|')).

I have also embedded a usage statement into the evaluation of "${@:?Usage:$0 file1 [file2 ....]}". If no filenames are supplied, the script gives you a simple error/usage message.
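For anyone unfamiliar with the `:?` form of parameter expansion, here is a minimal sketch of how it behaves. The function name `require_files` and the script name in the message are made up for the example, and it uses `$1` rather than `$@` to keep the behaviour easy to follow.

```shell
# Hypothetical wrapper: refuse to run without at least one file argument.
require_files() {
  # ${1:?message} prints the message on stderr and aborts the (sub)shell
  # when $1 is unset or empty; the subshell keeps the caller alive.
  ( : "${1:?Usage: myMinMaxFinder.sh file1 [file2 ...]}" ) || return 1
  printf 'processing %s\n' "$@"     # one line per argument
}

require_files a.txt b.txt
# prints:
# processing a.txt
# processing b.txt
```

Calling `require_files` with no arguments prints the usage message to stderr and returns a non-zero status.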

I have left the debugging statements in; you can remove the '#dbg' prefix in front of them to see how data is being processed as it goes through the script.

Note that awk associative arrays (hdrs[hd]=++i ; hdrsVal[i]=hd; etc.) are not always intuitively obvious to the new awk user. BUT awk associative arrays are one of the language's most powerful features. They are definitely worth your time experimenting with to understand how they work. Turn on some of the debugging lines to see what values are getting stored where.

The only reason I keep the arrays hdrs[hd] and hdrsVal[i] is so that at the end we can enumerate by numeric key (1,2,3,...), which means the data will be printed in the order it was read in. Using the value returned by hdrsVal[1] (here 1346995-25), we can then look up the min and max values via min["1346995-25"] and max["1346995-25"].
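A stripped-down sketch of the same two-array pattern, with toy input instead of the stress file, may make this easier to follow. The array names `idx` and `name` are illustrative and play the roles of hdrs and hdrsVal.

```shell
# Per-key min/max, printed in first-seen order.
track_order() {
  LC_ALL=C awk '
    {
      key = $1
      if (!(key in idx)) {          # first time this key is seen
        idx[key] = ++n              # key -> position
        name[n]  = key              # position -> key, used for ordered output
        min[key] = max[key] = $2 + 0
      }
      if ($2 + 0 < min[key]) min[key] = $2 + 0
      if ($2 + 0 > max[key]) max[key] = $2 + 0
    }
    END {
      for (j = 1; j <= n; j++)      # numeric loop keeps input order
        print name[j], min[name[j]], max[name[j]]
    }
  '
}

result=$(printf '%s\n' 'B 3' 'A 5' 'B 1' 'A 7' | track_order)
echo "$result"
# prints:
# B 1 3
# A 5 7
```

A plain `for (k in min)` loop would visit the keys in an unspecified order, which is why the numeric index is kept alongside. Seeding min and max from the first value seen also avoids relying on sentinel start values.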

Finally, as your data looks to be engineering data, you may find further help by looking at the links at awk.info--Engineering

Edits

I have added the final distillation to just 1 min and 1 max value with the setID.

You wrote

How can I add an input file name and an output file name.

When you edit the script as I have mentioned above, you need to save the file.

Then from the Unix/Linux/Cygwin cmd-line, you need to 'mark' the file so the O.S. knows it is meant to be executable.

chmod 755 ./myMinMaxFinder.sh

Now, you can execute the cmd like

 ./myMinMaxFinder.sh file1 [file2 .... filen] > myOUTPUT.FILE

This is the standard way of creating output files in Unix. Argument processing will be a consulting fee ;-)

I mentioned awk.info above. As you're a mechanical engineer, be sure to check out

http://awk.info/?doc/mecheng.html

This also points to another website, done by a mechanical engineer

http://www.tikmark.com/awkeng/awkscripts.html

The design I'm using here is a traditional Unix pipeline. Each awk section solves one part of the puzzle. You can disconnect any section (by inserting 2 blank lines and adding exit) to see what each stage of the script is doing.
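A different way to look inside a pipeline without cutting it apart is to put `tee` between two stages. This is a swapped-in technique, not what the script above does, and the two stages here are simplified stand-ins for the awks in the answer.

```shell
stage_dump=$(mktemp)    # temporary file for the stage-1 snapshot

final=$(
  printf '%s\n' \
    '1346995-     25' \
    '1-  2 elastic 0 0 0 0 0 0 7.4616E-02' \
    '2-  2 elastic 0 0 0 0 0 0 7.3682E-02' |
  awk 'NF==2 { header = $1 $2 } NF==10 { print header "\t" $10 }' |   # stage 1: tag rows
  tee "$stage_dump" |                                                 # copy stage 1 output
  awk 'NR==1 || $2+0 < m { m = $2+0; line = $0 } END { print line }'  # stage 2: keep the min
)

stage1=$(cat "$stage_dump")
rm -f "$stage_dump"

echo "final:   $final"
echo "stage 1:"
echo "$stage1"
```

The pipeline still runs to the end, and the snapshot shows exactly what the first stage handed to the second.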

For more general info on using awk, see the Grymoire's most excellent Awk Tutorial.

I hope this helps.

@shelter, thanks for your comprehensive reply. Being a mechanical engineer, I find it a bit difficult to understand this hardcore code, and I have a few questions. How can I add an input file name and an output file name? Secondly, I am only interested in the overall maximum and minimum of field 10 in the complete file, not per 7 rows. Only after finding the overall maximum and minimum do I need to print the IDs. I mean, in a 1000-element file I want only 1 maximum and 1 minimum and the IDs of only those 2, rather than 1000 maxima and minima, one for each element. I am very thankful. – hamad khan, Feb 23 '12 at 07:54
I have edited the question with the desired output, thanks in advance – hamad khan, Feb 23 '12 at 08:07
@hamaskhan: see my extended solution and additional comments under Edits. Good luck. – shellter, Feb 23 '12 at 16:54
