I need to calculate the average of each column of a text file with a Tcl script. Please help me.

Frame Time     Elec     VdW         Nonbond     Total
0      0     -216.63   -16.0174    -232.647    -232.647 
1      1     -196.786  -28.6093    -225.395    -225.395 
2      2     -277.05   -23.924     -300.974    -300.974 
3      3     -203.854  -30.2473    -234.101    -234.101
mrcalvin
  • What is the difference from your previous question? https://stackoverflow.com/questions/67912382/calculate-average-of-columns-of-column-with-tcl – Mkn Jun 12 '21 at 11:55
  • In the first question I didn't include negative numbers. When I use a table like this one, with negative numbers or numbers containing a comma, I hit a problem that I can't resolve – Mohamed Mastouri Jun 12 '21 at 13:27

1 Answer

Once you start dealing with many columns, it gets easier to pull the whole file into memory. (This works well even for surprisingly large files.)

# Read the lines in; $filename must already hold the path to the data file
set f [open $filename]
set lines [split [string trim [read $f]] "\n"]
close $f

# Do an initial clean up of the data; \S+ matches non-whitespace
set headers [regexp -all -inline {\S+} [lindex $lines 0]]
set data [lmap line [lrange $lines 1 end] {regexp -all -inline {\S+} $line}]
# Properly we'd also validate the data to handle non-numeric junk, but this is just an example...
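The validation step mentioned in that last comment could look something like the sketch below. `cleanRows` is a hypothetical helper, not part of the code above. It assumes that a comma inside a number is a decimal separator (one reading of the "numbers with comma" problem from the question's comments) and that any row with the wrong field count or a non-numeric field should simply be dropped. Like the rest of this answer, it needs Tcl 8.6 for `lmap`.

```tcl
# Hypothetical helper: keep only rows that have one field per header
# and in which every field parses as a number.
proc cleanRows {headers rows} {
    set ncols [llength $headers]
    set clean {}
    foreach row $rows {
        # Skip blank lines and rows with the wrong number of fields
        if {[llength $row] != $ncols} continue
        # Assumption: a comma is a decimal separator, so 1,5 means 1.5
        set row [lmap field $row {string map {, .} $field}]
        set ok 1
        foreach field $row {
            # -strict rejects the empty string; a leading minus sign is fine
            if {![string is double -strict $field]} {
                set ok 0
                break
            }
        }
        if {$ok} {
            lappend clean $row
        }
    }
    return $clean
}
```

With the variables from above, that would be applied as `set data [cleanRows $headers $data]` before computing anything.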

Now we can define a procedure to get the average of a column by name:

proc columnAverage {name} {
    global headers data
    # Look up which column it is
    set idx [lsearch -exact $headers $name]
    if {$idx < 0} {
        error "no such column \"$name\""
    }
    # Get the data from just that column
    set column [lmap row $data {lindex $row $idx}]
    # Calculate the mean of the column: sum / count
    return [expr {[tcl::mathop::+ {*}$column] / double([llength $column])}]
}

You'd call that like this:

puts "average of Elec is [columnAverage Elec]"
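Since the question asks for the average of each column, the same calculation can be run over every header in turn. The following is only a sketch: `allColumnAverages` is a hypothetical name, it takes the header list and the row list as arguments rather than reading globals, and the sample rows are the first two data lines from the question.

```tcl
# Hypothetical helper: mean of every column, returned as a dict
# keyed by column name.
proc allColumnAverages {headers data} {
    if {[llength $data] == 0} {
        error "no data rows to average"
    }
    set count [llength $data]
    set result [dict create]
    set idx 0
    foreach name $headers {
        # Pull out column number $idx from every row
        set column [lmap row $data {lindex $row $idx}]
        # Mean of the column: sum / count
        dict set result $name \
            [expr {[tcl::mathop::+ {*}$column] / double($count)}]
        incr idx
    }
    return $result
}

# Sample input: header names and the first two data rows from the question
set headers {Frame Time Elec VdW Nonbond Total}
set data {
    {0 0 -216.63  -16.0174 -232.647 -232.647}
    {1 1 -196.786 -28.6093 -225.395 -225.395}
}

# Print one line per column
dict for {name mean} [allColumnAverages $headers $data] {
    puts [format "%-8s %.4f" $name $mean]
}
```

Returning a dict keeps the column order of the header line, so the output lines up with the input file.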
Donal Fellows
  • With true fixed-width data, other techniques might need to be used to do the initial data cleanup, and with _lots_ of data it's probably best to load it into a(n SQLite) database instead of using text files to hold it; databases are much easier to analyse (well, if you know SQL that is). – Donal Fellows Jun 12 '21 at 13:33
  • This script works very well, thank you very much. I use Tcl because I work with VMD (bioinformatics software), and all the scripts in VMD are written in Tcl, so I was looking for a Tcl script that I could use to modify a script in VMD – Mohamed Mastouri Jun 12 '21 at 18:44