
I am performing a tiny speed test to compare the speed of the Agda programming language with the Tcl scripting language. It's for scientific work, and this is just a pre-test, not a real test. I am not in any way trying to perform a realistic speed comparison!

I have come up with a small example in which Agda is 10 times faster than Tcl. There are special reasons I use this example. My main concern is that my Tcl code is badly written and that this is the sole reason Tcl is slower than Agda in this example.

The goal of the code is to parse a line that represents a list of integers and check if it is indeed a list of integers.
Example "(1,2,3)" would be a valid list.
Example "(1,a,3)" would not be a valid list.

My input is a file, and I check every third line of it. If any checked line is not a list of integers, the program prints "error".

My input file:

(613424,505980,317647,870930,75580,897160,716297,668539,689646,196362,533020)


(727375,472272,22435,869407,320468,80779,302881,240382,196077,635360,568517)


(613424,505980,317647,870930,75580,897160,716297,668539,689646,196362,533020)

(However, my real test file is about 3 megabytes in size.)

My current Tcl code to solve this problem is:

package require Tcl 8.6

proc checkListNat {str} {
    set list [split [string map {"(" "" ")" ""} $str] ","]
    foreach l $list {
        if {[string is integer $l] == 0} {
            return 0
        }
    }
    return 1
}

set i 1
set fp [open "/tmp/test.txt" r]
while { [gets $fp data] >= 0 } {
    incr i 
    if { [expr $i % 3] == 0} {
        if { [checkListNat $data] == 0 } {
            puts "error"
        }
    }
}
close $fp

How can I optimize my current Tcl code, so that the speed test between Agda and Tcl is more realistic?

Donal Fellows
mrsteve

2 Answers


The first thing to do is to put as much code as possible in procedures (or lambda terms) and to ensure that all expressions are braced. Those were the two key problems killing your performance. We'll make a few other changes too: you hardly ever need expr inside an if test, and this wasn't one of those cases; string trim is more suitable than string map; and string is really ought to be used with -strict. With those changes, I get this version, which is fairly similar to what you already had yet ought to be substantially faster.

package require Tcl 8.6

proc checkListNat {str} {
    foreach l [split [string trim $str "()"] ","] {
        if {[string is integer -strict $l] == 0} {
            return 0
        }
    }
    return 1
}

apply {{} {
    set i 1
    set fp [open "/tmp/test.txt" r]
    while { [gets $fp data] >= 0 } {
        if {[incr i] % 3 == 0 && ![checkListNat $data]} {
            puts "error"
        }
    }
    close $fp
}} {*}$argv
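
To illustrate why string trim and -strict matter here, the following is a small sketch (not part of the original answer) that can be pasted into a Tcl 8.6 shell. The malformed input "1,,3" is a made-up example; the comments show the values I would expect from Tcl 8.6.

```tcl
# Without -strict, the empty string passes every "string is" class test.
puts [string is integer ""]            ;# 1
puts [string is integer -strict ""]    ;# 0

# string trim removes the listed characters from both ends only.
puts [string trim "(1,2,3)" "()"]      ;# 1,2,3

# split keeps empty fields, so a doubled comma yields an empty element
# that only the -strict check rejects.
puts [split "1,,3" ","]                ;# 1 {} 3
```

So a line such as "(1,,3)" would pass the non-strict check in the question but fail the -strict check used above.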

You might get better performance by adding fconfigure $fp -encoding iso8859-1; you'll have to test that yourself. But the key changes are the two named at the start of this answer (code in procedures, braced expressions), as each substantially affects the compilation strategy Tcl uses. (Also, Tcl 8.5 is a little faster than 8.6, because 8.6 has a radically different execution engine that is a bit slower for some things, so you might test the new code with 8.5 too; the code itself appears to be valid with both versions once the package require line is relaxed to 8.5.)
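
One way to check the effect of bracing on your own machine is Tcl's built-in time command. The sketch below is only an illustration: the procedure names unbraced and braced and the iteration counts are made up, and the absolute numbers will depend on the machine and the Tcl build.

```tcl
# Hypothetical micro-benchmark: same loop, differing only in how the
# modulo test is written.
proc unbraced {n} {
    set hits 0
    for {set i 0} {$i < $n} {incr i} {
        # Unbraced expr: the expression string is rebuilt on every call.
        if {[expr $i % 3] == 0} {incr hits}
    }
    return $hits
}

proc braced {n} {
    set hits 0
    for {set i 0} {$i < $n} {incr i} {
        # The braced if condition is compiled together with the procedure body.
        if {$i % 3 == 0} {incr hits}
    }
    return $hits
}

# time runs the script the given number of times and reports the
# average microseconds per iteration.
puts "unbraced: [time {unbraced 100000} 10]"
puts "braced:   [time {braced 100000} 10]"
```

The same pattern (wrap the candidate code in a proc, then call it under time) can be used to compare the fconfigure setting or different Tcl versions.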

Donal Fellows

Try checking with regexp {^[0-9,]+$} $line instead of the checkListNat function.

Update: here is an example.

echo "87,566, 45,67\n56,5r5,45" >! try

...

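while {[gets $fp line] >= 0} {
    # The character class allows digits, commas, and spaces
    # (the sample input above contains a space).
    if {[regexp {^[0-9, ]+$} $line]} {
        puts "OK $line"
    } else {
        puts "BAD $line"
    }
}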

gives:

OK 87,566, 45,67
BAD 56,5r5,45
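
Note that the pattern {^[0-9,]+$} does not account for the parentheses around the lines in the question, and it also accepts strings such as ",,". A possible refinement is sketched below. It assumes the format is exactly "(" followed by digit groups separated by single commas, then ")", with no spaces. The name checkListNatRe is made up for this illustration.

```tcl
# Sketch: validate the whole line with a single anchored regular expression.
# Assumed format: "(" digits ("," digits)* ")" with no whitespace.
proc checkListNatRe {str} {
    return [regexp {^\([0-9]+(,[0-9]+)*\)$} $str]
}

puts [checkListNatRe "(1,2,3)"]   ;# 1
puts [checkListNatRe "(1,a,3)"]   ;# 0
puts [checkListNatRe "(1,,3)"]    ;# 0
```

Under this assumed format an empty list "()" is rejected as well. Whether this is faster than the split-based check would have to be measured.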

user2141046