10

For instance, the documentation says:

Note however that timeit will automatically determine the number of repetitions only when the command-line interface is used.

Is there a way to call it from within a Python script and have the number of repetitions be determined automatically, with only the shortest time returned?

endolith

2 Answers

12

When you call timeit from the command line like this:

python -mtimeit -s'import test' 'test.foo()'

The timeit module is called as a script. In particular, the main function is called:

if __name__ == "__main__":
    sys.exit(main())

If you look at the source code, you'll see that the main function can take an args argument:

def main(args=None):    
    if args is None:
        args = sys.argv[1:]

So indeed it is possible to run timeit from within a program with exactly the same behavior as you see when run from the CLI. Just supply your own args instead of allowing it to be set to sys.argv[1:]:

import timeit
import shlex

def foo():
    total = 0
    for i in range(10000):
        total += i**3
    return total

timeit.main(args=shlex.split("""-s'from __main__ import foo' 'foo()'"""))

will print something like

100 loops, best of 3: 7.9 msec per loop

Unfortunately, main prints to the console, instead of returning the time per loop. So if you want to programmatically use the result, perhaps the easiest way would be to start by copying the main function and then modifying it -- changing the printing code to instead return usec.
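A different route from copying main, sketched here as an assumption rather than as part of the original answer, is to capture what main prints with contextlib.redirect_stdout. The helper name run_timeit_cli is hypothetical, and the statement being timed is a trivial placeholder. The summary line's exact format differs between Python versions, so the sketch returns the raw text instead of parsing it.

```python
import contextlib
import io
import timeit


def run_timeit_cli(args):
    """Run timeit.main with CLI-style args and return what it printed.

    Hypothetical helper: it only captures stdout, it does not parse it.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        timeit.main(args=args)
    return buffer.getvalue().strip()


# Placeholder statement; the loop count is chosen by timeit itself.
summary = run_timeit_cli(["-s", "x = 1", "x + 1"])
print(summary)
```

This keeps the CLI behavior intact, but the result is still a string, so the approach of adapting main to return the number remains the cleaner option for programmatic use.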


Example by OP: If you place this in utils_timeit.py:

import timeit
def timeit_auto(stmt="pass", setup="pass", repeat=3):
    """
    http://stackoverflow.com/q/19062202/190597 (endolith)
    Imitate default behavior when timeit is run as a script.

    Runs enough loops so that total execution time is greater than 0.2 sec,
    and then repeats that 3 times and keeps the lowest value.

    Returns the number of loops and the time for each loop in microseconds
    """
    t = timeit.Timer(stmt, setup)

    # determine number so that 0.2 <= total time < 2.0
    for i in range(1, 10):
        number = 10**i
        x = t.timeit(number) # seconds
        if x >= 0.2:
            break
    r = t.repeat(repeat, number)
    best = min(r)
    usec = best * 1e6 / number
    return number, usec

you can use it in scripts like this:

import timeit
import utils_timeit as UT

def foo():
    total = 0
    for i in range(10000):
        total += i**3
    return total

num, timing = UT.timeit_auto(setup='from __main__ import foo', stmt='foo()')
print(num, timing)
unutbu
  • Yes, that looks good, except that it should not be this function's job to insert `os.curdir` into `sys.path`. Either PYTHONPATH should be properly set up, or it should be the job of the calling script (not this function) to fuss with `sys.path`. – unutbu Sep 30 '13 at 14:50
  • Your example doesn't work for me (`ImportError: cannot import name foo`), but `UT.timeit_auto(lambda: foo())` does, producing similar output to IPython's `timeit foo()`. – endolith Oct 04 '13 at 00:12
  • @endolith: Oh but lambda adds some overhead, so not ideal. `lambda: 5*5` is 86 ns while `5*5` is 21 ns. – endolith Oct 04 '13 at 02:59
  • @endolith: Are you calling the script from the command line? e.g. `python /path/to/script.py`? – unutbu Oct 04 '13 at 10:25
  • No, I was using runfile() in Spyder to launch it. You're right, if I run it on the command line it works. Also, I just realized that IPython [has its own slightly different implementation](https://github.com/ipython/ipython/blob/master/IPython/core/magics/execution.py#L982), which is why IPython will only do 1 loop for very slow functions, while the minimum with timeit.main() is 10 loops. – endolith Oct 04 '13 at 14:39
  • IDEs may mess with `__main__` which is probably why it does not work from Spyder. Multi-threaded and GUI programs can also have problems when run from within IDEs. Sometimes it appears to be necessary to run from the CLI. – unutbu Oct 04 '13 at 17:56
  • I would like an approach where I don't have to define the setup for each problem. Is there some way to import everything from main? I tried setup='from __main__ import __all__' and setup = 'from __main__ import *', but neither works. Even better, of course, would be an approach where what should be imported is recognized automatically. E.g. if I have a dataframe and want to time df[variable_name].to_numpy(), I don't want to explicitly mention in the setup that df and variable_name have to be imported – Rens Aug 19 '21 at 19:34
  • Related topic: https://stackoverflow.com/questions/68898318/custom-timeit-from-main-import-everything-necessary . – Rens Aug 25 '21 at 20:15
3

As of Python 3.6, timeit.Timer objects have an autorange method that exposes how number is determined for command-line execution.
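A minimal sketch of how autorange could replace the manual loop-count search, assuming Python 3.6 or later. This is an illustration, not code from the answer: the name timeit_auto and its return convention are borrowed from the other answer on this page, and the globals argument (available since Python 3.5) is used here in place of a setup import.

```python
import timeit


def foo():
    total = 0
    for i in range(10000):
        total += i**3
    return total


def timeit_auto(stmt="pass", setup="pass", repeat=3, globals=None):
    """Return (number, usec_per_loop), letting timeit pick the loop count."""
    t = timeit.Timer(stmt, setup, globals=globals)
    # autorange() increases the loop count until one run takes >= 0.2 s
    # and returns (number, time_taken); only the loop count is reused here.
    number, _ = t.autorange()
    best = min(t.repeat(repeat, number))  # best total time in seconds
    usec = best * 1e6 / number            # microseconds per loop
    return number, usec


# Pass the names the statement needs explicitly instead of importing
# them from __main__ in a setup string.
number, usec = timeit_auto("foo()", globals={"foo": foo})
print(number, usec)
```

Passing globals=globals() instead would expose the whole calling module's namespace to the timed statement, which avoids writing a setup string for each case.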

Mad Physicist
  • Dear Mad Physicist, could you maybe give an example of how to implement that in the timer above? Also, do you maybe know the answer to the following question? https://stackoverflow.com/questions/68898318/custom-timeit-from-main-import-everything-necessary – Rens Aug 25 '21 at 20:15