4

I am reading David Beazley's Python Reference book and he makes a point:

For example, if you were performing a lot of square root operations, it is faster to use 'from math import sqrt' and 'sqrt(x)' rather than typing 'math.sqrt(x)'.

and:

For calculations involving heavy use of methods or module lookups, it is almost always better to eliminate the attribute lookup by putting the operation you want to perform into a local variable first.

I decided to try it out:

first()

def first():
    from collections import defaultdict
    x = defaultdict(list)

second()

def second():
    import collections
    x = collections.defaultdict(list)

The results were:

2.15461492538
1.39850616455

Optimizations such as these probably don't matter to me. But I am curious as to why the opposite of what Beazley has written comes out to be true. And note that there is a difference of 1 second, which is singificant given the task is trivial.

Why is this happening?

UPDATE:

I am getting the timings like:

print timeit('first()', 'from __main__ import first');
print timeit('second()', 'from __main__ import second');
user225312
  • 126,773
  • 69
  • 172
  • 181

6 Answers6

6

The from collections import defaultdict and import collections should be outside the iterated timing loops, since you won't repeat doing them.

I guess that the from syntax has to do more work that the import syntax.

Using this test code:

#!/usr/bin/env python

import timeit

from collections import defaultdict
import collections

def first():
    from collections import defaultdict
    x = defaultdict(list)

def firstwithout():
    x = defaultdict(list)

def second():
    import collections
    x = collections.defaultdict(list)

def secondwithout():
    x = collections.defaultdict(list)

print "first with import",timeit.timeit('first()', 'from __main__ import first');
print "second with import",timeit.timeit('second()', 'from __main__ import second');

print "first without import",timeit.timeit('firstwithout()', 'from __main__ import firstwithout');
print "second without import",timeit.timeit('secondwithout()', 'from __main__ import secondwithout');

I get results:

first with import 1.61359190941
second with import 1.02904295921
first without import 0.344709157944
second without import 0.449721097946

Which shows how much the repeated imports cost.

Douglas Leeder
  • 52,368
  • 9
  • 94
  • 137
  • If I place them outside, it defeats the entire purpose of my question! – user225312 May 08 '11 at 07:22
  • Aah, I see what you were trying to mean. – user225312 May 08 '11 at 07:59
  • 1
    @A A: You appear to be misunderstanding what Beazley wrote. He wasn't talking about the differences in repeated `import`s. It makes no sense to be arguing about the difference in times due to repeated imports. If you do move the `import`s outside the loop, what he says is true although in this trivial case, `first` is only slightly faster. – Ned Deily May 08 '11 at 08:02
  • @Ned: Hmm, ok. Which style is preferred then? – user225312 May 08 '11 at 08:15
  • 1
    @A A: For `import`s, either `import x` or `from x import y` is generally OK according to the Python Style Guide (PEP 8). As he writes, there may be reasons to prefer the latter when dealing with a large number of calls in loops. Some people frown on the latter due to potential namespace confusion. What is more widely frowned on is `from x import *`. The meta point is that you don't want to be repeatedly `import`ing for no reason. As PEP 8 says: "Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants." – Ned Deily May 08 '11 at 08:37
4

I'll get also similar ratios between first(.) and second(.), only difference is that the timings are in microsecond level.

I don't think that your timings measure anything useful. Try to figure out better test cases!

Update:
FWIW, here is some tests to support David Beazley's point.

import math
from math import sqrt

def first(n= 1000):
    for k in xrange(n):
        x= math.sqrt(9)

def second(n= 1000):
    for k in xrange(n):
        x= sqrt(9)

In []: %timeit first()
1000 loops, best of 3: 266 us per loop
In [: %timeit second()
1000 loops, best of 3: 221 us per loop
In []: 266./ 221
Out[]: 1.2036199095022624

So first() is some 20% slower than second().

eat
  • 7,440
  • 1
  • 19
  • 27
1

My guess, your test is biased and the second implementation gains from the first one already having loaded the module, or just from having it loaded recently.

How many times did you try it? Did you switch up the order, etc..

Mantas Vidutis
  • 16,376
  • 20
  • 76
  • 92
1

first() doesn't save anything, since the module must still be accessed in order to import the name.

Also, you don't give your timing methodology but given the function names it seems that first() performs the initial import, which is always longer than subsequent imports since the module must be compiled and executed.

Ignacio Vazquez-Abrams
  • 776,304
  • 153
  • 1,341
  • 1,358
1

There is also the question of efficiency of reading/understanding the source code. Here's a real live example (code from a stackoverflow question)

Original:

import math

def midpoint(p1, p2):
   lat1, lat2 = math.radians(p1[0]), math.radians(p2[0])
   lon1, lon2 = math.radians(p1[1]), math.radians(p2[1])
   dlon = lon2 - lon1
   dx = math.cos(lat2) * math.cos(dlon)
   dy = math.cos(lat2) * math.sin(dlon)
   lat3 = math.atan2(math.sin(lat1) + math.sin(lat2), math.sqrt((math.cos(lat1) + dx) * (math.cos(lat1) + dx) + dy * dy))
   lon3 = lon1 + math.atan2(dy, math.cos(lat1) + dx)
   return(math.degrees(lat3), math.degrees(lon3))

Alternative:

from math import radians, degrees, sin, cos, atan2, sqrt

def midpoint(p1, p2):
   lat1, lat2 = radians(p1[0]), radians(p2[0])
   lon1, lon2 = radians(p1[1]), radians(p2[1])
   dlon = lon2 - lon1
   dx = cos(lat2) * cos(dlon)
   dy = cos(lat2) * sin(dlon)
   lat3 = atan2(sin(lat1) + sin(lat2), sqrt((cos(lat1) + dx) * (cos(lat1) + dx) + dy * dy))
   lon3 = lon1 + atan2(dy, cos(lat1) + dx)
   return(degrees(lat3), degrees(lon3))
Community
  • 1
  • 1
John Machin
  • 81,303
  • 11
  • 141
  • 189
0

Write your code as usual, importing a module and referencing its modules and constants as module.attribute. Then, either prefix your functions with the decorator for binding constants or bind all modules in your program by using the bind_all_modules function below:

def bind_all_modules():
    from sys import modules
    from types import ModuleType
    for name, module in modules.iteritems():
        if isinstance(module, ModuleType):
            bind_all(module)

def bind_all(mc, builtin_only=False, stoplist=[],  verbose=False):
    """Recursively apply constant binding to functions in a module or class.

    Use as the last line of the module (after everything is defined, but
    before test code).  In modules that need modifiable globals, set
    builtin_only to True.

    """
    try:
        d = vars(mc)
    except TypeError:
        return
    for k, v in d.items():
        if type(v) is FunctionType:
            newv = _make_constants(v, builtin_only, stoplist,  verbose)
            try: setattr(mc, k, newv)
            except AttributeError: pass
        elif type(v) in (type, ClassType):
            bind_all(v, builtin_only, stoplist, verbose)
tzot
  • 92,761
  • 29
  • 141
  • 204