Function closure performance

Question

I thought I would improve performance by replacing this code:

import math

def f(a, b):
  return math.sqrt(a) * b
result = []
a = 100
for b in range(1000000):
  result.append(f(a, b))

with:

import math

def g(a):
  def f(b):
    return math.sqrt(a) * b
  return f
result = []
a = 100
func = g(a)
for b in range(1000000):
  result.append(func(b))

I assumed that since a is fixed when the closure is created, the interpreter would precompute everything that involves a, and so math.sqrt(a) would be evaluated just once instead of 1000000 times.

Is my understanding always correct, or always incorrect, or correct/incorrect depending on the implementation?

I noticed that the code object for func is built (at least in CPython) before runtime, and is immutable. The code object then seems to use the global environment to achieve the closure. This seems to suggest that the optimization I hoped for does not happen.
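The observation about the code object can be checked directly. The following is a minimal sketch, assuming CPython: it inspects the inner function returned by g and suggests that the code object is shared between closures, that a lives in a closure cell rather than in the global environment, and that math is merely a name to be looked up at call time.

```python
import math

def g(a):
    def f(b):
        return math.sqrt(a) * b
    return f

func = g(100)

# `a` is a free variable of the inner function, stored in a closure cell.
print(func.__code__.co_freevars)          # ('a',)
print(func.__closure__[0].cell_contents)  # 100

# `math` is only recorded as a name; it is resolved each time f runs.
print("math" in func.__code__.co_names)   # True

# Every closure produced by g reuses the same immutable code object.
print(g(1).__code__ is g(2).__code__)     # True
```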

Actually that code won't run, you're calling `func` with only one argument, although it has 2 parameters. — Niklas B., Apr 02 '12 at 00:12

Niklas B. · Accepted Answer · 2012-04-02T01:36:27.553

13

I assumed that since a is fixed when the closure is created, the interpreter would precompute everything that involves a, and so math.sqrt(a) would be evaluated just once instead of 1000000 times.

That assumption is wrong; I don't know where it came from. A closure just captures variable bindings (in your case, the binding of a), but that doesn't mean any more magic is going on: the expression math.sqrt(a) is still evaluated every time f is called.
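One way to see this is to count the calls. The sketch below swaps math.sqrt for a hypothetical wrapper, counting_sqrt (not part of the original code), purely so that each evaluation can be observed.

```python
import math

calls = 0

def counting_sqrt(x):
    """Hypothetical stand-in for math.sqrt that counts how often it runs."""
    global calls
    calls += 1
    return math.sqrt(x)

def g(a):
    def f(b):
        return counting_sqrt(a) * b
    return f

func = g(100)
results = [func(b) for b in range(5)]

print(calls)    # 5: the square root is recomputed on every call
print(results)  # [0.0, 10.0, 20.0, 30.0, 40.0]
```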

After all, it has to be computed every time because the interpreter doesn't know that sqrt is "pure" (the return value is only dependent on the argument and no side-effects are performed). Optimizations like the ones you expect are practical in functional languages (referential transparency and static typing help a lot here), but would be very hard to implement in Python, which is an imperative and dynamically typed language.
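To illustrate why the interpreter cannot assume purity, here is a small sketch in which math.sqrt is rebound after the closure has been created. The replacement lambda is purely illustrative; the point is only that the closure picks up whatever math.sqrt refers to at call time.

```python
import math

def g(a):
    def f(b):
        return math.sqrt(a) * b
    return f

func = g(100)
before = func(3)  # uses the real math.sqrt

original_sqrt = math.sqrt
math.sqrt = lambda x: -1.0  # illustrative rebinding; never do this in real code
try:
    after = func(3)  # the same closure now sees the replacement
finally:
    math.sqrt = original_sqrt  # always restore the real function

print(before, after)  # 30.0 -3.0
```

Had the result of math.sqrt(a) been cached when the closure was created, the second call would have returned the wrong value for this program.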

That said, if you want to precompute the value of math.sqrt(a), you need to do that explicitly:

def g(a):
  s = math.sqrt(a)
  def f(b):
    return s * b
  return f

Or using lambda:

def g(a):
  s = math.sqrt(a)
  return lambda b: s * b

Now that g really returns a function with 1 parameter, you have to call the result with only one argument.

edited Apr 02 '12 at 01:36

answered Apr 02 '12 at 00:10

Niklas B.


The assumption was just my own thought: huh, something can clearly be optimized, so I guess it must be optimized. – max Apr 02 '12 at 00:15
CPython doesn't work that way. It's a thoroughly "pure" execution environment with no magic like that done. – Chris Morgan Apr 02 '12 at 00:18
@max: I added a paragraph on why such optimizations are not included (I won't say impossible) in CPython. Do you come from a functional background? Functional languages are more likely to be able to do such optimizations. – Niklas B. Apr 02 '12 at 00:19
@aaronasterling: That's exactly what I mean. Python has no mechanisms to declare the purity of a function, let alone derive this property from the context. – Niklas B. Apr 02 '12 at 00:21
You can't tell anything about a Python program without running it. Best thing you can do is guess. Just imagine what setting `math.sqrt = lambda *args: input("Anything goes")` *anywhere* would do to the function. – Jochen Ritzel Apr 02 '12 at 00:29
@Jochen: Runtime analysis could still enable certain kinds of purity optimizations. Problem is that this would hardly be worth the huge effort it would mean to implement those for a dynamically typed, imperative language like Python. – Niklas B. Apr 02 '12 at 00:54
@NiklasB. I wish I could give +2 for both answering my question and providing a lot of other insight. I come from no background at all, I'm just learning this stuff; but I now understand why declaring or deriving purity of a function would be impractical in Python, and why purely functional languages can do things that hybrid languages normally can't. – max Apr 02 '12 at 01:34
@max: Very nice conclusion :) You should play with Haskell sometime ;) – Niklas B. Apr 02 '12 at 01:35
@JochenRitzel: your example reminded me that `math.sqrt` isn't a global name like I thought (whose unique value would be stored in `func.__globals__`). Rather, it's an expression that involves a global name `math` and a string literal `sqrt`. Clearly, nobody can guarantee what it evaluates to. – max Apr 02 '12 at 01:42
Unlike with many compiled languages, you can safely assume that Python is *not* optimizing your code. – Elliot Cameron Jun 24 '14 at 12:51
@3noch It does certain types of peephole optimizations in the bytecode, like constant folding. But generally of course what you say is true – Niklas B. Jun 24 '14 at 13:02
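The constant folding mentioned in the last comment can be observed in the compiled code object. This is a sketch that assumes CPython (other implementations may behave differently); the function names folded and not_folded are made up for the illustration.

```python
import math

def folded():
    return 2 * 3  # both operands are literals

def not_folded():
    return math.sqrt(100)  # involves a name lookup and a call

# CPython stores the already-computed product among the function's constants.
print(6 in folded.__code__.co_consts)         # True

# The call is not evaluated at compile time: only its argument is a constant.
print(100 in not_folded.__code__.co_consts)   # True
print(10.0 in not_folded.__code__.co_consts)  # False
```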

score 3 · Answer 2 · answered Apr 02 '12 at 00:12

The code is not evaluated statically; the code inside the function is still calculated each time. The function object contains all the byte code which expresses the code in the function; it doesn't evaluate any of it. You could improve matters by calculating the expensive value once:

import math

def g(a):
    root_a = math.sqrt(a)  # computed once, when the closure is created
    def f(b):
        return root_a * b
    return f
result = []
a = 100
func = g(a)
for b in range(1000000):
    result.append(func(b))

Naturally, in this trivial example, you could improve performance much more:

import math

a = 100
root_a = math.sqrt(a)
result = [root_a * b for b in range(1000000)]

But I presume your real code is more complex than this, so that this shortcut doesn't apply?

answered Apr 02 '12 at 00:12

Chris Morgan


Of course, you are exactly right, this is just a simplified example. I'm curious - is it a language requirement that it's not optimized away, or does each implementation decide what it wants to do? – max Apr 02 '12 at 00:20
@max: It's implementation-specific and you might get totally different results with something like PyPy or IronPython (not really sure about the latter, but it might profit from the .NET IL optimization). There are some things that are consistent across platforms, though: List comprehensions for example will very likely be much more efficient than building a list yourself. – Niklas B. Apr 02 '12 at 00:23
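The claim about list comprehensions is easy to measure. The sketch below uses timeit with hypothetical helper names (with_loop, with_comprehension) and arbitrary sizes; the absolute numbers depend on the machine and the interpreter, so no expected timings are shown.

```python
import math
import timeit

root_a = math.sqrt(100)

def with_loop(n):
    result = []
    for b in range(n):
        result.append(root_a * b)
    return result

def with_comprehension(n):
    return [root_a * b for b in range(n)]

# Both variants must build the same list before timing means anything.
same = with_loop(1000) == with_comprehension(1000)
print(same)  # True

for fn in (with_loop, with_comprehension):
    seconds = timeit.timeit(lambda: fn(10000), number=100)
    print(f"{fn.__name__:20s} {seconds:.4f} s")
```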

mgilson · Answer 3 · 2012-04-02T00:27:52.230

1

As usual, the timeit module is your friend. Try some things and see how it goes. If you don't care about writing ugly code, this might help a little as well:

import math

def g(a):
    # Binding math.sqrt as a default argument makes it a local name inside f.
    def f(b, _local_func=math.sqrt):
        return _local_func(a) * b
    return f

In CPython, every access to a "global" variable or function costs a dictionary lookup (plus an attribute lookup for something like math.sqrt), whereas local names are accessed faster. If you can make that access local, you can shave off a little time.
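A rough way to compare the options with timeit is sketched below. The factory names (make_plain, make_local, make_precomputed) and the loop sizes are assumptions made for this illustration; results will vary between machines and Python versions, so no timings are quoted.

```python
import math
import timeit

def make_plain(a):
    def f(b):
        return math.sqrt(a) * b  # global lookup + attribute lookup + call
    return f

def make_local(a):
    def f(b, _local_func=math.sqrt):
        return _local_func(a) * b  # local lookup + call
    return f

def make_precomputed(a):
    s = math.sqrt(a)  # evaluated once, up front
    def f(b):
        return s * b
    return f

variants = {
    "plain": make_plain,
    "local": make_local,
    "precomputed": make_precomputed,
}

for name, factory in variants.items():
    func = factory(100)
    seconds = timeit.timeit(lambda: [func(b) for b in range(1000)], number=200)
    print(f"{name:12s} {seconds:.4f} s")
```

All three variants return the same values, so only the timing differs between them.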

edited Apr 02 '12 at 00:27

answered Apr 02 '12 at 00:11

mgilson


No problem. The other answers are better than mine anyway. The only thing that I recommend is to look into using `timeit` if it really is a performance critical portion of code. `timeit` is pretty easy to use so it's definitely worth the small time investment. – mgilson Apr 02 '12 at 00:29

score 0 · Answer 4 · answered Mar 09 '23 at 07:40

This is an old question, but for binding a ahead of time the standard library provides functools.partial(). Note that partial only fixes the argument; math.sqrt(a) is still evaluated on every call:

import functools
import math

def f(a, b):
    return math.sqrt(a) * b

a = 100
g = functools.partial(f, a)  # functools.partial(<function f at 0x103ccc860>, 100)

f(100, 3)  # 30.0
g(3)  # 30.0, with `a` bound by partial to 100
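As a quick check of what partial actually does, the sketch below adds a counter (sqrt_calls, an assumption introduced only for this illustration) to the function body. It suggests that partial stores the function and the bound argument, and that the body, including the square root, still runs on each call.

```python
import functools
import math

sqrt_calls = 0

def f(a, b):
    global sqrt_calls
    sqrt_calls += 1  # counts how often the body (and so math.sqrt) runs
    return math.sqrt(a) * b

g = functools.partial(f, 100)
values = [g(b) for b in range(3)]

print(values)               # [0.0, 10.0, 20.0]
print(sqrt_calls)           # 3
print(g.func is f, g.args)  # True (100,)
```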
