I am having some problems trying to understand generators.

Do these functions' execution times differ when they do the same task?

def slow_sum(size):
    x = 0
    for i in range(size):
        for j in range(size):
            x += i + j
    return x

def fast_sum(size):
    return sum([(i + j) for j in range(size) for i in range(size)])

size = 2000
slow_val = slow_sum(size)
fast_val = fast_sum(size)
assert slow_val == fast_val, "Values are not equal"

When profiling both functions on my computer using cProfile, I got the results below. I expected them to be similar.

Total Time

  • slow_sum(2000): 0.85 ms
  • fast_sum(2000): 0.05 ms

Original File: https://pastebin.com/fDfaSqyZ

My Output: https://pastebin.com/wyy3v3iy

SeoFernando
  • 1
    That's not a generator, that's a list comprehension. With a generator, it would be `sum((i+j) for j in range(size) for i in range(size))` – iz_ Jan 26 '19 at 21:05
  • Don't be deceived. It is not faster. See this reverse question to yours: https://stackoverflow.com/questions/27905965/python-why-is-list-comprehension-slower-than-for-loop – smac89 Jan 26 '19 at 21:07
  • 1
    @Tomothy32, generators are generally not faster than creating a list first. For example, see `str.join`; try it with a generator vs a list and compare the results – smac89 Jan 26 '19 at 21:08
  • 2
    it's special for `str.join` because it needs a list anyway and if it doesn't have one, it builds one. Not the case for `sum` – Jean-François Fabre Jan 26 '19 at 21:09
  • 1
    `slow_sum` has to execute individual Python add operations; `fast_sum` sums the list in C. – chepner Jan 26 '19 at 21:09
  • slow_sum is faster on my system :) Both are computed in around 0.5 seconds. I'm beginning to doubt your benchmark – Jean-François Fabre Jan 26 '19 at 21:10
  • 2
    I got almost the same speed: `size = 1000`, 10 times with `timeit`: `fast 1.25320039 - slow 1.1005855780000002` – iGian Jan 26 '19 at 21:11
  • 1
    This is a leading question, because now we have to reconcile whether the statement "generators are faster than for-loops" is true, and then come up with an explanation as to why. But what if you are wrong? As most of the comments have indicated, your initial statement is actually not true; also, it is not a generator expression. The question you should be asking is why, in THIS case, the generator expression was faster than the for-loop. – smac89 Jan 26 '19 at 21:16
  • 1
    Are you sure you're looking at the right part of the profiler output? You should be looking at cumtime or the percall column to the right of cumtime. tottime ignores time spent in `sum` or in the list comprehension's stack frame. – user2357112 Jan 26 '19 at 21:30
  • 1
    Okay, yep, you're looking at tottime. – user2357112 Jan 26 '19 at 21:31
  • Oh welp, my bad :/ @user2357112 – SeoFernando Jan 26 '19 at 21:33
  • Generators save space. Space is sometimes more important than time complexity in terms of actual resources, and therefore real time. I am addressing the hypothetical, and ignoring the details. – Kenny Ostrom Jan 26 '19 at 22:14

1 Answer

2

You're looking at the wrong column of the profiler output. tottime doesn't count the time fast_sum spends inside the sum call or the list comprehension's stack frame; it only counts time spent in fast_sum's own frame. You should be looking at cumtime, which includes those sub-calls and is nearly equal for the two functions.
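
To compare the two columns side by side, something like the following could be used. It is only a rough sketch: `profile_times` is a helper name made up here for illustration, and it reads the per-function entries from the `stats` dictionary that `pstats.Stats` builds, assuming the usual `(cc, nc, tottime, cumtime, callers)` layout. The size is kept small so it runs quickly; exact timings will differ from machine to machine.

```python
import cProfile
import pstats


def slow_sum(size):
    x = 0
    for i in range(size):
        for j in range(size):
            x += i + j
    return x


def fast_sum(size):
    return sum([(i + j) for j in range(size) for i in range(size)])


def profile_times(func, size):
    """Return (tottime, cumtime) recorded by cProfile for func itself.

    Hypothetical helper, not part of cProfile or pstats.
    """
    profiler = cProfile.Profile()
    profiler.runcall(func, size)
    stats = pstats.Stats(profiler)
    # Keys are (filename, lineno, funcname);
    # values are (cc, nc, tottime, cumtime, callers).
    for (_, _, name), (_, _, tottime, cumtime, _) in stats.stats.items():
        if name == func.__name__:
            return tottime, cumtime
    raise LookupError(func.__name__)


if __name__ == "__main__":
    size = 300
    for func in (slow_sum, fast_sum):
        tottime, cumtime = profile_times(func, size)
        print(f"{func.__name__}: tottime={tottime:.4f}s cumtime={cumtime:.4f}s")
```

For `slow_sum` the two numbers should be close, since nearly all the work happens in its own frame. For `fast_sum`, tottime leaves out the `sum` call (and the comprehension, where it gets its own frame), so cumtime is the fairer number to compare.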

user2357112