0

Hello, I could not find the difference between using square brackets for a list comprehension and using list().

Is there a performance or memory allocation difference?

(Same question for set and dict.)

input = [1, 2, 3, 4]

B = [a * 2 for a in input if a > 1]

C = list(a * 2 for a in input if a > 1)

B_set = {str(a) for a in input if a > 1}

C_set = set(str(a) for a in input if a > 1)

B_dict = {str(a):a for a in input if a > 1}

C_dict = dict(str(a):b for a,b in input if a > 1) # NOT LEGAL

Thank you for your help

raphaelauv
  • `C_dict = dict((str(a), b) for a,b in input if a > 1)` will work. And only this corrected variant will be a bit slower than dict comprehension. All other listed options are equal. – Olvin Roght Jul 19 '20 at 19:34
  • well, the `list()` syntax is creating a generator, not a list, and then the `list()` function will turn the generator to a list. – Hagai Wild Jul 19 '20 at 19:34
  • @HagaiWild, of course not. Both expressions with list are equal. – Olvin Roght Jul 19 '20 at 19:35
  • Yes, there's a performance difference. See [this](https://stackoverflow.com/a/29356931/13956730) post. – pwasoutside Jul 19 '20 at 19:35
  • @OlvinRoght of course it does. – juanpa.arrivillaga Jul 19 '20 at 19:36
  • @juanpa.arrivillaga, `list()` will return list. List comprehension will return list. – Olvin Roght Jul 19 '20 at 19:37
  • @OlvinRoght The *result* is the same; the path taken to get there is quite different in each case. – chepner Jul 19 '20 at 19:37
  • @OlvinRoght the *result* is the same, but `list(a * 2 for a in input if a > 1)` first creates a generator, then passes that generator to the list constructor. This will be slower, but the result the same – juanpa.arrivillaga Jul 19 '20 at 19:38
  • @chepner, yes, probably there's minor differences in translating code but in fact there's no difference in produced result and performance. – Olvin Roght Jul 19 '20 at 19:38
  • @OlvinRoght no, *in fact* there is a difference, as explained, and there is a performance difference as well, with the generator expression version taking about 1.25 times as long. That magnitude is in line with the additional overhead of iterating over a generator. – juanpa.arrivillaga Jul 19 '20 at 19:42
  • As an analogy, the list comprehension calls `input.__next__` directly. `list()` has to call the generator's `__next__` method, which in turn calls `input.__next__`. – chepner Jul 19 '20 at 20:02

3 Answers

3

We can check with `python -m timeit`.

$ python -mtimeit "B = [a * 2 for a in list(range(1000)) if a > 1]"
5000 loops, best of 5: 86.7 usec per loop
$ python -mtimeit "B = list(a * 2 for a in list(range(1000)) if a > 1)"
2000 loops, best of 5: 110 usec per loop
$ python -mtimeit "B = {str(a): a for a in list(range(1000)) if a > 1}"
1000 loops, best of 5: 273 usec per loop
$ python -mtimeit "B = set(str(a) for a in list(range(1000)) if a > 1)"
1000 loops, best of 5: 287 usec per loop

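So, as you can see, the difference is modest: for the list case, 86.7 vs. 110 usec per loop. (The last two commands compare a dict comprehension with set() over a generator, so they are not a like-for-like pair.)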

With a bigger list, we have:

$ python -mtimeit "B = [a * 2 for a in list(range(100000)) if a > 1]"
20 loops, best of 5: 11.1 msec per loop
$ python -mtimeit "B = list(a * 2 for a in list(range(100000)) if a > 1)"
20 loops, best of 5: 14.2 msec per loop

Here we see a difference of about 3 msec, in favor of the [] case.

With an even bigger list, we have:

$ python -mtimeit "B = [a * 2 for a in list(range(10000000)) if a > 1]"
1 loop, best of 5: 1.21 sec per loop
$ python -mtimeit "B = list(a * 2 for a in list(range(10000000)) if a > 1)"
1 loop, best of 5: 1.49 sec per loop

Here we see a 0.28 sec difference; again, [] is faster.
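
Since the question also asks about set and dict, here is a rough sketch of a like-for-like comparison for all three containers using the `timeit` module from a script. The helper `best_time`, the `candidates` table and the sizes are assumptions chosen for illustration; absolute numbers will depend on the machine and Python version, so no expected output is shown.

```python
from timeit import repeat

data = list(range(1000))  # built once, outside the timed code

# Each pair produces the same container: comprehension vs. constructor
# fed by a generator expression.
candidates = {
    "list comprehension": lambda: [a * 2 for a in data if a > 1],
    "list(genexpr)": lambda: list(a * 2 for a in data if a > 1),
    "set comprehension": lambda: {str(a) for a in data if a > 1},
    "set(genexpr)": lambda: set(str(a) for a in data if a > 1),
    "dict comprehension": lambda: {str(a): a for a in data if a > 1},
    "dict(genexpr of pairs)": lambda: dict((str(a), a) for a in data if a > 1),
}

def best_time(fn, number=100, rounds=5):
    """Return the best observed time per call, in seconds."""
    return min(repeat(fn, number=number, repeat=rounds)) / number

timings = {name: best_time(fn) for name, fn in candidates.items()}

for name, seconds in timings.items():
    print(f"{name:<24} {seconds * 1e6:8.1f} usec per call")
```

Taking the minimum over several rounds mirrors the "best of 5" figure that the command-line tool reports.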

Xxxo
2

You can measure the speed with the `timeit` module.

For example:

from timeit import timeit

lst = [1, 2, 3, 4] * 100

def fn1():
    return [a * 2 for a in lst if a > 1]

def fn2():
    return list(a * 2 for a in lst if a > 1)

t1 = timeit(lambda: fn1(), number=10_000)
t2 = timeit(lambda: fn2(), number=10_000)

print(t1)
print(t2)

Prints (AMD 2400G, Python 3.8):

0.2406109299918171
0.2905043710197788

So the list comprehension is faster (about 0.24 s vs. 0.29 s for 10,000 calls).

Andrej Kesely
0

[] is much faster than list() because [] is a literal, which Python compiles directly to bytecode, whereas list() is an object that needs name resolution, stack allocation, etc. before the list is created.

Ashish Karn
  • That's not the reason. `list()` is fundamentally different than a list comprehension, although, it will effectively result in the same thing. The performance difference is mainly due to the fact that iterating over a generator is slower than the way a list comprehension builds the list directly. – juanpa.arrivillaga Jul 19 '20 at 19:38
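
One way to see both effects discussed here is to inspect what the compiler produces for each expression. The sketch below is an illustration, not part of the original answer; the helper name `inspect_expression` is hypothetical. It shows that the `list(...)` form both looks up the global name `list` and carries a nested `<genexpr>` code object, while the comprehension does neither. The comprehension's own nested code objects are printed without an expected value because they depend on the Python version (Python 3.12 and later inline comprehensions).

```python
import types

def inspect_expression(source):
    """Compile an expression and report the names it looks up and the
    nested code objects it carries."""
    code = compile(source, "<example>", "eval")
    nested = [c.co_name for c in code.co_consts if isinstance(c, types.CodeType)]
    return code.co_names, nested

comp_names, comp_nested = inspect_expression("[a * 2 for a in data if a > 1]")
call_names, call_nested = inspect_expression("list(a * 2 for a in data if a > 1)")

# The constructor form must resolve the name `list` at run time ...
print("list" in call_names)   # True
print("list" in comp_names)   # False

# ... and it wraps the loop in a separate generator code object.
print(call_nested)            # ['<genexpr>']
print(comp_nested)            # varies with the Python version
```

The name lookup is a one-time cost per call, whereas the generator adds overhead for every item, which is consistent with the gap growing with the list size in the timings above.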