I've recently noticed that the built-in list has a list.clear() method. So far, when I wanted to ensure a list is empty, I just create a new list: l = [].

I was curious if it makes a difference, so I measured it:

$ python --version
Python 3.11.0
$ python -m timeit 'a = [1, 2, 3, 4]; a = []'
5000000 loops, best of 5: 61.5 nsec per loop
$ python -m timeit 'a = [1, 2, 3, 4]; a.clear()'
5000000 loops, best of 5: 57.4 nsec per loop

So creating a new empty list is about 7% slower than using clear() for small lists.

For bigger lists, it seems to be faster to just create a new list:

$ python -m timeit 'a = list(range(10_000)); a = []'   
2000 loops, best of 5: 134 usec per loop
$ python -m timeit 'a = list(range(10_000)); a = []'
2000 loops, best of 5: 132 usec per loop
$ python -m timeit 'a = list(range(10_000)); a = []'
2000 loops, best of 5: 134 usec per loop
$ python -m timeit 'a = list(range(10_000)); a.clear()'
2000 loops, best of 5: 143 usec per loop
$ python -m timeit 'a = list(range(10_000)); a.clear()'
2000 loops, best of 5: 139 usec per loop
$ python -m timeit 'a = list(range(10_000)); a.clear()'
2000 loops, best of 5: 139 usec per loop

Why is that the case?

Edit: a small clarification. I am aware that the two are not always semantically the same. If you clear a list, any other reference to it sees the now-empty list. If you bind the name to a new list instead, other references still point to the old, unchanged list. For this question, I don't care about that potential difference. Just assume there is exactly one reference to the list.

Martin Thoma
  • A quick search here led me to [link](https://blog.finxter.com/list-clear-vs-new-list/). I think you may understand it better than me! – KarthiDiamond97 Feb 04 '23 at 19:40

1 Answer

> I was curious if it makes a difference, so I measured it:

This is not the right way to do the measurement. The setup code should be separated out, so that it is not included in the timing. In fact, all of these operations are O(1); you see O(N) results because creating the original data is O(N).

> So far, when I wanted to ensure a list is empty, I just create a new list: l = [].

list.clear is not equivalent to creating a new list. Consider:

a = [1,2,3]
b = a

Doing a = [] will not cause b to change, because it only rebinds the a name. But doing a.clear() will cause b to change, because both names refer to the same list object, which was mutated.

(Using a = [] is fine when that result is intentional, of course. Similarly, user-defined objects are often best "reset" by re-creating them.)
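The difference between rebinding and mutating can be checked directly. A minimal sketch (the second pair of names, c and d, is only there for illustration):

```python
# Rebinding: only the name `a` changes; the original list lives on via `b`.
a = [1, 2, 3]
b = a
a = []
print(b)        # [1, 2, 3]

# Mutating: `c` and `d` name the same list object, so both see it emptied.
c = [1, 2, 3]
d = c
c.clear()
print(d)        # []
print(d is c)   # True
```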

Instead, a.clear() is equivalent to a[:] = [] (or the same with some other empty sequence). However, this is harder to read. The Zen of Python tells us: "Beautiful is better than ugly"; "Explicit is better than implicit"; "There should be one-- and preferably only one --obvious way to do it".
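To illustrate the equivalence, here is a small sketch showing that each in-place spelling empties the same list object that an alias refers to. The names data and alias are made up for the example, and del data[:] is included as a third in-place spelling that the text above does not mention:

```python
data = [1, 2, 3]
alias = data                 # second reference to the same list

data.clear()                 # method call
assert alias == [] and alias is data

data.extend([1, 2, 3])
data[:] = []                 # slice assignment from an empty list
assert alias == [] and alias is data

data.extend([1, 2, 3])
del data[:]                  # slice deletion
assert alias == [] and alias is data
```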

It's also clearly (sorry) faster:

$ python -m timeit --setup 'a = list(range(10_000))' 'a.clear()'
10000000 loops, best of 5: 34.9 nsec per loop
$ python -m timeit --setup 'a = list(range(10_000))' 'a[:] = []'
5000000 loops, best of 5: 67.1 nsec per loop
$ python -m timeit --setup 'a = list(range(10_000))' 'a[:] = ()'
5000000 loops, best of 5: 51.7 nsec per loop

The "clever" ways require creating temporary objects, both for the iterable and for the slice used to index the list. Either way, it will end up in C code that does relatively simple, O(1) pointer bookkeeping (and a memory deallocation).
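One way to see where the temporary list comes from is to look at the bytecode. This is only a sketch: opnames is a hypothetical helper, and opcode names differ between CPython versions (for example, how the slice itself is handled), so treat the printed output as illustrative rather than definitive.

```python
import dis

def opnames(source):
    """Return the opcode names CPython generates for a statement."""
    code = compile(source, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]

for stmt in ("a.clear()", "a[:] = []", "a[:] = ()"):
    print(stmt, "->", opnames(stmt))

# The empty list literal has to be built at run time, which shows up as a
# BUILD_LIST instruction; the method call needs no such temporary.
print("BUILD_LIST" in opnames("a[:] = []"))   # True
print("BUILD_LIST" in opnames("a.clear()"))   # False
```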

Karl Knechtel
  • Isn't the setup executed only once? Meaning that if timeit repeats the execution 1000 times, only the first run clears a non-empty list, and every later run clears an already-empty one? – Martin Thoma Feb 04 '23 at 19:48
  • "It is possible to provide a setup statement that is executed **only once** at the beginning" - [source](https://docs.python.org/3/library/timeit.html#timeit-command-line-interface). But we would need to execute it every single time. – Martin Thoma Feb 04 '23 at 19:49
  • Ah, because it's destructive. Hmm. I'm confident that this does not affect the results and that the observed timing should be O(1), but I'll see if I can come up with a better way to write the test. – Karl Knechtel Feb 04 '23 at 19:54
  • I haven't thought of a good way to make the test, and also couldn't find an existing question about that, so I might just ask it myself. – Karl Knechtel Feb 04 '23 at 22:42
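Following up on the comments: one possible way to keep list construction out of the timed region, while still emptying a non-empty list on every run, is to time each run by hand instead of relying on --setup. This is only a rough sketch. The helpers time_emptying, use_clear and use_slice_assignment are hypothetical names, the measurement includes the overhead of one function call, and rebinding (a = []) is left out because the caller of the helper would still hold a reference to the old list. No results are claimed here; numbers will vary by machine and Python version.

```python
import time

def time_emptying(empty, size=10_000, repeats=200):
    """Return the best observed time (ns) for one call of empty(lst).

    A fresh list is built before every run, outside the timed region,
    so each timed call really empties a list of `size` elements.
    """
    best = None
    for _ in range(repeats):
        lst = list(range(size))              # untimed setup, redone each run
        start = time.perf_counter_ns()
        empty(lst)                           # the only timed statement
        elapsed = time.perf_counter_ns() - start
        assert not lst                       # sanity check: list is now empty
        if best is None or elapsed < best:
            best = elapsed
    return best

def use_clear(lst):
    lst.clear()

def use_slice_assignment(lst):
    lst[:] = []

if __name__ == "__main__":
    for size in (4, 10_000):
        print(size,
              "clear:", time_emptying(use_clear, size), "ns,",
              "slice assignment:", time_emptying(use_slice_assignment, size), "ns")
```

Running it with a few different sizes shows whether the timed operation depends on the length of the list.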