I've been working on a performance-critical application that frequently requires making copies of a 2D list of integers and modifying the copy (I'm implementing the minimax algorithm).
I've noticed there is a huge difference in performance between a copy and a deepcopy of lists with the same number of elements, and I'd like to understand if my thinking about the cause is correct.
To reproduce my problem, run the following code:
import copy
import numpy as np

np.random.seed(0)
lst1 = np.random.randint(100, size=1000 * 1000).tolist()   # flat list of 1,000,000 ints
lst2 = np.random.randint(100, size=(1000, 1000)).tolist()  # nested 1000 x 1000 list of ints
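To make the shapes explicit, lst1 is a flat list while lst2 is a nested 1000 x 1000 list; a quick sanity check (just for illustration) confirms both hold exactly a million integers:

print(len(lst1))                      # 1000000
print(len(lst2), len(lst2[0]))        # 1000 1000
print(sum(len(row) for row in lst2))  # 1000000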
Now, timing the statements below, you should see timings similar to mine.
%timeit copy.copy(lst1)
%timeit lst1.copy()
%timeit copy.deepcopy(lst2)
5 ms ± 49.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
5.47 ms ± 551 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.61 s ± 112 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
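(If you're not in IPython, a rough equivalent with the standard timeit module, reusing lst1 and lst2 from above, should show the same gap; the absolute numbers will of course vary by machine.)

import copy
import timeit

print(timeit.timeit(lambda: copy.copy(lst1), number=100))    # shallow copy of the flat list
print(timeit.timeit(lambda: lst1.copy(), number=100))        # list.copy of the flat list
print(timeit.timeit(lambda: copy.deepcopy(lst2), number=1))  # deep copy of the nested list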
Both lst1 and lst2 have a million elements, but copying the flat list is reliably roughly 300x faster (5 ms vs 1.61 s) than deep-copying the nested list with the same number of elements. I thought this might be because making deep copies of nested lists requires a slow recursive implementation, so I tried
%timeit copy.deepcopy(lst1)
1.43 s ± 90.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
And the timings still show a massive slowdown, even on the flat list. I've checked the docs, but they don't offer much of an explanation. From the timings, I suspect that deepcopy is copying each int as well, creating new integer objects, which seems like a wasteful thing to do. Am I right in my thinking here? What is deepcopy doing here that list.copy and a shallow copy don't?
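To make the suspicion concrete, this is the kind of identity check I have in mind (lst2_shallow and lst2_deep are just throwaway names for illustration):

import copy

lst2_shallow = copy.copy(lst2)
lst2_deep = copy.deepcopy(lst2)

print(lst2_shallow[0] is lst2[0])     # shallow copy: inner row lists should be shared
print(lst2_deep[0] is lst2[0])        # deep copy: inner row lists should be new objects
print(lst2_deep[0][0] is lst2[0][0])  # the part I'm unsure about: are the ints themselves re-created?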
I've seen deepcopy() is extremely slow, but that question seems to be asking for an alternative rather than an explanation (at least, no explanation was clear to me from it).